FLUX - A New Benchmark for Open Source Imagery

Open-source image generation has been in a slump since the SD3 model ran into problems, and the community has badly needed a robust image model to keep development moving. That model has now arrived. Robin Rombach, a former core member of Stability AI, has founded a new company and secured $31 million in funding. The company has also released a series of image generation models, two of which have openly available weights.

Announcing Black Forest Labs
Black Forest Labs launches with a mission to develop and advance state-of-the-art generative deep learning models for media such as images and video. The company wants to make these models widely available, educate the public about them, and set the industry standard for generative media. The FLUX.1 suite of text-to-image synthesis models is its first release toward that goal.

The Black Forest Team
The team is made up of distinguished AI researchers and engineers with a strong track record in academic, industrial, and open-source settings. Their earlier work includes VQGAN, Latent Diffusion, the Stable Diffusion models, and Adversarial Diffusion Distillation. They believe that widely accessible models foster innovation, collaboration, and transparency.

Funding
The company closed a $31 million Series Seed funding round led by Andreessen Horowitz, with participation from various investors and experts, and has since received follow-up investments. The advisory board includes Michael Ovitz and Prof. Matthias Bethge.

FLUX.1 Model Family
FLUX.1 [pro]: The top-performing variant for image generation, with the strongest prompt following, visual quality, image detail, and output diversity. It is available through the API, Replicate, and fal.ai, and enterprise solutions are offered.
FLUX.1 [dev]: An open-weight, guidance-distilled model for non-commercial use, offering quality similar to [pro] while being more efficient. The weights are on Hugging Face, and the model can be tried on Replicate or fal.ai. Commercial use requires contacting Black Forest Labs.
FLUX.1 [schnell]: The fastest model, intended for local development and personal use and released under an Apache 2.0 license. The weights are on Hugging Face, inference code is available on GitHub and in Diffusers, and the model is integrated with ComfyUI.
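As a rough illustration of the Diffusers route mentioned above, the sketch below loads FLUX.1 [schnell] and renders one image. The repository id, the four-step setting, the zero guidance scale, and the output file name are assumptions to check against the model card. The helper names `build_generation_kwargs` and `generate` are hypothetical and are not part of Diffusers. Running `generate` downloads the full model weights and needs a machine with enough memory.

```python
# Assumed Hugging Face repository id; verify against the model card.
MODEL_ID = "black-forest-labs/FLUX.1-schnell"


def build_generation_kwargs(prompt, steps=4):
    """Collect the call arguments for a few-step, guidance-free run.

    Hypothetical helper: the defaults below are assumptions for the
    timestep-distilled [schnell] variant, not values from this article.
    """
    return {
        "prompt": prompt,
        "num_inference_steps": steps,  # few steps, since [schnell] is distilled
        "guidance_scale": 0.0,         # assumed: no classifier-free guidance
        "max_sequence_length": 256,    # assumed prompt-length cap
    }


def generate(prompt, out_path="flux-schnell.png", seed=0):
    """Load the pipeline, render one image, and save it to out_path."""
    # Heavy dependencies are imported lazily so the module loads without them.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    pipe.enable_model_cpu_offload()  # trades speed for lower GPU memory use

    generator = torch.Generator("cpu").manual_seed(seed)  # reproducible noise
    result = pipe(generator=generator, **build_generation_kwargs(prompt))
    result.images[0].save(out_path)
    return out_path


# Example call (downloads the weights on first use):
# generate("a red fox in a snowy forest")
```

Keeping the call arguments in a separate function makes it easy to inspect or change the sampling settings without loading the model.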

Transformer-powered Flow Models at Scale
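All FLUX.1 models are based on a hybrid architecture of multimodal and parallel diffusion transformer blocks, scaled to 12B parameters. They improve on previous diffusion models by building on flow matching, a general training method for generative models that includes diffusion as a special case. Rotary positional embeddings and parallel attention layers further improve model performance and hardware efficiency. A more detailed tech report is planned.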
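To make the flow matching idea concrete, here is a minimal toy sketch in plain Python. It is not the FLUX.1 training or sampling code. It assumes the common straight-line (rectified flow) formulation, with t = 0 as noise and t = 1 as data, and it replaces the neural network with an exact velocity field for a target distribution made of a single point. All function names are hypothetical.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight path; this is the regression target."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(velocity_fn, x0, x1, t):
    """Mean squared error between predicted and target velocity at time t."""
    xt = interpolate(x0, x1, t)
    pred = velocity_fn(xt, t)
    target = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


def euler_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # largest value is 1 - dt, so t stays strictly below 1
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_point_target_velocity(mu):
    """Exact velocity field when every data sample equals mu (toy case).

    Stands in for a trained network; valid only for t < 1.
    """
    def velocity_fn(x, t):
        return [(m - xi) / (1.0 - t) for m, xi in zip(mu, x)]
    return velocity_fn


rng = random.Random(0)
mu = [2.0, -1.0]                         # the single "data" point
noise = [rng.gauss(0.0, 1.0) for _ in mu]
velocity = make_point_target_velocity(mu)
sample = euler_sample(velocity, noise, num_steps=4)
```

In this toy setup the sampler follows the straight line from the noise draw to `mu`, so a handful of Euler steps is enough. A real model learns the velocity field from data by minimizing the loss above over many noise and data pairs.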

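A New Benchmark for Image Synthesis
According to the company's own evaluation, FLUX.1 sets a new state of the art, surpassing popular models in visual quality, prompt following, size and aspect-ratio variability, typography, and output diversity. All FLUX.1 variants support a diverse range of aspect ratios and resolutions.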

Up Next: SOTA Text-to-Video for All
The FLUX.1 text-to-image suite serves as the foundation for upcoming text-to-video systems, which are intended to enable fast, high-definition creation and editing.