FLUX.2 [dev]

Black Forest Labs' 32B open-weight image model - the most powerful open alternative for text-to-image, editing, and multi-reference generation with up to 10 reference images.

FLUX.2 [dev] is the open-weight flagship of Black Forest Labs' second-generation image family. At 32 billion parameters, it couples a rectified flow transformer with Mistral-3's 24B vision-language model to deliver state-of-the-art text-to-image generation, single-reference editing, and multi-reference composition - all in one model. It ranks #9 on LM Arena's image generation leaderboard with a score of 1149, the highest of any open-weight model.

TL;DR

  • 32B parameter rectified flow transformer + Mistral-3 24B VLM for world knowledge
  • State-of-the-art open-weight image generation - LM Arena rank #9 (score 1149)
  • Multi-reference support: combine up to 10 images for character, style, and object consistency
  • ~32K token context from the VLM enables detailed, multi-part prompts
  • Open weights on Hugging Face; non-commercial license for weights, API available for commercial use
  • 80+ GB VRAM at full precision; ~14-18 GB with FP8/4-bit quantization on an RTX 4090
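
The VRAM figures above follow directly from parameter count times bytes per weight. A back-of-the-envelope sketch (weights only - activations, the VAE, and runtime overhead are ignored, which is why the real full-precision total lands above 80 GB):

```python
def weights_gb(params: float, bits_per_weight: int) -> float:
    """Approximate weight-storage footprint in decimal GB."""
    return params * bits_per_weight / 8 / 1e9

FLOW_PARAMS = 32e9  # FLUX.2 [dev] flow transformer
VLM_PARAMS = 24e9   # Mistral-3 VLM

print(weights_gb(FLOW_PARAMS, 16))  # 64.0 -> BF16 weights for the flow model alone
print(weights_gb(VLM_PARAMS, 16))   # 48.0 -> the VLM adds another 48 GB
print(weights_gb(FLOW_PARAMS, 4))   # 16.0 -> 4-bit, near the quoted ~14-18 GB range
```

The 4-bit figure explains why quantized builds fit on a 24 GB consumer card while full precision needs datacenter hardware.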

The model was released on November 25, 2025, predating the rest of the FLUX.2 lineup. It set the architectural template that the Pro, Max, and klein variants all build on: a flow transformer that handles spatial logic while the VLM handles language understanding, world knowledge, and contextual reasoning. The combination supports ~32K tokens of prompt context, enabling detailed scene descriptions and multi-step instructions.

Key Specifications

| Specification | Details |
| --- | --- |
| Provider | Black Forest Labs |
| Model Family | FLUX.2 |
| Parameters | 32 billion (flow transformer) + 24B (Mistral-3 VLM) |
| Architecture | Rectified flow transformer + Mistral-3 24B VLM |
| VAE | Retrained from scratch for improved quality |
| Prompt Context | ~32K tokens |
| Max Resolution | Up to 4MP (2048x2048) |
| Multi-Reference | Up to 10 images |
| Inference Steps | 12-20 (preview), 28-50 (production) |
| VRAM (Full) | 80+ GB |
| VRAM (FP8/4-bit) | ~14-18 GB (RTX 4090) |
| Inference Speed | 2-4 seconds (optimized infra) |
| LM Arena Rank | #9 (score 1149) |
| Release Date | November 25, 2025 |
| License | FLUX Non-Commercial License (weights), API for commercial |
| Open Weights | Yes (Hugging Face) |

Benchmark Performance

| Metric | FLUX.2 Dev | FLUX.2 Max | Nano Banana 2 | Midjourney V7 |
| --- | --- | --- | --- | --- |
| LM Arena Rank | #9 | #4 | N/A | N/A |
| LM Arena Score | 1149 | 1168 | N/A | N/A |
| Parameters | 32B + 24B VLM | 32B + 24B VLM | Not disclosed | Not disclosed |
| Multi-Reference | Up to 10 | Up to 10 | Up to 10 | Limited |
| Max Resolution | 4MP | 4MP | 4K | 2K |
| Text Rendering | ~60% | Best-in-class | ~90% | 71% |
| Open Weights | Yes | No | No | No |
| Fine-tuning | LoRA supported | No | No | No |

FLUX.2 [dev] consistently outperforms all open-weight alternatives by a significant margin across text-to-image, single-reference editing, and multi-reference editing. The gap to closed models (FLUX.2 Max, Midjourney V7, Nano Banana 2) is narrower - roughly 19 ELO points behind Max, which translates to perceptible but not dramatic quality differences in most use cases.

Text rendering at ~60% accuracy remains a weak point across the FLUX.2 family. Nano Banana 2 leads here at ~90%, making it the better choice for infographics, UI mockups, or any output requiring legible text.

Key Capabilities

Multi-Reference Composition

The standout feature. FLUX.2 [dev] can accept 2-10 reference images and combine them into a novel output while maintaining character identity, style consistency, and object fidelity. No fine-tuning required - the model handles reference matching at inference time through the VLM's contextual understanding.

Use cases: brand-consistent marketing materials, character sheets for animation, product visualization with consistent styling, and storyboarding with recurring characters.
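
Since reference matching happens at inference time, the client's only job is to attach the images and respect the 10-image limit. A minimal sketch of how a multi-reference request might be validated and assembled client-side - the field names (`prompt`, `reference_images`) are illustrative assumptions, not the official API schema:

```python
import base64

MAX_REFERENCES = 10  # FLUX.2 [dev] accepts up to 10 reference images

def build_multi_ref_request(prompt: str, reference_images: list[bytes]) -> dict:
    """Validate the reference count and assemble a generation request."""
    if not 1 <= len(reference_images) <= MAX_REFERENCES:
        raise ValueError(f"expected 1-{MAX_REFERENCES} reference images")
    return {
        "prompt": prompt,
        # Raw image bytes are base64-encoded for a JSON payload.
        "reference_images": [
            base64.b64encode(img).decode("ascii") for img in reference_images
        ],
    }

req = build_multi_ref_request("product shot, studio lighting", [b"<png bytes>"] * 3)
print(len(req["reference_images"]))  # 3
```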

Vision-Language Model Integration

The Mistral-3 24B VLM is not just a text encoder - it brings world knowledge and contextual reasoning to the generation process. This enables physically plausible lighting, accurate spatial relationships, and contextually appropriate material properties. The ~32K token context window supports detailed, structured prompts that were impossible with previous-generation CLIP-based encoders.
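
To get a feel for how roomy the ~32K window is, a crude chars/4 heuristic (an English-prose approximation, not Mistral's actual tokenizer) shows that even a detailed multi-part scene spec consumes a tiny fraction of the budget:

```python
def approx_tokens(text: str) -> int:
    """Crude estimate: roughly 4 characters per English token."""
    return max(1, len(text) // 4)

# A structured, multi-part prompt of the kind the VLM handles well.
prompt = (
    "Scene: rain-soaked Tokyo alley at dusk. "
    "Subject: bicycle courier in a yellow jacket, centered, facing camera. "
    "Lighting: magenta neon key from frame left, cool ambient fill. "
    "Style: 35mm film look, shallow depth of field, slight grain."
)
print(approx_tokens(prompt), "of ~32000 tokens")
```

Even a prompt ten times this length would stay well under the window, which is what makes multi-step instructions and long scene descriptions practical.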

LoRA Fine-Tuning

FLUX.2 [dev] supports LoRA adapters for custom fine-tuning, enabling domain-specific specialization without retraining the full model. The community has produced 11+ fine-tunes and 12+ adapters on Hugging Face. GGUF quantized versions from Unsloth and City96 make fine-tuning and inference more accessible on consumer hardware.
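
The reason LoRA makes a 32B model tunable on modest hardware: each adapted weight matrix W (d_out x d_in) is frozen, and only two low-rank factors of rank r are trained, so trainable parameters scale with r rather than with d_out * d_in. A sketch of the arithmetic, with an assumed hidden size chosen purely for illustration (the model's actual layer shapes are not published here):

```python
def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters a rank-r LoRA adapter adds to one weight matrix."""
    return rank * (d_out + d_in)

d = 4096                             # assumed hidden size, illustration only
full = d * d                         # 16,777,216 params in one dense layer
added = lora_params(d, d, rank=16)   # 131,072 trainable params
print(added, added / full)           # 131072 0.0078125 -> under 1% of the layer
```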

Image Editing

Beyond generation, the model handles image-to-image editing: style transfer, inpainting, outpainting, object replacement, and lighting modification. Single-reference and multi-reference editing workflows use the same model weights.

Pricing and Availability

| Access Point | Pricing | Status |
| --- | --- | --- |
| Open Weights (Hugging Face) | Free (non-commercial) | Available |
| BFL API | Per-megapixel pricing | Available |
| WaveSpeed AI | $0.012/image | Available |
| Replicate | Per-second pricing | Available |
| ComfyUI | Free (local) | Supported |
| Diffusers | Free (local) | Supported |

For commercial deployment, use the BFL API or third-party providers. The open weights are licensed for non-commercial use only - research, personal projects, and evaluation.
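
For budgeting a hosted deployment, a small cost helper using the one flat rate the table gives (WaveSpeed); the BFL per-megapixel and Replicate per-second rates vary by resolution and runtime, so they are left as a caller-supplied parameter rather than guessed:

```python
WAVESPEED_PER_IMAGE = 0.012  # USD per image, from the pricing table

def batch_cost(n_images: int, per_image: float = WAVESPEED_PER_IMAGE) -> float:
    """Estimated USD cost for a batch at a flat per-image rate."""
    return round(n_images * per_image, 2)

print(batch_cost(1_000))   # 12.0
print(batch_cost(50_000))  # 600.0
```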

Strengths

  • Highest-quality open-weight image generation model available (LM Arena #9)
  • Multi-reference composition with up to 10 images - no fine-tuning needed
  • 32K token context enables detailed, structured prompts
  • LoRA fine-tuning for domain specialization
  • Active quantization ecosystem (GGUF, FP8) brings VRAM to 14-18 GB
  • Retrained VAE delivers improved quality at the compression boundary
  • Full editing capabilities in the same model weights

Weaknesses

  • 80+ GB VRAM at full precision - requires A100/H100 or heavy quantization for local use
  • Non-commercial license restricts production use to API
  • Text rendering (~60%) significantly trails Nano Banana 2 (~90%)
  • 2-4 second generation - much slower than klein 4B's sub-second speed
  • Complex multi-reference prompts can produce inconsistent results at the edges
  • No web-grounded generation (unlike Nano Banana 2's Gemini integration)

Last verified March 14, 2026

About the author
AI Benchmarks & Tools Analyst

James is a software engineer turned tech writer who spent six years building backend systems at a fintech startup in Chicago before pivoting to full-time analysis of AI tools and infrastructure.