FLUX.2 [klein] 9B

Black Forest Labs' 9B parameter distilled image model - sub-second generation with higher quality than the 4B variant, 19.6 GB VRAM, non-commercial license.

FLUX.2 [klein] 9B sits between the fully open 4B variant and the much larger 32B FLUX.2 Dev. It generates images in roughly 0.5 seconds on an NVIDIA GB200 and about 2 seconds on a consumer RTX 5090 - slower than its 4B sibling but noticeably sharper on fine details and text. The tradeoff: it requires 19.6 GB of VRAM (distilled) and ships under a non-commercial license.

TL;DR

  • 9B parameter rectified flow transformer, distilled to 4 steps for sub-second datacenter inference
  • ~0.5s on GB200, ~2s on RTX 5090 - roughly 1.7x the 4B's latency but visibly better quality
  • 19.6 GB VRAM (distilled), 21.7 GB (base) - needs an RTX 4090 or better for local use
  • Non-commercial license (FLUX Non-Commercial License) - no production deployment without a deal
  • Same unified architecture: text-to-image, editing, and multi-reference in one model

Key Specifications

| Specification | Details |
|---|---|
| Provider | Black Forest Labs |
| Model Family | FLUX.2 |
| Parameters | 9 billion |
| Architecture | Rectified flow transformer + Mistral-3 24B VLM |
| Distillation | 4-step (distilled variant) |
| VRAM (Distilled) | 19.6 GB |
| VRAM (Base) | 21.7 GB |
| Inference Speed (GB200) | ~0.5 seconds |
| Inference Speed (RTX 5090) | ~2 seconds |
| Base Inference (GB200) | ~6 seconds |
| Base Inference (RTX 5090) | ~35 seconds |
| Default Resolution | 1024x1024 |
| Release Date | January 15, 2026 |
| License | FLUX Non-Commercial License |
| Open Weights | Yes (Hugging Face, non-commercial) |
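Those latency figures translate directly into sustained single-stream throughput. A minimal sketch in plain Python, using only the numbers from the table above:

```python
def images_per_minute(seconds_per_image: float) -> float:
    """Convert per-image latency into single-stream throughput."""
    return 60.0 / seconds_per_image

# Latency figures from the specification table above.
latencies = {
    ("GB200", "distilled"): 0.5,
    ("GB200", "base"): 6.0,
    ("RTX 5090", "distilled"): 2.0,
    ("RTX 5090", "base"): 35.0,
}

for (gpu, variant), secs in latencies.items():
    print(f"{gpu} ({variant}): ~{images_per_minute(secs):.0f} images/min")
```

On these numbers the distilled variant sustains about 120 images per minute on a GB200 versus roughly 10 for the base model, which is why the base variant is positioned for training pipelines rather than serving.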

Benchmark Performance

| Metric | FLUX.2 klein 4B | FLUX.2 klein 9B | FLUX.2 Dev (32B) | Nano Banana 2 |
|---|---|---|---|---|
| Parameters | 4B | 9B | 32B | Not disclosed |
| Inference (GB200) | ~0.3s | ~0.5s | 2-4s | 4-6s |
| VRAM (Distilled) | 8.4 GB | 19.6 GB | 80+ GB | Cloud only |
| License | Apache 2.0 | Non-commercial | Non-commercial | Proprietary |
| Fine-tuning | Yes | Limited | Yes (LoRA) | No |
| Text Rendering | Basic | Improved | ~60% | ~90% |
| Quality Tier | Good | Very Good | Excellent | Excellent |

The 9B model occupies a specific niche: researchers and hobbyists who want better quality than the 4B but cannot run the full 32B Dev model locally. The non-commercial license limits its appeal for startups and production use cases, where the Apache-licensed 4B or the API-only Pro/Max variants are more practical choices.
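That niche can be expressed as a small selection helper. The sketch below uses only the VRAM thresholds and license terms from the comparison table; the function name and return strings are this article's shorthand, not anything from BFL:

```python
def pick_flux2_variant(vram_gb: float, commercial: bool) -> str:
    """Suggest a FLUX.2 variant from available VRAM and license needs.

    Thresholds come from the comparison table: 8.4 GB (klein 4B
    distilled), 19.6 GB (klein 9B distilled), 80+ GB (Dev 32B).
    """
    if commercial:
        # Only the Apache-2.0 4B is open for production use; otherwise
        # the API-only Pro/Max variants are the practical route.
        return "klein 4B" if vram_gb >= 8.4 else "BFL API (Pro/Max)"
    if vram_gb >= 80:
        return "Dev 32B"
    if vram_gb >= 19.6:
        return "klein 9B"
    if vram_gb >= 8.4:
        return "klein 4B"
    return "BFL API (Pro/Max)"
```

For example, a 24 GB RTX 4090 used non-commercially lands on the 9B, while the same card in a commercial product falls back to the Apache-licensed 4B.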

Key Capabilities

Quality Step-Up from 4B

The additional 5 billion parameters deliver visible improvements in fine detail rendering, texture consistency, and text legibility. Character faces show fewer artifacts, fabric textures resolve more cleanly, and simple text prompts succeed more reliably. These gains are most apparent in portrait-style images and scenes with complex material interactions.

Unified Architecture

Like all FLUX.2 [klein] variants, the 9B model handles text-to-image generation, image-to-image editing, and multi-reference composition in a single set of weights. No model switching required between tasks.

Base Model for Fine-Tuning

The undistilled base variant (21.7 GB VRAM, ~6s on GB200) serves as a higher-quality starting point for custom fine-tuning. The slower inference is acceptable in training pipelines where output quality matters more than latency. Note that fine-tuning rights are subject to the non-commercial license terms.

Pricing and Availability

| Access Point | Pricing | Status |
|---|---|---|
| Open Weights (Hugging Face) | Free (non-commercial) | Available |
| BFL API | Per-megapixel pricing | Available |
| ComfyUI | Free (local, non-commercial) | Supported |
| Diffusers (HuggingFace) | Free (local, non-commercial) | Supported |

Strengths

  • Sub-second datacenter inference with noticeably better quality than the 4B variant
  • Visible improvements in face rendering, textures, and text compared to the 4B
  • Unified generation/editing/multi-reference architecture
  • Open weights available for download and local inference
  • 19.6 GB VRAM is reachable on high-end consumer GPUs (RTX 4090)
  • Safety features: watermarking, C2PA signing, NSFW filtering

Weaknesses

  • Non-commercial license blocks production deployment without separate agreement
  • 19.6 GB VRAM excludes most mid-range consumer GPUs
  • Text rendering still significantly below FLUX.2 Dev and Nano Banana 2
  • Quality gap vs. the 32B Dev model is substantial - not a substitute for production work
  • Limited community fine-tuning ecosystem due to license restrictions
  • Roughly 1.7x the latency of the 4B variant, a drawback for real-time/interactive use cases

Last verified March 14, 2026

About the author: AI Benchmarks & Tools Analyst

James is a software engineer turned tech writer who spent six years building backend systems at a fintech startup in Chicago before pivoting to full-time analysis of AI tools and infrastructure.