FLUX.2 [klein] 9B
Black Forest Labs' 9B-parameter distilled image model: sub-second generation with higher quality than the 4B variant, 19.6 GB VRAM, non-commercial license.
![FLUX.2 [klein] 9B](https://awesomeagents.ai/images/models/flux-2-klein-9b_hu_b55cd4f4fa04a0ed.jpg)
FLUX.2 [klein] 9B sits between the fully open 4B variant and the much larger 32B FLUX.2 Dev. It generates images in roughly 0.5 seconds on an NVIDIA GB200 and about 2 seconds on a consumer RTX 5090 - slower than its 4B sibling but noticeably sharper on fine details and text. The tradeoff: it requires 19.6 GB of VRAM (distilled) and ships under a non-commercial license.
TL;DR
- 9B parameter rectified flow transformer, distilled to 4 steps for sub-second datacenter inference
- ~0.5s on GB200, ~2s on RTX 5090 - slower than the 4B (~0.3s on GB200) but visibly better quality
- 19.6 GB VRAM (distilled), 21.7 GB (base) - needs an RTX 4090 or better for local use
- Non-commercial license (FLUX Non-Commercial License) - no production deployment without a deal
- Same unified architecture: text-to-image, editing, and multi-reference in one model
Key Specifications
| Specification | Details |
|---|---|
| Provider | Black Forest Labs |
| Model Family | FLUX.2 |
| Parameters | 9 billion |
| Architecture | Rectified flow transformer + Mistral-3 24B VLM |
| Distillation | 4-step (distilled variant) |
| VRAM (Distilled) | 19.6 GB |
| VRAM (Base) | 21.7 GB |
| Inference Speed (GB200) | ~0.5 seconds |
| Inference Speed (RTX 5090) | ~2 seconds |
| Base Inference (GB200) | ~6 seconds |
| Base Inference (RTX 5090) | ~35 seconds |
| Default Resolution | 1024x1024 |
| Release Date | January 15, 2026 |
| License | FLUX Non-Commercial License |
| Open Weights | Yes (Hugging Face, non-commercial) |
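The 19.6 GB figure is roughly what first-principles arithmetic predicts. The sketch below is our own back-of-envelope estimate (not BFL's published breakdown), assuming bf16/fp16 weights at 2 bytes per parameter; the remainder of the reported footprint would cover the VAE, text-encoder conditioning, and activations.

```python
def vram_estimate_gb(params_b: float, bytes_per_param: int = 2) -> float:
    """Weight-only memory footprint in GiB: parameter count x bytes per
    parameter. bytes_per_param=2 assumes bf16/fp16 weights; pass 4 for fp32."""
    return params_b * 1e9 * bytes_per_param / 1024**3

# The 9B transformer alone lands around 16.8 GiB in bf16, leaving roughly
# 3 GB of the reported 19.6 GB for the VAE, conditioning, and activations.
weights_gib = vram_estimate_gb(9)
```

The same arithmetic explains why the 4B variant fits in 8.4 GB while the 32B Dev model needs datacenter-class cards.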
Benchmark Performance
| Metric | FLUX.2 klein 4B | FLUX.2 klein 9B | FLUX.2 Dev (32B) | Nano Banana 2 |
|---|---|---|---|---|
| Parameters | 4B | 9B | 32B | Not disclosed |
| Inference (GB200) | ~0.3s | ~0.5s | 2-4s | 4-6s |
| VRAM (Distilled) | 8.4 GB | 19.6 GB | 80+ GB | Cloud only |
| License | Apache 2.0 | Non-commercial | Non-commercial | Proprietary |
| Fine-tuning | Yes | Limited | Yes (LoRA) | No |
| Text Rendering | Basic | Improved | ~60% | ~90% |
| Quality Tier | Good | Very Good | Excellent | Excellent |
The 9B model occupies a specific niche: researchers and hobbyists who want better quality than the 4B but cannot run the full 32B Dev model locally. The non-commercial license limits its appeal for startups and production use cases, where the Apache-licensed 4B or the API-only Pro/Max variants are more practical choices.
Key Capabilities
Quality Step-Up from 4B
The additional 5 billion parameters deliver visible improvements in fine detail rendering, texture consistency, and text legibility. Character faces show fewer artifacts, fabric textures resolve more cleanly, and simple text prompts succeed more reliably. These gains are most apparent in portrait-style images and scenes with complex material interactions.
Unified Architecture
Like all FLUX.2 [klein] variants, the 9B model handles text-to-image generation, image-to-image editing, and multi-reference composition in a single set of weights. No model switching required between tasks.
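Conceptually, a unified model selects its task from the conditioning it receives rather than from a model swap. The routing sketch below is purely illustrative - the function name and branching are our invention, not the BFL or Diffusers API - but it shows how one set of weights can serve all three modes.

```python
from typing import Optional

def route_task(prompt: str, images: Optional[list] = None) -> str:
    """Hypothetical dispatch for a unified image model: the same weights
    serve every branch; only the conditioning inputs differ."""
    if not images:
        return "text-to-image"   # prompt only
    if len(images) == 1:
        return "image-editing"   # prompt + one source image
    return "multi-reference"     # prompt + several reference images
```

In practice this means a single checkpoint in memory can handle generation, editing, and composition requests back to back, with no reload between tasks.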
Base Model for Fine-Tuning
The undistilled base variant (21.7 GB VRAM, ~6s on GB200) serves as a higher-quality starting point for custom fine-tuning. The slower inference is acceptable in training pipelines where output quality matters more than latency. Note that fine-tuning rights are subject to the non-commercial license terms.
Pricing and Availability
| Access Point | Pricing | Status |
|---|---|---|
| Open Weights (Hugging Face) | Free (non-commercial) | Available |
| BFL API | Per-megapixel pricing | Available |
| ComfyUI | Free (local, non-commercial) | Supported |
| Diffusers (HuggingFace) | Free (local, non-commercial) | Supported |
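Per-megapixel billing means cost scales with output resolution, not with request count. The helper below sketches the arithmetic; the rate used is a placeholder, since the article does not list BFL's actual price.

```python
def image_cost(width: int, height: int, usd_per_megapixel: float) -> float:
    """Cost of one generation under per-megapixel billing.
    usd_per_megapixel is a placeholder rate, not BFL's published price."""
    megapixels = width * height / 1_000_000
    return megapixels * usd_per_megapixel

# A default 1024x1024 image is ~1.05 MP, so it bills at just over
# one unit of whatever the per-megapixel rate is.
cost = image_cost(1024, 1024, usd_per_megapixel=0.01)  # hypothetical rate
```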
Strengths
- Sub-second datacenter inference with noticeably better quality than the 4B variant
- Visible improvements in face rendering, textures, and text compared to the 4B
- Unified generation/editing/multi-reference architecture
- Open weights available for download and local inference
- 19.6 GB VRAM is reachable on high-end consumer GPUs (RTX 4090)
- Safety features: watermarking, C2PA signing, NSFW filtering
Weaknesses
- Non-commercial license blocks production deployment without separate agreement
- 19.6 GB VRAM excludes most mid-range consumer GPUs
- Text rendering still significantly below FLUX.2 Dev and Nano Banana 2
- Quality gap vs. the 32B Dev model is substantial - not a substitute for production work
- Limited community fine-tuning ecosystem due to license restrictions
- Slower than the 4B variant (~0.5s vs ~0.3s on GB200), a drawback for real-time/interactive use cases
Related Coverage
- FLUX.2 [klein] 4B - The smaller, fully open Apache 2.0 sibling
- Best AI Image Generators 2026 - Full market comparison
- Nano Banana 2 - Google's competing image generation model
Last verified March 14, 2026