Google Launches Nano Banana 2 - Pro-Level Image Generation at Flash Speed, Free for Everyone
Google DeepMind's Nano Banana 2, built on Gemini 3.1 Flash, delivers Pro-quality image generation and editing at twice the speed and half the price, rolling out free to all Gemini users across 141 countries.

Google DeepMind just launched Nano Banana 2, the successor to the image generation model that added 10 million users to the Gemini app and became a cultural phenomenon with its viral 3D figurine trend. Built on Gemini 3.1 Flash, it comes close to Nano Banana Pro's quality while creating images roughly twice as fast and at half the API cost. And unlike Nano Banana Pro, it's free for all Gemini users - no subscription required.
The model ID is gemini-3.1-flash-image-preview. It's rolling out today across the Gemini app, Google Search AI Mode, Flow (Google's AI creative studio), Google AI Studio, the Gemini API, and Vertex AI.
Key Specs
| Spec | Value |
|---|---|
| Model ID | gemini-3.1-flash-image-preview |
| Architecture | Gemini 3.1 Flash (natively multimodal) |
| Max resolution | 4K (4096x4096) |
| Generation speed | 4-6 seconds (~2x faster than Pro) |
| API price | ~$0.067/image ($60/M output tokens) |
| Text rendering accuracy | ~90% |
| Character consistency | Up to 5 characters |
| Consumer access | Free in Gemini app |
| Watermarking | SynthID + C2PA |
What Nano Banana 2 Actually Is
Architecture
Nano Banana 2 isn't a standalone diffusion model bolted onto a language model. It's natively multimodal - image generation is built directly into the Gemini 3.1 Flash architecture. This means it has access to Gemini's reasoning engine, real-time web knowledge, and conversational context in a single inference pass.
This is the fundamental architectural difference between Nano Banana and everything else in the space. DALL-E is a separate model from GPT. Midjourney is standalone. Stable Diffusion and Flux are independent diffusion models. Nano Banana creates images from inside the same model that handles text, reasoning, and web search.
The practical effect: you can ask it to "generate an infographic of today's weather forecast for Seattle" and it'll pull real-time data and render it into an image. Try that with Midjourney.
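The weather-infographic request above maps to a few lines against the Gemini API. Here's a minimal sketch using the google-genai Python SDK - the prompt and output path are illustrative, and the call pattern assumes the preview model follows the same `generate_content` interface as earlier Gemini image models:

```python
MODEL_ID = "gemini-3.1-flash-image-preview"  # preview ID from the announcement

def generate_image(prompt: str, out_path: str = "out.png") -> None:
    """Generate one image and write it to disk. Requires GEMINI_API_KEY."""
    from google import genai  # deferred import so the sketch reads without the SDK installed

    client = genai.Client()  # picks up GEMINI_API_KEY from the environment
    response = client.models.generate_content(model=MODEL_ID, contents=prompt)
    # Responses can interleave text and image parts; save the first image found.
    for part in response.candidates[0].content.parts:
        if part.inline_data is not None:
            with open(out_path, "wb") as f:
                f.write(part.inline_data.data)
            return
    raise RuntimeError("no image part in response")

# Usage (needs the SDK and an API key):
# generate_image("An infographic of today's weather forecast for Seattle")
```

Because generation runs inside the same model that handles search grounding, there is no separate "fetch data, then render" step in the request.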
The Lineage
| Version | Model ID | Release | Architecture |
|---|---|---|---|
| Nano Banana | Gemini 2.5 Flash Image | August 2025 | Gemini 2.5 Flash |
| Nano Banana Pro | Gemini 3 Pro Image | November 2025 | Gemini 3 Pro |
| Nano Banana 2 | Gemini 3.1 Flash Image | February 2026 | Gemini 3.1 Flash |
The naming comes from a DeepMind engineer who submitted the original model anonymously to the Arena leaderboard at 2 a.m. "Nano" suggested compact efficiency. "Banana" was meant to be an absurd distraction to disguise Google's involvement. The name stuck after the model went viral.
Generation and Editing
The full capability set:
Generation: Text-to-image, image-to-image transformation, conversational iterative refinement, web-grounded generation (pulls real-time data), infographics, diagrams from notes, data visualizations.
Editing: Inpainting (remove/replace objects), outpainting (extend scenes beyond original boundaries), style transfer, text rendering and translation, object removal, lighting changes, 3D-aware local edits with scene coherence (shadows, reflections, edges stay consistent).
Consistency: Maintains character resemblance for up to 5 characters and fidelity for up to 14 objects from input images in a single workflow. This matters for storyboarding, marketing campaigns, and any project requiring consistent characters across multiple outputs.
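For the multi-character workflow, reference images go into the same request as the prompt. A hedged sketch - the file names are placeholders, and the `Part` construction assumes the google-genai SDK's existing interface carries over to the preview model:

```python
def edit_with_references(prompt: str, ref_paths: list[str],
                         out_path: str = "edit.png") -> None:
    """Generate an edit that keeps the characters in ref_paths consistent."""
    from google import genai          # deferred so the sketch reads without the SDK
    from google.genai import types

    client = genai.Client()
    # Reference images first, instruction last - up to 5 characters per the specs above.
    parts = [
        types.Part.from_bytes(data=open(p, "rb").read(), mime_type="image/png")
        for p in ref_paths
    ]
    parts.append(types.Part.from_text(text=prompt))
    response = client.models.generate_content(
        model="gemini-3.1-flash-image-preview",
        contents=parts,
    )
    for part in response.candidates[0].content.parts:
        if part.inline_data is not None:
            with open(out_path, "wb") as f:
                f.write(part.inline_data.data)
            return

# Usage (needs the SDK, an API key, and real reference images):
# edit_with_references(
#     "Place these two characters together in a rainy street scene",
#     ["character_a.png", "character_b.png"],
# )
```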
Benchmarks
| Metric | Nano Banana 2 | Nano Banana Pro | Midjourney V7 | DALL-E 3 |
|---|---|---|---|---|
| FID Score (lower = better) | 12.4 | ~12 | 15.3 | N/A |
| Generation speed | 4-6 sec | 8-12 sec | 20-30 sec | 15-25 sec |
| Text rendering accuracy | ~90% | ~94% | 71% | Moderate |
| Max resolution | 4K | 4K | 2K | 1024x1024 |
| Character consistency | 5 chars | 5 chars | Limited | Limited |
| Price per image | ~$0.067 | ~$0.134 | $0.01-0.10 | Bundled |
The FID score of 12.4 puts it essentially level with Nano Banana Pro (~12) and well ahead of Midjourney V7 (15.3). Text rendering at 90% accuracy is a significant improvement over competitors - Midjourney V7 manages 71% - though it trails Nano Banana Pro's 94%. The tradeoff is explicit: you lose a few percentage points of text accuracy for roughly double the speed and half the cost.
Small text legibility drops notably: 61% at 16px, 47% at 12px. If your use case involves fine print, Nano Banana Pro remains the better choice.
Pricing and Availability
Consumer
Nano Banana 2 is the new default image generation model across all Gemini app modes (Fast, Thinking, Pro). It's free for all users - the Pro-level image generation capabilities that were previously locked behind a subscription are now available to everyone. Google AI Pro and Ultra subscribers retain access to Nano Banana Pro for specialized tasks requiring maximum quality.
The model is also live in Google Search AI Mode and Lens across 141 countries, and in Flow at zero credit cost.
Developer
| Platform | Status | Pricing |
|---|---|---|
| Gemini API | Preview | $60/M output tokens (~$0.067/image) |
| Google AI Studio | Preview | Same as API |
| Vertex AI | Preview | Enterprise pricing |
| Google Antigravity | Preview | Same as API |
| Gemini CLI | Preview | Same as API |
At roughly $0.067 per image, Nano Banana 2 is approximately 50% cheaper than Nano Banana Pro ($0.134/image). Text tokens are 75% cheaper than Pro. A 1024x1024 image consumes about 1,290 tokens.
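The token math is easy to sanity-check. Using the article's figures (not an official rate card), 1,290 tokens at $60/M works out to roughly $0.077 - slightly above the quoted ~$0.067, so treat both as approximations of preview pricing:

```python
# Back-of-envelope API cost check from the figures quoted above.
PRICE_PER_M_OUTPUT_TOKENS = 60.0   # USD, Nano Banana 2 (article figure)
TOKENS_PER_1024_IMAGE = 1_290      # approximate tokens for a 1024x1024 output

cost_per_image = TOKENS_PER_1024_IMAGE / 1_000_000 * PRICE_PER_M_OUTPUT_TOKENS
print(f"~${cost_per_image:.3f} per 1024x1024 image")   # ~$0.077

# At the article's quoted per-image rates, the 50% saving compounds at volume:
NB2, PRO = 0.067, 0.134
print(f"10,000 images/month: NB2 ${10_000 * NB2:,.0f} vs Pro ${10_000 * PRO:,.0f}")
```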
Safety
All outputs carry SynthID watermarks (invisible to the naked eye, detectable by specialized tools) and C2PA Content Credentials for provenance tracking. Google blocks watermark removal requests and images that violate content policies.
What To Watch
Text rendering tradeoff. The 4-point accuracy gap between Nano Banana 2 (~90%) and Pro (~94%) matters for production use cases involving legible text in images - marketing materials, signage, infographics with labels. For most consumer use cases, 90% is fine. For enterprise workflows where every word must be correct, Pro is still the better model.
"Preview" status. Every developer platform listing says "Preview." API behavior, rate limits, and pricing could change. If you are building production workflows on Nano Banana 2, assume the specs are provisional.
The Pomelli overlap. Google recently launched Pomelli, an AI product photography tool. With Nano Banana 2 now offering editing, style transfer, and scene-coherent object manipulation, the product lines overlap. Which tool Google recommends for what will matter for developers choosing an integration path.
Competitive response. The original Nano Banana drove 10 million new Gemini users and 200 million image edits. It temporarily pushed Gemini ahead of ChatGPT in app downloads. OpenAI, Midjourney, and Stability AI will have to respond to a model that's simultaneously faster, cheaper, and free at the consumer tier. The image generation landscape just shifted again.
The model that was named at 2 a.m. as a joke to hide Google's identity on a leaderboard is now the default image generator for Google's entire consumer ecosystem across 141 countries. Nano Banana 2 isn't the highest-quality image model Google offers - that's still Nano Banana Pro. But it is the one that most people will actually use, because it's fast, free, and good enough for what most people want to do with AI image generation. In a market where Midjourney charges subscriptions and DALL-E is bundled with ChatGPT Plus, "free and nearly as good as the best" is a compelling argument.
Sources:
- Nano Banana 2 Announcement - Google Blog
- Build with Nano Banana 2 - Google Blog
- Google launches Nano Banana 2 model with faster image generation - TechCrunch
- Nano Banana 2 is a faster version of Nano Banana Pro - Engadget
- Nano Banana 2 brings Pro quality at Flash speeds - 9to5Google
- Gemini Image Model - Google DeepMind
