OpenAI's New Mini and Nano Slash GPT-5.4 Pricing
OpenAI released GPT-5.4 mini and nano on March 17, bringing near-flagship performance at 70% and 92% lower cost, respectively.

OpenAI released two new models on March 17: GPT-5.4 mini and GPT-5.4 nano. Mini costs 70% less than the flagship GPT-5.4 and closes most of the performance gap on coding and reasoning benchmarks. Nano goes further - at $0.20 per million input tokens, it's the cheapest model in the GPT-5 family and undercuts Google's Gemini 3.1 Flash-Lite on price.
The pair extends the GPT-5.4 family that launched earlier this month with computer use and Office integration. What's different now is accessibility: mini is available to ChatGPT's free tier for the first time, and nano is a pure API product aimed at developers running high-volume, low-complexity tasks.
TL;DR
- GPT-5.4 mini scores 54.4% on SWE-Bench Pro vs 57.7% for the flagship, at 30% of the cost
- GPT-5.4 nano is $0.20/M input tokens - cheaper than Gemini 3.1 Flash-Lite ($0.25/M)
- Both models have a 400,000-token context window
- Mini runs more than 2x faster than the previous GPT-5 mini
- Nano is API-only; mini is available to ChatGPT Free and Go tier users
| Model | Input (per M tokens) | Output (per M tokens) | Context |
|---|---|---|---|
| GPT-5.4 nano | $0.20 | $1.25 | 400K |
| GPT-5.4 mini | $0.75 | $4.50 | 400K |
| GPT-5.4 (flagship) | $2.50 | $15.00 | 400K |
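The tiering is uniform: mini's input and output rates are both exactly 30% of the flagship's, so a request costs 30% as much regardless of the input/output mix. A quick sketch using the rates from the table above (the model strings are labels for the table rows, not confirmed API identifiers):

```python
# Per-million-token prices from the pricing table above (USD).
PRICES = {
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4":      {"input": 2.50, "output": 15.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed per-million rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a 10K-in / 2K-out request on mini vs the flagship.
mini = request_cost("gpt-5.4-mini", 10_000, 2_000)
flagship = request_cost("gpt-5.4", 10_000, 2_000)
print(f"mini ${mini:.4f} vs flagship ${flagship:.4f} ({mini / flagship:.0%})")
# → mini $0.0165 vs flagship $0.0550 (30%)
```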
What Mini Can Do
Coding and Reasoning
Mini scores 54.4% on SWE-Bench Pro against the flagship's 57.7% - a gap of 3.3 percentage points at 70% lower cost. On GPQA Diamond, which tests graduate-level science reasoning, mini reaches 88.0% vs 93.0% for the full model. For most developer use cases, the difference isn't meaningful.
The agentic tool benchmarks are more striking. On Toolathlon, mini scores 42.9% vs GPT-5 mini's 26.9% - a 16-point jump. On MCP Atlas, it hits 57.7% against 47.6% for its predecessor. These are the evaluations that matter most for anyone building agentic AI workflows.
Computer Use
OSWorld-Verified puts mini at 72.1%, with the flagship at 75.0%. A 2.9-point gap suggests mini is a viable - and substantially cheaper - option for computer use tasks that don't require pushing the absolute ceiling.
Speed
OpenAI says mini runs more than 2x faster than the previous GPT-5 mini. Community throughput measurements put mini around 180-190 tokens per second and nano around 200, compared with roughly 55-60 tokens per second for the older GPT-5 mini at normal priority.
OpenAI launched mini and nano on March 17 as the newest additions to the GPT-5.4 family.
Source: dataconomy.com
Nano: Built for Sub-Tasks
OpenAI's framing for nano is direct: use it for classification, data extraction, ranking, and coding subagents that handle simpler supporting tasks. At $0.20 per million input tokens, developer Simon Willison calculated he could describe 76,000 photos for around $52 total - about 0.069 cents per image.
For context, Google's Gemini 3.1 Flash-Lite costs $0.25 per million input tokens, making nano 20% cheaper on input. Nano's output is cheaper too: $1.25/M against Flash-Lite's $1.50/M.
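Willison's figure is straightforward to reproduce. The per-photo token counts below are illustrative assumptions chosen to land near his ~$52 total; they aren't published numbers:

```python
# GPT-5.4 nano rates from the article (USD per token).
INPUT_PRICE = 0.20 / 1_000_000
OUTPUT_PRICE = 1.25 / 1_000_000

photos = 76_000
# Assumed per-photo token budget (illustrative, not from the announcement):
input_tokens = 2_800   # image tokens plus a short prompt
output_tokens = 100    # a brief description in response

per_photo = input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE
total = per_photo * photos
# per_photo works out to roughly 0.069 cents, total to roughly $52.
print(f"${total:.2f} total for {photos:,} photos")
```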
Nano is available via the API only. There's no ChatGPT interface access at launch.
Where These Models Sit in OpenAI's Stack
Mini now fills ChatGPT's Free and Go tiers as the primary "Thinking" option, serving as a rate-limit fallback for GPT-5.4 Thinking on paid plans. It's also generally available in GitHub Copilot as of the same release date. In Codex, mini counts as 30% of the GPT-5.4 quota.
Both models support the full capability suite: text and image inputs, tool use, function calling, web search, file search, computer use, and skills.
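Function calling on these models presumably follows the standard OpenAI request shape. The sketch below only builds the JSON payload for a classification sub-task (the kind of workload OpenAI positions nano for); the model id is assumed from the article's naming, the `classify_ticket` tool is invented for illustration, and nothing is sent over the network:

```python
import json

# Hypothetical model id based on the article's naming; not a confirmed API string.
MODEL = "gpt-5.4-nano"

# One tool definition in the OpenAI function-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "classify_ticket",
        "description": "Route a support ticket to a queue.",
        "parameters": {
            "type": "object",
            "properties": {
                "queue": {"type": "string", "enum": ["billing", "bug", "other"]},
            },
            "required": ["queue"],
        },
    },
}]

# Request body a chat-completions-style endpoint would accept.
payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "My invoice is wrong."}],
    "tools": tools,
}
print(json.dumps(payload, indent=2))
```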
OpenAI has been releasing new GPT-5.4 family members at a faster pace through March 2026.
Source: the-decoder.com
What the Announcement Doesn't Tell You
The Long-Context Performance Gap
The benchmark that most concerns practitioners doesn't appear in OpenAI's headline claims. On OpenAI's own MRCR v2 evaluation - 8 needles retrieved across 64K to 128K of context - mini scores 47.7% while the flagship scores 86.0%. That's a 38-point gap at the context lengths where extended reasoning matters most.
For document review, legal analysis, or anything requiring coherent comprehension across long texts, mini is a different product category from the flagship. The 400K context window is available, but accuracy within it degrades substantially at depth.
Understanding what benchmarks do and don't measure matters here - our benchmarks guide covers the methodology behind evaluations like MRCR.
Nano Has Few Benchmarks
OpenAI published only two evaluation scores for nano: OSWorld-Verified at 39.0% and Terminal-Bench 2.0 at 46.3%. There are no coding scores, no GPQA results, and no SWE-Bench numbers. That's sparse disclosure for a model entering a competitive small-model market in which Google and Anthropic publish detailed benchmark suites for their smaller Gemini and Claude variants.
Developers have no basis for comparing nano to alternatives from other labs on reasoning or coding tasks. Whether that's because nano underperforms or because OpenAI hasn't focused on the comparison is unclear. Either way, the missing data is a gap worth noting before committing workloads to it.
The release confirms a pattern across the GPT-5 family: OpenAI is tiering capability by price more aggressively than at any previous model generation. Mini captures 94% of the flagship's SWE-Bench performance at 30% of the cost. The tradeoff is real but narrow on most benchmarks - except long-context retrieval, where the gap is wide enough to matter for a significant class of applications.
Sources: OpenAI announcement via 9to5Mac | Simon Willison | Adam Holter benchmark analysis | Dataconomy | GitHub Copilot Changelog
