OpenAI's New Mini and Nano Slash GPT-5.4 Pricing
OpenAI released GPT-5.4 mini and nano on March 17, bringing near-flagship performance at 70% and 92% lower cost, respectively.

OpenAI released two new models on March 17: GPT-5.4 mini and GPT-5.4 nano. Mini costs 70% less than the flagship GPT-5.4 and closes most of the performance gap on coding and reasoning benchmarks. Nano goes further - at $0.20 per million input tokens, it's the cheapest model in the GPT-5 family and undercuts Google's Gemini 3.1 Flash-Lite on price.
The pair extends the GPT-5.4 family that launched earlier this month with computer use and Office integration. What's different now is accessibility: mini is available to ChatGPT's free tier for the first time, and nano is a pure API product aimed at developers running high-volume, low-complexity tasks.
TL;DR
- GPT-5.4 mini scores 54.4% on SWE-Bench Pro vs 57.7% for the flagship, at 30% of the cost
- GPT-5.4 nano is $0.20/M input tokens - cheaper than Gemini 3.1 Flash-Lite ($0.25/M)
- Both models have a 400,000-token context window
- Mini runs more than 2x faster than the previous GPT-5 mini
- Nano is API-only; mini is available to ChatGPT Free and Go tier users
| Model | Input (per M tokens) | Output (per M tokens) | Context |
|---|---|---|---|
| GPT-5.4 nano | $0.20 | $1.25 | 400K |
| GPT-5.4 mini | $0.75 | $4.50 | 400K |
| GPT-5.4 (flagship) | $2.50 | $15.00 | 400K |
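The tiering is uniform: mini's input and output rates are both exactly 30% of the flagship's, so a request costs 30% as much regardless of the input/output mix. A quick sketch using the rates from the table above (the model strings are labels for the table rows, not confirmed API identifiers):

```python
# Per-million-token prices from the pricing table above (USD).
PRICES = {
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4":      {"input": 2.50, "output": 15.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed per-million rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a 10K-in / 2K-out request on mini vs the flagship.
mini = request_cost("gpt-5.4-mini", 10_000, 2_000)
flagship = request_cost("gpt-5.4", 10_000, 2_000)
print(f"mini ${mini:.4f} vs flagship ${flagship:.4f} ({mini / flagship:.0%})")
# → mini $0.0165 vs flagship $0.0550 (30%)
```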
What Mini Can Do
Coding and Reasoning
Mini scores 54.4% on SWE-Bench Pro against the flagship's 57.7% - a gap of 3.3 percentage points at 70% lower cost. On GPQA Diamond, which tests graduate-level science reasoning, mini reaches 88.0% vs 93.0% for the full model. For most developer use cases, the difference isn't meaningful.
The agentic tool benchmarks are more striking. On Toolathlon, mini scores 42.9% vs GPT-5 mini's 26.9% - a 16-point jump. On MCP Atlas, it hits 57.7% against 47.6% for its predecessor. These are the evaluations that matter most for anyone building agentic AI workflows.
Computer Use
OSWorld-Verified puts mini at 72.1%, with the flagship at 75.0%. A 2.9-point gap suggests mini is a viable - and substantially cheaper - option for computer use tasks that don't require pushing the absolute ceiling.
Speed
OpenAI says mini runs more than 2x faster than the previous GPT-5 mini. Community throughput measurements put mini around 180-190 tokens per second and nano around 200, compared with roughly 55-60 tokens per second for the older GPT-5 mini at normal priority.
OpenAI launched mini and nano on March 17 as the newest additions to the GPT-5.4 family.
Source: dataconomy.com
Nano: Built for Sub-Tasks
OpenAI's framing for nano is direct: use it for classification, data extraction, ranking, and coding subagents that handle simpler supporting tasks. At $0.20 per million input tokens, developer Simon Willison calculated he could describe 76,000 photos for around $52 total - about 0.069 cents per image.
For context, Google's Gemini 3.1 Flash-Lite costs $0.25 per million input tokens, making nano 20% cheaper on input. Nano's output is cheaper too: $1.25/M against Flash-Lite's $1.50/M.
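Willison's figure is straightforward to reproduce. The per-photo token counts below are illustrative assumptions chosen to land near his ~$52 total; they aren't published numbers:

```python
# GPT-5.4 nano rates from the article (USD per token).
INPUT_PRICE = 0.20 / 1_000_000
OUTPUT_PRICE = 1.25 / 1_000_000

photos = 76_000
# Assumed per-photo token budget (illustrative, not from the announcement):
input_tokens = 2_800   # image tokens plus a short prompt
output_tokens = 100    # a brief description in response

per_photo = input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE
total = per_photo * photos
# per_photo works out to roughly 0.069 cents, total to roughly $52.
print(f"${total:.2f} total for {photos:,} photos")
```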
Nano is available via the API only. There's no ChatGPT interface access at launch.
Where These Models Sit in OpenAI's Stack
Mini now fills ChatGPT's Free and Go tiers as the primary "Thinking" option, serving as a rate-limit fallback for GPT-5.4 Thinking on paid plans. It's also generally available in GitHub Copilot as of the same release date. In Codex, mini counts as 30% of the GPT-5.4 quota.
Both models support the full capability suite: text and image inputs, tool use, function calling, web search, file search, computer use, and skills.
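Function calling on these models presumably follows the standard OpenAI request shape. The sketch below only builds the JSON payload for a classification sub-task (the kind of workload OpenAI positions nano for); the model id is assumed from the article's naming, the `classify_ticket` tool is invented for illustration, and nothing is sent over the network:

```python
import json

# Hypothetical model id based on the article's naming; not a confirmed API string.
MODEL = "gpt-5.4-nano"

# One tool definition in the OpenAI function-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "classify_ticket",
        "description": "Route a support ticket to a queue.",
        "parameters": {
            "type": "object",
            "properties": {
                "queue": {"type": "string", "enum": ["billing", "bug", "other"]},
            },
            "required": ["queue"],
        },
    },
}]

# Request body a chat-completions-style endpoint would accept.
payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "My invoice is wrong."}],
    "tools": tools,
}
print(json.dumps(payload, indent=2))
```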
OpenAI has been releasing new GPT-5.4 family members at a faster pace through March 2026.
Source: the-decoder.com
What the Announcement Doesn't Tell You
The Long-Context Performance Gap
The benchmark that most concerns practitioners doesn't appear in OpenAI's headline claims. On OpenAI's own MRCR v2 evaluation - 8 needles retrieved across 64K to 128K of context - mini scores 47.7% while the flagship scores 86.0%. That's a 38-point gap at the context lengths where extended reasoning matters most.
For document review, legal analysis, or anything requiring coherent comprehension across long texts, mini is a different product category from the flagship. The 400K context window is available, but accuracy within it degrades substantially at depth.
Understanding what benchmarks do and don't measure matters here - our benchmarks guide covers the methodology behind evaluations like MRCR.
Nano Has Few Benchmarks
OpenAI published only two evaluation scores for nano: OSWorld-Verified at 39.0% and Terminal-Bench 2.0 at 46.3%. There are no coding scores, no GPQA results, and no SWE-Bench numbers. That's sparse disclosure for a model entering a competitive small-model market in which Google and Anthropic publish detailed benchmark suites for their smaller Gemini and Claude variants.
Developers have no basis for comparing nano to alternatives from other labs on reasoning or coding tasks. Whether that's because nano underperforms or because OpenAI hasn't focused on the comparison is unclear. Either way, the missing data is a gap worth noting before committing workloads to it.
The release confirms a pattern across the GPT-5 family: OpenAI is tiering capability by price more aggressively than at any previous model generation. Mini captures 94% of the flagship's SWE-Bench performance at 30% of the cost. The tradeoff is real but narrow on most benchmarks - except long-context retrieval, where the gap is wide enough to matter for a significant class of applications.
Sources: OpenAI announcement via 9to5Mac | Simon Willison | Adam Holter benchmark analysis | Dataconomy | GitHub Copilot Changelog
