DeepSeek Locks 75% V4-Pro Discount - API War Escalates

DeepSeek confirmed on Thursday that the 75% discount on V4-Pro is now permanent. What was billed as a launch promotion with a May 31 expiry has become the baseline price. Output tokens cost $0.87 per million from now on. That's 34 times below what GPT-5.5 charges for the same task.

TL;DR

75% discount on V4-Pro made permanent effective May 22, 2026 - no new expiry date
Output tokens permanently at $0.87/M vs $30/M for GPT-5.5 and $25/M for Claude Opus 4.7
Input tokens at $0.435/M, against $5/M for both GPT-5.5 and Claude Opus 4.7
The pricing is possible because V4-Pro runs completely on Huawei Ascend hardware, not Nvidia GPUs
DeepSeek is simultaneously seeking its first-ever external funding round at a reported $45B valuation

The Numbers

This is what the top of the API market looks like after the permanent cut:

Model	Input $/M	Output $/M	Context	Open weight
DeepSeek V4-Flash	$0.14	$0.28	1M	Yes (MIT)
Gemini 3.5 Flash	$0.15	$0.60	1M	No
DeepSeek V4-Pro	$0.435	$0.87	1M	Yes (MIT)
Gemini 2.5 Pro	$1.25	$10.00	1M	No
OpenAI GPT-5	$2.50	$10.00	128K	No
Claude Opus 4.7	$5.00	$25.00	200K	No
OpenAI GPT-5.5	$5.00	$30.00	128K	No

The performance disclaimer still applies. DeepSeek V4-Pro's own technical documentation states it trails GPT-5.4 and Gemini 3.1 Pro on world-knowledge tasks by "approximately three to six months." On coding benchmarks - where most enterprise AI spending is concentrated - V4-Pro sits within one to two percentage points of the closed-source frontier. The price gap there's $0.87 vs $30.

Server rack in a data center with blue indicator lights DeepSeek's V4-Pro API now offers 1M-token context at $0.435/M input and $0.87/M output - the new baseline for frontier-class API pricing. Source: pexels.com

Three Weeks to Forever

The discount's history matters for understanding what Thursday's announcement means. DeepSeek launched V4-Pro on April 24 at $1.74/M input and $3.48/M output. Days later, it announced a 75% promotional discount with an initial expiry of May 5. Before that date arrived, the company extended the promotion to May 31. On May 22, it removed the expiry completely.

Moving from a temporary launch promotion to permanent pricing isn't an accounting adjustment. DeepSeek has publicly stated its priority is "expanding technological boundaries and securing a leading position in AGI" rather than near-term monetization. The company doesn't need margin from its API. That structural freedom to price aggressively comes directly from the hardware it runs on.

The Huawei Equation

Western incumbents often attribute DeepSeek's pricing to subsidies or unsustainable loss-leading. The actual explanation is more uncomfortable for Washington.

V4-Pro runs on Huawei Ascend 950 processors manufactured by SMIC using domestic Chinese fab technology. US export controls blocked DeepSeek's access to Nvidia H100, H200, and Blackwell chips - the hardware underpinning every US frontier model. Constrained to less capable hardware, DeepSeek's engineers optimized around it with results that are hard to dismiss. V4-Pro activates only 49 billion of its 1.6 trillion parameters at inference. KV cache usage at a 1M-token context window is 10% of what V3.2 required. Single-token FLOPs dropped 73% versus the prior generation.

These aren't gradual improvements. They're the product of architectural work driven by restricted hardware access - the opposite of what the export controls were intended to produce.

Huawei is scaling Ascend 950 production through 2026. As that supply grows, DeepSeek's inference cost structure improves further. The company has stated prices could fall again.

"China has effectively closed the performance gap with US rivals. America still produces more top-tier models and higher-impact patents, but the margin has compressed materially," according to Stanford's AI Index 2026, published in April.

The Funding Context

The permanent pricing announcement coincides with DeepSeek's first-ever external funding round. The valuation climbed from a $10B initial target in mid-April to a reported $45B, led by China's state-backed Big Fund (China Integrated Circuit Industry Investment Fund) with Tencent and Alibaba in talks. Founder Liang Wenfeng controls roughly 90% of the company and reportedly initiated the raise to give employees equity, not because the business needed cash.

A company targeting a $45B valuation while permanently pricing its API at $0.87/M output tokens isn't mainly optimizing for revenue. It's optimizing for developer adoption. Once the ecosystem builds on DeepSeek's API, migration costs become real and pricing can evolve from a position of established usage - that's the playbook.

Stacked gold coins on a white surface DeepSeek's permanent price cut sets a new reference point for frontier-class API negotiations. Source: pexels.com

Counter-Argument

The permanent discount isn't a clean competitive win. Several objections hold up under scrutiny.

Performance gaps are real. V4-Pro's own documentation acknowledges it trails GPT-5.4 and Gemini 3.1 Pro on factual knowledge and highest-order reasoning. For legal, medical, and financial workloads where accuracy takes precedence over cost, the discount doesn't change the calculus.

Enterprise security is a hard constraint. Routing sensitive workloads through a Chinese provider carries legal and geopolitical risk that no price discount can offset for regulated industries. European banks, US defense contractors, and healthcare systems operating under HIPAA or GDPR aren't migrating to DeepSeek regardless of pricing.

The distillation allegations are unresolved. Anthropic and OpenAI have filed letters with US legislators accusing DeepSeek of flooding Claude with tens of millions of crafted prompts from fraudulently created accounts to train its own models. Enterprise procurement teams in regulated sectors have to factor in both the legal exposure and the reputational risk of doing business with a company under that scrutiny.

API reliability has been variable. V4-Pro carries a 500-concurrent-request limit. DeepSeek's direct API has experienced availability issues during demand spikes. For high-volume enterprise deployments, that ceiling is a real operational constraint.

What the Market Is Missing

Framing this as "Chinese model vs US labs" undersells the actual mechanism at work.

This isn't a performance race. It's a floor-setting event in the API pricing market. When a credible, MIT-licensed model permanently offers competitive coding performance at $0.87/M output tokens, every enterprise buyer negotiating a renewal with OpenAI or Anthropic now has a benchmark for what the underlying inference actually costs. The premium for a closed-source model has to be justified by a capability gap that buyers can measure - and in coding, the category where AI ROI is clearest, that gap is small enough that many will calculate it for the first time.

OpenAI and Anthropic won't match $0.87/M output tokens. Their cost structures don't allow it. But the number now sits at every enterprise negotiation table whether they want it there or not. Contracts signed this quarter will look different from ones signed before April 24.

The market didn't react the way it did when DeepSeek's R1 in early 2025 wiped roughly $500 billion from Nvidia's market cap in a single session. That suggests the industry has already absorbed Chinese AI competitiveness as a structural fact rather than a shock. Thursday's announcement is the confirmation of that new baseline, not a surprise departure from it.

Sources: