Chinese Models Claim 60% of OpenRouter Token Traffic

A year ago, Chinese AI models processed less than 2% of the tokens flowing through OpenRouter. As of May 2026, six Chinese providers collectively account for more than 60% of the platform's weekly traffic, according to data from Digital Applied's Q2 2026 landscape report and backed up by multiple independent analyses.

The shift happened fast enough that no US-based provider currently holds a top-three position in weekly token volume. OpenAI holds roughly 7.5% of the platform, while Xiaomi alone commands three times that share.

TL;DR

Six Chinese providers now account for 60%+ of OpenRouter's weekly token volume, up from under 2% in early 2025
Xiaomi's MiMo-V2-Pro holds 21.1% market share - three times OpenAI's platform share
Cost is the primary driver: Chinese flagships charge $0.30/M input tokens vs $5/M for Claude Opus
Chinese models handle 49% of all coding tokens on the platform
47% of OpenRouter's users are American, making this a story of US developers choosing Chinese models

The Current Rankings

OpenRouter publishes weekly token consumption data. The April 2026 snapshot shows the following:

Provider	Flagship Model	Weekly Tokens	Share
Xiaomi	MiMo-V2-Pro	4.21T	21.1%
Alibaba	Qwen 3.6 Plus	2.77T	13.9%
MiniMax	MiniMax M2.7	1.62T	8.1%
Zhipu AI	GLM-5	1.12T	5.6%
DeepSeek	DeepSeek V3.2	1.11T	5.6%
StepFun	Step 3.5 Flash	1.07T	5.3%
OpenAI	GPT-5.5	~1.5T	7.5%

The six Chinese providers combine to 59.6%, with other smaller Chinese models pushing the total above 60%. The OpenRouter market share chart shows a "Market Share" section by model author - the visual is led by non-US providers.

Total platform volume has reached roughly 12.1 trillion tokens per week. That's a 12.7-fold increase from a year prior. Chinese models didn't steal existing traffic; they captured the majority of new demand as developer API usage scaled.

How the Overtake Happened

The February 2026 inflection is well-documented. During the week of February 9-15, Chinese models hit 4.12 trillion tokens, surpassing US models' 2.94 trillion - the first time Chinese models passed US volume on the platform. Two weeks later, Chinese volume reached 5.16 trillion against US models' 2.7 trillion. By mid-February, the combined share briefly hit 61%.

What drove it: MiniMax launched M2.5 on February 13, and Kimi K2.6 from Moonshot AI had just shipped its open-weight agent swarm architecture in late January. Zhipu released GLM-5 on February 12. Three significant Chinese releases in about three weeks, all optimized for agent and coding workloads.

The pace hasn't let up. During the week of April 27 - May 3, Chinese models contributed 7.942 trillion tokens - a 81.7% week-over-week surge - while US model usage fell 34.6% in the same period.

OpenRouter weekly token volume chart showing dramatic growth from mid-2025 to April 2026, with the chart reaching over 20T tokens per week OpenRouter's weekly token volume chart. The explosion starting in late 2025 and accelerating through early 2026 reflects the Chinese model surge. Source: openrouter.ai

The Cost Equation

Running a benchmark evaluation with Claude costs an estimated $4,811. The same evaluation with Zhipu's GLM costs $544, per CNBC reporting this week. That's a 9x gap for identical computational work.

The per-token gap is similar. MiniMax M2.7 and GLM-5 both charge $0.30 per million input tokens. Claude Opus charges $5 per million, about 16 times more. For output tokens, the spread widens further - 9.8x to 22.7x cheaper depending on the model tier.

A practical example from the ainchina.com analysis: a daily research agent processing 5 million tokens costs $125 per day with Claude and $0.70 with DeepSeek V3.2. At that ratio, cost isn't a factor to optimize around - it's a factor that changes architectural decisions entirely.

The quality delta doesn't justify the price delta, which is the critical point. MiniMax M2.5 scored 80.2% on SWE-Bench Verified at launch. Claude Opus at the same time scored 80.8%. For most workloads, the difference is negligible. Kimi K2.6 subsequently became the first open-weight model to beat GPT-5.4 on SWE-Bench Pro.

"Chinese share crossed 45% of OpenRouter traffic. Xiaomi alone has 3x OpenAI's share." - Digital Applied, April 2026

Coding Is the Beachhead

The category breakdown matters. Xiaomi's MiMo-V2-Pro and Alibaba's Qwen 3.6 Plus together account for about 49% of all coding tokens on the platform, according to Digital Applied. Coding is where developers make their highest-cost, highest-volume API calls - exactly the use case where price-performance ratio drives switching.

The 2025 OpenRouter State of AI report noted that Claude held over 60% of coding queries for most of 2025. That majority is now with Chinese models. When the highest-value use case switches, the rest tends to follow.

Two developers reviewing code on a laptop, illustrating the coding workflows where Chinese AI models have replaced US alternatives for many developer teams Coding and agent automation are the primary drivers of Chinese model adoption on OpenRouter, with two Chinese providers now handling nearly half of all coding tokens. Source: pexels.com

The 47% American user base statistic is what puts this story in sharp focus. This isn't adoption in markets where US cloud services are harder to access. Silicon Valley startups, European SaaS companies, and Indian outsourcing firms are all choosing Chinese models through a routing layer deliberately designed to make swapping easy.

What This Measures - and What It Doesn't

What They Measured

OpenRouter counts API calls routed through its platform by token volume. It covers developers who actively choose OpenRouter as an abstraction layer - mostly startups, research teams, and automated workflows. The data updates weekly and is one of the more reliable public proxies for developer model preferences.

What They Didn't

OpenRouter doesn't capture enterprise traffic. Anthropic, OpenAI, and Google each have direct enterprise contracts that bypass the platform entirely. Microsoft's Copilot, GitHub Copilot, and Google's Gemini deployments in Search and Workspace run on separate infrastructure.

Chinese model usage on OpenRouter also doesn't directly map to production deployments. Developers testing cheap alternatives for evaluation will inflate counts without those models shipping to end users. Still, the April 27 - May 3 surge to 81.7% week-over-week growth isn't evaluation traffic - that's production load.

Should You Care?

If you're building on US models as your primary API backend, the cost pressure is now structural. Chinese providers have held comparable quality at 10-17x lower pricing for several months - this isn't a promotional flash. Qwen 3.6's open weights and MiMo-V2-Pro's Apache 2.0 license also mean self-hosting is viable, which removes even the API cost completely.

For infrastructure teams, the OpenRouter numbers give you a useful calibration point. If the developer community has already run this experiment at scale and reached a verdict - 60% of weekly tokens going to Chinese models - that's a real signal about which models are winning on the quality-per-dollar axis.

Anthropic's own policy paper, cited in press coverage this week, acknowledges the US maintains only "a several months ahead" lead over Chinese counterparts. The OpenRouter data suggests that gap isn't translating into developer preference when the price difference runs to an order of magnitude.

Sources: