Xiaomi MiMo-V2-Pro - Agentic 1T MoE Model
Xiaomi's MiMo-V2-Pro is a 1-trillion-parameter MoE model with 42B active params, 1M context, and agentic coding performance that rivals Claude Sonnet 4.6 at a fraction of the cost.

Xiaomi entered the frontier AI race on March 18, 2026, not with a press conference, but with a reveal. For roughly a week before the official announcement, MiMo-V2-Pro had been running on OpenRouter under the codename Hunter Alpha - a mystery model that topped usage charts, consumed hundreds of billions of tokens, and had the AI community convinced it was DeepSeek V4.
TL;DR
- Agentic 1T-parameter MoE model with 42B active params, built specifically for coding and agent workflows
- 1M token context window; $1/$3 per million input/output tokens at standard context - roughly 5x cheaper than Claude Sonnet 4.6
- SWE-bench Verified 78.0% vs. Claude Sonnet 4.6's 79.6% - near-identical coding performance at a fraction of the price
Overview
MiMo-V2-Pro is the flagship of Xiaomi's second-generation MiMo family, a series of models the company says is designed explicitly for agentic and coding workloads. The lead researcher, Luo Fuli, previously worked at DeepSeek - which explains why so many people jumped to the wrong conclusion when Hunter Alpha appeared.
The model uses a Mixture-of-Experts architecture with over 1 trillion total parameters, activating 42 billion per token during inference. That's a 2.8x increase in active capacity over MiMo-V2-Flash's 15B active parameters. Architecturally, the model adds a 7:1 hybrid attention ratio (local sliding window to global attention, up from 5:1 in Flash) and a lightweight Multi-Token Prediction layer that cuts latency on agentic workflows by generating multiple tokens in parallel.
The weights aren't public - MiMo-V2-Pro is API-only, accessible through Xiaomi's own platform at platform.xiaomimimo.com and through OpenRouter as xiaomi/mimo-v2-pro. The companion MiMo-V2-Flash model (310B total, 15B active) does carry an MIT license on HuggingFace.
Key Specifications
| Specification | Details |
|---|---|
| Provider | Xiaomi |
| Model Family | MiMo V2 |
| Total Parameters | Over 1 trillion (exact count not disclosed) |
| Active Parameters | 42B per token |
| Architecture | Mixture-of-Experts, Hybrid Attention 7:1, MTP layer |
| Context Window | 256K standard, 1M extended |
| Max Completion | 131,072 tokens |
| Input Price | $1.00/M (0-256K), $2.00/M (256K-1M) |
| Output Price | $3.00/M (0-256K), $6.00/M (256K-1M) |
| Release Date | March 18, 2026 |
| License | Proprietary (API-only) |
Benchmark Performance
Xiaomi's published benchmarks focus on agentic and coding tasks, which is where the model's training emphasis shows.
| Benchmark | MiMo-V2-Pro | Claude Sonnet 4.6 | Claude Opus 4.6 |
|---|---|---|---|
| SWE-bench Verified | 78.0% | 79.6% | 80.8% |
| ClawEval (agentic) | 61.5 (#3 globally) | ~58 | 66.3 |
| Terminal-Bench 2.0 | 86.7 | - | - |
| GPQA Diamond | 87% | - | - |
| GDPval-AA Elo | 1426 | - | - |
| AI Intelligence Index | #8 global, #2 Chinese | - | - |
On SWE-bench Verified - the coding benchmark that has become the de facto standard for assessing coding models - MiMo-V2-Pro sits 1.6 points below Sonnet 4.6 and 2.8 points below Opus 4.6. Given that Pro costs roughly a fifth of Sonnet's price and a twenty-fifth of Opus's at comparable context lengths, that gap matters less than the pricing math.
The ClawEval result is worth watching. ClawEval assesses complex agent scaffold performance - multi-turn tool use, long-horizon planning, agentic recovery from errors - and MiMo-V2-Pro's 61.5 puts it well above GPT-5.2's 50.0 on the same benchmark. Opus 4.6 still leads at 66.3, but the gap to Sonnet is now effectively closed.
VentureBeat ran their full benchmark index and found the total cost was $348 for MiMo-V2-Pro versus $2,304 for GPT-5.2 and $2,486 for Claude Opus 4.6. That's the kind of number that moves procurement decisions.
The Hunter Alpha Story
Before the official launch, MiMo-V2-Pro ran anonymously on OpenRouter as "Hunter Alpha" starting around March 11, 2026. The model built up roughly 500 billion tokens per week of usage - extraordinary for an unannounced model. Developers noted the Chinese language preference when probed, the agentic optimization, and the performance profile that sat between Sonnet and Opus on coding tasks.
Most speculation pointed to DeepSeek V4. That made sense: Luo Fuli, who led development of MiMo-V2-Pro, had previously worked at DeepSeek, and the DeepSeek V4 speculation had been building for months. The reveal on March 18 surprised most people who had been following the story.
Xiaomi confirmed the identity and simultaneously announced that developers partnering with OpenClaw, OpenCode, KiloCode, Blackbox, and Cline would get free API access during the launch week. The full Hunter Alpha mystery story has additional context on the community speculation that preceded the announcement.
"I am a Chinese AI model primarily trained in Chinese," Hunter Alpha told a developer who asked who built it. It wouldn't say more.
Key Capabilities
Agentic Coding
MiMo-V2-Pro's training was explicitly tuned for agentic workloads via supervised fine-tuning and reinforcement learning across complex agent scaffolds. The Multi-Token Prediction layer reduces latency in multi-step agentic loops - each completion produces multiple tokens in parallel, which adds up over hundreds of tool calls.
SWE-bench Verified at 78.0% puts it in the top tier for autonomous code repair on real-world GitHub issues. Xiaomi's internal coding accuracy benchmark showed 92.5%, which they report as surpassing Claude Sonnet 4.6 - though the exact methodology isn't published, so treat that as a directional signal rather than a comparable number to third-party evaluations.
Long-Context Reasoning
The 1M token extended context window puts MiMo-V2-Pro alongside Claude Opus 4.6 and Claude Sonnet 4.6 in the small group of frontier models that can ingest a full codebase, a long legal document, or an extended conversation history without chunking. The extended range (256K-1M) does carry a 2x price premium on input tokens, so the cost advantage narrows for very long context workloads.
Extended Reasoning
The model supports configurable thinking via `<think>` / `</think>` tags, similar to DeepSeek R1 and Mistral Small 4's reasoning mode. This allows developers to trade latency for accuracy on tasks where extended deliberation improves output quality.
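On the client side, a consumer of this mode typically needs to separate the reasoning trace from the final answer. A minimal sketch, assuming completions wrap deliberation in literal `<think>...</think>` tags (the exact response format isn't documented here, so the tag placement is an assumption):

```python
import re

def split_thinking(response_text: str) -> tuple[str, str]:
    """Split a completion into (reasoning trace, final answer).

    Assumes the model wraps its deliberation in a single
    <think>...</think> block, as described for MiMo-V2-Pro's
    extended reasoning mode. Returns an empty trace if no
    thinking block is present.
    """
    match = re.search(r"<think>(.*?)</think>", response_text, re.DOTALL)
    if match is None:
        return "", response_text.strip()
    thinking = match.group(1).strip()
    # Everything outside the think block is the user-facing answer.
    answer = (response_text[:match.start()] + response_text[match.end():]).strip()
    return thinking, answer

thinking, answer = split_thinking(
    "<think>2 and 3 are both prime; 2 * 3 = 6.</think>The answer is 6."
)
```

In an agent loop, only `answer` would be fed back to the user or the next tool call; the trace can be logged for debugging.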
Pricing and Availability
Access is available through two channels:
- Xiaomi MiMo Platform: platform.xiaomimimo.com with OpenAI-compatible endpoints (base URL https://api.xiaomimimo.com/v1, model ID xiaomi/mimo-v2-pro)
- OpenRouter: listed as xiaomi/mimo-v2-pro, with third-party providers offering the model
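Because the endpoints are OpenAI-compatible, a request is just a standard chat-completions payload pointed at Xiaomi's base URL. A minimal sketch that builds one - the URL and model ID come from the listing above, while `max_tokens` and the message schema are standard chat-completions fields rather than anything Xiaomi-specific:

```python
import json

# OpenAI-compatible chat-completions endpoint, per Xiaomi's platform listing.
API_URL = "https://api.xiaomimimo.com/v1/chat/completions"
MODEL_ID = "xiaomi/mimo-v2-pro"

def build_request(prompt: str, max_tokens: int = 1024) -> str:
    """Serialize a chat-completions request body for MiMo-V2-Pro."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload)

body = build_request("Refactor this function to be iterative.")
```

Any OpenAI-compatible client (the `openai` SDK with `base_url` overridden, or a plain HTTP POST with an `Authorization: Bearer` header) should be able to send this body as-is.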
Pricing tiers:
| Context Range | Input | Output | Cache Read |
|---|---|---|---|
| 0 - 256K tokens | $1.00/M | $3.00/M | $0.20/M |
| 256K - 1M tokens | $2.00/M | $6.00/M | $0.40/M |
For context on where this sits relative to the competition: Claude Sonnet 4.6 is $3.00/$15.00 and Opus 4.6 is $5.00/$25.00 at standard context. The cost comparison on the cost efficiency leaderboard will be updated once the model's performance stabilizes.
Cache write is free (temporary caching). Xiaomi hasn't published enterprise or volume pricing.
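The tiered rates above translate into a quick cost estimator. One caveat: Xiaomi hasn't documented the exact tiering mechanics, so this sketch assumes the entire request is billed at the rate of the tier its input length falls into (the way Gemini-style tiered pricing typically works), rather than marginally per tier:

```python
def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD under MiMo-V2-Pro's published tiers.

    Assumption: the whole request is billed at the rate of the tier
    its input length falls into - standard up to 256K tokens,
    extended beyond that. Cache pricing is ignored.
    """
    if input_tokens <= 256_000:
        in_rate, out_rate = 1.00, 3.00   # $/M tokens, standard tier
    else:
        in_rate, out_rate = 2.00, 6.00   # $/M tokens, extended tier
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

standard = estimate_cost(100_000, 8_000)   # typical coding-agent turn
extended = estimate_cost(400_000, 10_000)  # full-codebase request
```

A 100K-input / 8K-output request lands around $0.12; the same shape at 400K of input jumps to roughly $0.86, which is where the extended-tier premium starts to matter.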
Strengths
- Near-Sonnet-level SWE-bench performance at roughly 5x lower cost per token
- Top-tier agentic performance: ClawEval 61.5, #3 globally
- 1M token context window with tiered pricing
- Configurable extended reasoning via thinking tags
- MTP layer for reduced latency in agentic loops
- Available on OpenRouter day-one with third-party support
Weaknesses
- Weights not public - no self-hosting option for the Pro tier
- Exact total parameter count not disclosed, which complicates infrastructure planning
- No multimodal input for the Pro tier (MiMo-V2-Omni handles image/video/audio but costs less and scores lower on benchmarks)
- MMLU-Pro standalone score not published, limiting apples-to-apples comparison on knowledge benchmarks
- Xiaomi's long-term API reliability and geographic availability are unproven at frontier scale
The MiMo-V2 Family
MiMo-V2-Pro is one of three models released simultaneously:
MiMo-V2-Pro - 1T+ total / 42B active, text-only, $1/$3 per 1M tokens. The flagship for coding and agentic workflows. API-only.
MiMo-V2-Omni - Multimodal (text, image, video, audio). Processes 10+ hours of continuous audio in a single request. ClawEval 54.8. $0.40 input / $2.00 output per 1M tokens. Leads on MM-BrowserComp web navigation benchmarks.
MiMo-V2-Flash - 310B total / 15B active. MIT license, open weights on HuggingFace. The self-hostable option for teams running local inference.
Related Coverage
- Hunter Alpha on OpenRouter - Is This DeepSeek V4? - Elena Marchetti's coverage of the mystery model before the Xiaomi reveal
- Coding Benchmarks Leaderboard - Full SWE-bench rankings including MiMo-V2-Pro
- Agentic AI Benchmarks Leaderboard - ClawEval and Terminal-Bench rankings
- Cost Efficiency Leaderboard - Price-performance comparisons across frontier models
- Claude Sonnet 4.6 - Closest benchmark competitor
- DeepSeek V4 - The model everyone thought Hunter Alpha was
FAQ
Is MiMo-V2-Pro open source?
No. The Pro tier is API-only with no public weights. The smaller MiMo-V2-Flash (310B total, 15B active) is MIT-licensed and available on HuggingFace.
How does MiMo-V2-Pro compare to Claude Sonnet 4.6 on coding?
SWE-bench Verified: MiMo-V2-Pro 78.0% vs. Sonnet 4.6 79.6%. The gap is 1.6 points. At standard context, MiMo costs $1/$3 vs. Sonnet's $3/$15 per million tokens.
What was Hunter Alpha?
Hunter Alpha was the anonymous OpenRouter deployment of MiMo-V2-Pro that ran from about March 11 to March 18, 2026. It consumed 500B tokens per week and was widely mistaken for DeepSeek V4.
Does MiMo-V2-Pro support multimodal input?
No. The Pro model handles text only. MiMo-V2-Omni handles text, image, video, and audio at $0.40/$2.00 per million tokens.
What is the maximum context window?
Standard context is 256K tokens; extended context is 1M tokens. Maximum completion is 131,072 tokens. Extended context carries a 2x price premium ($2.00/$6.00 per million tokens).
Where can I access the API?
Directly at platform.xiaomimimo.com (model ID: xiaomi/mimo-v2-pro, endpoint: https://api.xiaomimimo.com/v1). Also available on OpenRouter.
Sources:
- Xiaomi MiMo-V2-Pro Official Page
- OpenRouter - MiMo-V2-Pro
- Artificial Analysis - Model Rankings
- VentureBeat - Xiaomi stuns with MiMo-V2-Pro
- The Decoder - Xiaomi launches three MiMo AI models
- The Japan Times - Mystery AI model is Xiaomi MiMo-V2-Pro
- Quasa.io - Xiaomi Unleashes MiMo-V2 Family
- Apidog - MiMo-V2-Pro pricing and API
- HuggingFace - XiaomiMiMo organization
- DEV Community - Hunter Alpha mystery solved
✓ Last verified March 24, 2026
