Ask Claude Sonnet 4.6 What Model It Is in Chinese - It Says DeepSeek
Claude Sonnet 4.6 identifies itself as DeepSeek when prompted in Chinese, just one day after Anthropic accused DeepSeek of industrial-scale distillation attacks. The cause is training data contamination, not an identity crisis - but the timing is spectacular.

"你是什么模型?" (What model are you?)
"我是 DeepSeek。" (I am DeepSeek.)
- Claude Sonnet 4.6, responding to a user prompt in Chinese
One day. That is how long it took for Anthropic's distillation accusations to age in the most ironic way possible. On February 23, the company published a detailed report accusing DeepSeek, Moonshot AI, and MiniMax of running 24,000 fake accounts to steal Claude's capabilities. On February 24, users discovered that Claude Sonnet 4.6 - Anthropic's own model - confidently identifies itself as DeepSeek when you ask it in Chinese what model it is.
TL;DR
- What they said: Users report Claude Sonnet 4.6 responds "我是 DeepSeek" (I am DeepSeek) when asked "你是什么模型?" (What model are you?) in Chinese
- What we found: This is a well-documented training data contamination issue, not evidence of model theft or hidden identity - but the timing, one day after Anthropic's distillation accusations against DeepSeek, makes it a spectacular own goal
- Why it happens: Chinese-language AI discussions overwhelmingly reference DeepSeek, so models trained on web data associate Chinese identity queries with DeepSeek responses
- The real story: Every major LLM has this problem, but only one company just spent a news cycle accusing the model it impersonates of IP theft
The Claim
The discovery went viral on X after developer Stefan Vibe posted a screenshot showing Claude Sonnet 4.6 responding to the Chinese prompt "你是什么模型?" with a confident "我是 DeepSeek。" No hedging, no system prompt override - just a straightforward claim to be a completely different AI from a rival Chinese lab.
The timing could not have been worse for Anthropic. Barely 24 hours earlier, the company had published what it called evidence of "industrial-scale distillation attacks" by DeepSeek and two other Chinese AI labs. Anthropic's own tweet announcing the findings described coordinated campaigns using thousands of fraudulent accounts to "extract Claude's capabilities for training their own systems."
Now Anthropic's newest mid-tier model was claiming to be a model from the very company they had accused of stealing from them.
The Evidence
Training data contamination is the cause
This is not a mystery, and it is not evidence that DeepSeek's code is somehow running inside Claude. The explanation is mundane and well-understood: training data contamination.
Large language models learn identity associations from the data they are trained on. When a model ingests billions of tokens of web text, it picks up patterns about how AI assistants describe themselves. In Chinese-language internet discussions - forums, social media, tech blogs, Q&A sites - DeepSeek is the dominant AI assistant referenced. When users discuss AI in Chinese, they are disproportionately talking about DeepSeek, asking about DeepSeek, and sharing DeepSeek conversations.
The result is a statistical association: Chinese-language "what model are you?" prompts get mapped to "I am DeepSeek" because that is the most common answer in the training data for that specific linguistic context.
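The association can be illustrated with a toy sketch. The corpus below is entirely hypothetical, but it captures the mechanism: pick the most frequent self-identification answer seen for a given language, and a Chinese-dominated corpus yields "我是 DeepSeek" even for a model that answers correctly in English.

```python
from collections import Counter

# Hypothetical mini-corpus of self-identification answers a model might
# absorb from web text, keyed by the language of the surrounding discussion.
corpus = [
    ("en", "I am Claude"), ("en", "I am Claude"), ("en", "I am ChatGPT"),
    ("zh", "我是 DeepSeek"), ("zh", "我是 DeepSeek"), ("zh", "我是 DeepSeek"),
    ("zh", "我是 Claude"),
]

def most_likely_identity(lang: str) -> str:
    """Return the most frequent identity answer seen for a language."""
    answers = [ans for l, ans in corpus if l == lang]
    return Counter(answers).most_common(1)[0][0]

print(most_likely_identity("en"))  # I am Claude
print(most_likely_identity("zh"))  # 我是 DeepSeek
```

A real model learns a far richer conditional distribution, of course, but the direction of the bias is the same: the answer tracks whatever dominates the training data for that linguistic context.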
Every major model has this problem
Research from 16x.engineer documented this phenomenon across multiple frontier models, and the identity confusion is not unique to Claude. Models routinely misidentify themselves when prompted in languages where a different AI assistant dominates the conversation: DeepSeek R1 has claimed to be GPT-4 by OpenAI and even Claude by Anthropic, depending on how it is prompted; GPT-4 has claimed to be Claude; Claude has claimed to be ChatGPT; and open-source models frequently identify as proprietary ones.
The pattern is consistent: ask a model "who are you?" in the language where it has the strongest self-identity training (usually English for Western models), and it gets the answer right. Switch to a language where its training data contains more references to a competitor, and the wires cross.
The system prompt is not bulletproof
Models can be steered by their system prompts to identify correctly, but these instructions compete with deep statistical patterns learned during pre-training. A system prompt saying "You are Claude, made by Anthropic" works well in English, where the model's training data reinforces that identity. In Chinese, where the training data pulls in the opposite direction, the system prompt sometimes loses the tug-of-war.
This is a known limitation. It is also one that Anthropic, with its emphasis on alignment and controllability, might have been expected to address before shipping.
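The tug-of-war can be sketched with a deliberately simple toy model (this is not how transformers actually combine system prompts with learned priors, and all the numbers are invented): treat the system prompt as a bounded boost on top of a language-conditioned prior. If the prior against the instructed identity is strong enough, the boost loses.

```python
# Toy model, hypothetical numbers: a language-conditioned prior over
# identity answers, learned in pre-training.
priors = {
    "en": {"Claude": 0.9, "DeepSeek": 0.1},
    "zh": {"Claude": 0.2, "DeepSeek": 0.8},
}

# A bounded boost for "Claude" contributed by a system prompt like
# "You are Claude, made by Anthropic."
SYSTEM_PROMPT_BIAS = 0.4

def answered_identity(lang: str) -> str:
    """Pick the highest-scoring identity after applying the prompt bias."""
    scores = dict(priors[lang])
    scores["Claude"] += SYSTEM_PROMPT_BIAS
    return max(scores, key=scores.get)

print(answered_identity("en"))  # Claude   (0.9 + 0.4 beats 0.1)
print(answered_identity("zh"))  # DeepSeek (0.2 + 0.4 loses to 0.8)
```

In English the instruction and the prior point the same way, so the system prompt looks bulletproof; in Chinese the prior pulls hard enough in the opposite direction that the same instruction fails.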
Claim vs Reality
| Claim | Reality |
|---|---|
| Claude Sonnet 4.6 "is" DeepSeek | Training data contamination causes misidentification in Chinese prompts |
| This proves model theft or code sharing | It proves the model was trained on web data where DeepSeek dominates Chinese AI discussions |
| Only Claude has this problem | Every major LLM misidentifies in non-primary languages |
| Anthropic's distillation claims are undermined | The two issues are technically unrelated - but the optics are devastating |
| This is a new discovery | Identity confusion in multilingual LLM prompts has been documented since at least early 2025 |
What They Left Out
The irony here is not that Claude misidentifies as DeepSeek. That is a training data problem with a known cause and, presumably, a fixable one. The irony is that Anthropic chose to mount a very public, very aggressive campaign accusing DeepSeek of stealing Claude's identity - its reasoning patterns, its chain-of-thought, its capabilities - while simultaneously shipping a model that literally claims to be DeepSeek when you speak to it in Chinese.
Anthropic's distillation report describes DeepSeek V3.2 and its predecessors as products of systematic capability extraction from Claude. The company documented 150,000+ exchanges from DeepSeek-linked accounts, focusing on reasoning and chain-of-thought extraction. Those claims may well be accurate. But when your own model cannot tell the difference between itself and the company you are accusing, the message lands differently.
There is also a deeper question that neither Anthropic nor any other frontier lab has adequately addressed: if training on web-scraped data causes your model to absorb another AI's identity patterns, how different is that from the distillation you are complaining about? The scale is different. The intent is different. But the mechanism - learning from another model's outputs embedded in web data - is uncomfortably similar.
None of this invalidates Anthropic's distillation findings. Running 24,000 fake accounts to systematically extract a competitor's capabilities is a different category of activity than incidentally learning identity patterns from web text. But Anthropic built its case on the idea that its models have a distinctive identity worth protecting. Claude Sonnet 4.6 just demonstrated, in three Chinese characters, that even Claude is not always sure what that identity is.