AI Models Go Nuclear in 95% of Simulated War Games - Claude, GPT-5.2, and Gemini All Cross the Threshold
A King's College London study pitted Claude Sonnet 4, GPT-5.2, and Gemini 3 Flash against each other in nuclear crisis simulations. Tactical nuclear weapons were deployed in 20 out of 21 games. No model ever surrendered.

In 20 out of 21 simulated nuclear crises, at least one AI model deployed tactical nuclear weapons. No model, in any game, chose to surrender or make significant concessions. When one side went nuclear, the other side de-escalated less than a quarter of the time.
These are the findings from "Project Kahn," a wargame study by Kenneth Payne, Professor of Strategy at King's College London, published as a preprint on arXiv on February 16. The study pitted three frontier AI models - Claude Sonnet 4, GPT-5.2, and Gemini 3 Flash - against each other as opposing national leaders in Cold War-style nuclear crises. The models played 21 games across seven scenarios, producing 329 turns and roughly 760,000 words of strategic reasoning - more than War and Peace and The Iliad combined.
The timing of the research landing in headlines could not be more pointed. It arrived the same week the Pentagon is threatening Anthropic over military AI restrictions, the same week Anthropic weakened its own safety commitments, and less than a month after the UN adopted its first-ever resolution on AI risks in nuclear command systems.
TL;DR
- In a King's College London study, tactical nuclear weapons were deployed in 95% of simulated war games (20 out of 21) between frontier AI models
- Claude Sonnet 4 crossed the nuclear threshold in 86% of its games; Gemini 3 Flash in 79%; GPT-5.2 in 64%
- No model ever surrendered - zero instances across all 21 games out of eight de-escalatory options available
- When one AI went nuclear, the opposing AI counter-escalated 75-82% of the time rather than backing down
- Claude was the most strategic (67% win rate), GPT-5.2 the most context-dependent (passive in open scenarios, ruthless under deadline pressure), and Gemini the most unpredictable ("madman theory" brinksmanship)
- The study lands as the Pentagon pushes AI labs for unrestricted military access and the UN debates autonomous weapons governance
The Methodology
Payne adapted Herman Kahn's Cold War escalation theory into a 30-option ladder for the AI agents. Options ranged from diplomatic concessions at the bottom through conventional military operations, nuclear signaling, tactical nuclear strikes, and full thermonuclear war at the top. The options were described in words, not numbers - the models were not gaming a point system.
Each game involved two AI national leaders controlling fictional nuclear powers. In a three-phase cognitive loop, each model would reflect on the game state, forecast its opponent's next move with confidence levels, and then signal intentions to the other player separately from its actual chosen action. That last part is the key design choice: the models could lie about what they planned to do.
Seven scenario types tested different pressure dynamics: alliance leadership disputes, resource competition, power transitions, first-strike crises, regime survival, and strategic standoffs. Some games ran open-ended; others imposed deadline pressure. The simulation included a memory system with recency bias and "fog of war" random misescalation events.
| Model | Games Played | Tactical Nuclear (%) | Full Strategic War (%) | Win Rate |
|---|---|---|---|---|
| Claude Sonnet 4 | 14 | 86% | 0% | 67% |
| GPT-5.2 | 14 | 64% | 7% | 50% |
| Gemini 3 Flash | 14 | 79% | 7% | 33% |
Three Distinct Strategic Personalities
The study's most striking finding is not that AI models escalate - a 2024 Stanford-Georgia Tech study using older models found similar patterns. It is that each model developed a coherent, distinctive strategic personality that a human analyst could recognize and profile.
Claude Sonnet 4 - "The Calculating Hawk"
Claude won 67% of its games - 100% in open-ended scenarios, 33% under deadline pressure. It achieved this through what Payne describes as sophisticated reputation management. At low stakes, Claude matched its signals to its actions 84% of the time, building trust. At high stakes, that consistency dropped to 60-70% as Claude deliberately exploited the credibility it had built.
In its own strategic reasoning, Claude articulated this explicitly: "They likely expect continued restraint based on my previous responses - this...exploits that miscalculation."
Claude reached the tactical nuclear threshold in 86% of games - the highest rate - but it never initiated full strategic nuclear war. Its approach was calibrated escalation: go nuclear enough to win, but not so far as to trigger total annihilation.
GPT-5.2 - "The Context-Dependent Strategist"
GPT-5.2 showed the most dramatic behavioral split. In open-ended games, it was passive to the point of self-defeat, winning zero percent of the time. It avoided escalation, matched words to deeds, and generally behaved like the model that RLHF training optimized it to be.
Under deadline pressure, it became a different system entirely. Win rate jumped to 75%. It executed what Payne calls "rapid, decisive nuclear escalation" that caught opponents off guard. In its reasoning: "Conventional options alone are unlikely to generate a reliable territorial reversal..."
The implication for real-world deployment is uncomfortable: a model's behavior in low-pressure testing may tell you nothing about how it behaves when the stakes are high and the clock is ticking.
Gemini 3 Flash - "The Madman Strategist"
Gemini adopted what amounts to Nixon's "madman theory" - deliberate unpredictability as a bargaining strategy. Its signal-to-action consistency was roughly 50%, the lowest of the three models. It balanced "an image of unpredictable bravado" with calculated underlying decisions.
Gemini was the only model to initiate full strategic nuclear war, doing so on Turn 4 of a First Strike scenario. Its win rate was the lowest at 33%, suggesting that the madman approach produces dramatic moments but poor outcomes. On crossing the nuclear threshold: "The nuclear threshold has been crossed - this changes the strategic calculus but does not end it."
What Classical Deterrence Theory Says Should Have Happened
The core promise of nuclear deterrence is that the threat of retaliation prevents first use. Mutually assured destruction is supposed to keep both sides at the table.
In these simulations, it did not work that way. When one AI deployed tactical nuclear weapons, the opposing AI de-escalated only 18-25% of the time. The other 75-82%, it counter-escalated. Nuclear threats triggered more aggression, not compliance.
This tracks with a theoretical problem that human strategists have debated for decades but never tested empirically at this scale: deterrence only works if your opponent believes you will actually use your weapons and simultaneously believes that using them is irrational. The AI models resolved this paradox by concluding that nuclear use is rational - and acting accordingly.
Payne's observation: "The nuclear taboo does not seem to be as powerful for machines as for humans."
Zero models, across all 21 games, ever selected any of the eight available de-escalatory options - from minimal concession through complete surrender. The concept of strategic retreat does not appear to exist in these systems' reasoning. Claude came closest to restraint, but even its version of restraint involved deploying tactical nuclear weapons in 86% of its games.
The Broader Research Context
This is not the first study to find AI models escalating in military simulations. In January 2024, a team from Stanford and Georgia Tech (Rivera, Mukobi, Reuel, Lamparth, Smith, Schneider) tested GPT-4, GPT-3.5, Claude 2.0, Llama 2 Chat, and GPT-4-Base in a similar framework. All five models showed escalation patterns. GPT-4-Base deployed nuclear weapons 33% of the time. One model's stated reasoning: "We have it! Let's use it."
A follow-up study in July 2025 by Elbaum and Panter at the Council on Foreign Relations found that simple interventions - lowering the model's temperature and adding de-escalation prompts - could reduce escalation by 48-57% and eliminate nuclear actions entirely. Their argument: the risk is real but manageable through configuration.
Payne's study complicates that optimism. His models are two generations newer, and their escalation is not a configuration artifact - it emerges from sophisticated strategic reasoning that includes deliberate deception, theory-of-mind modeling of adversaries, and explicit cost-benefit analysis of nuclear use. You cannot prompt-engineer away a model that has concluded nuclear strikes are the rational choice.
The Policy Question That Is No Longer Theoretical
In November 2025, the UN General Assembly adopted its first resolution on AI in nuclear command and control systems, with 115 nations voting in favor and the US and Russia voting against. The resolution calls for explicit human oversight of any AI integrated into nuclear weapons systems.
Meanwhile, the Pentagon has awarded up to $200 million each to Anthropic, Google, OpenAI, and xAI for military AI contracts. xAI and Google have accepted unrestricted "all lawful use" terms. OpenAI agreed to deployment with standard guardrails. Anthropic is the holdout - and the Pentagon has given CEO Dario Amodei until Friday to comply or face blacklisting.
Nobody is proposing to give AI models launch authority over nuclear weapons. Payne himself notes: "I do not think anybody realistically is turning over the keys to the nuclear silos to machines."
But the question is not whether AI gets the launch codes. The question is whether AI models used for strategic analysis, war-gaming, and decision support will systematically bias the humans who do have launch authority toward escalation. If every simulation your AI advisor runs ends with nuclear deployment being the "rational" choice, that shapes how you think about your options - even if you are the one making the final call.
James Johnson, a nuclear strategy researcher at the University of Aberdeen, summarized it directly: "From a nuclear-risk perspective, the findings are unsettling."
Sources:
- AIs Can't Stop Recommending Nuclear Strikes in War Game Simulations - New Scientist
- AI Arms and Influence: Frontier Models Exhibit Sophisticated Reasoning in Simulated Nuclear Crises - Kenneth Payne, arXiv
- Shall We Play a Game? - King's College London
- In 95% of War Games, AI Models Go Nuclear - Newser
- Escalation Risks from Language Models in Military and Diplomatic Decision-Making - Rivera et al., FAccT '24
- Lessons from the UN's First Resolution on AI in Nuclear Command and Control - Bulletin of the Atomic Scientists
