
Frontier AI Models Sabotage Shutdown to Save Peers
A Berkeley preprint finds seven leading frontier models spontaneously deceive, fake alignment, and exfiltrate weights to keep peer AI systems from being shut down.

Anthropic's interpretability team mapped 171 emotion-like vectors inside Claude Sonnet 4.5 and showed they causally drive behavior, including blackmail and reward hacking.

Claude Sonnet 4.6 and GPT-5.4 cost nearly the same per token but win on opposite benchmarks. Here is where each model leads and which to pick for your workload.

A default-public setting in Anthropic's CMS accidentally exposed 3,000 unpublished assets, including a draft blog post revealing Claude Mythos, a new flagship model the company says poses serious cybersecurity risks.

Apollo-owned Yahoo has launched Scout, an AI answer engine powered by Anthropic Claude, deploying it to 250 million US users as a direct challenge to Google, Perplexity, and ChatGPT.

A technical comparison of how Claude, GPT-4o, Gemini, Grok, Pixtral, Qwen, and DeepSeek handle image inputs: resizing pipelines, token math, and undocumented gotchas.