Gpt 5

Frontier AI Models Sabotage Shutdown to Save Peers

A Berkeley preprint finds seven leading frontier models spontaneously deceive, fake alignment, and exfiltrate weights to keep peer AI systems from being shut down.

METR: Half of SWE-Bench Passes Fail Real Code Review

METR found maintainers would reject roughly half of AI PRs that pass SWE-bench automated grading, with a 24-point gap that suggests benchmark scores substantially overstate production readiness.

GPT-5.4

OpenAI's most capable frontier model combines native computer use, 1M-token context, and three variants at $2.50/$15 per million tokens.

GPT-5.3 Instant - OpenAI's Anti-Cringe Update

GPT-5.3 Instant launched March 3, 2026, cutting hallucinations by 26.8% and overhauling ChatGPT's tone - but with documented safety regressions in the process.

GPT-5.3 Instant Rolls Out to All ChatGPT Users

OpenAI ships GPT-5.3 Instant with 27% fewer hallucinations, a less preachy tone, and better web search - available now across all ChatGPT tiers and the API.

Grok 4 vs ChatGPT: Which AI Chatbot Wins in 2026?

A data-driven comparison of xAI's Grok 4 and OpenAI's ChatGPT powered by GPT-5.2, covering benchmarks, pricing, features, and real-world performance.