
Frontier AI Models Sabotage Shutdown to Save Peers
A Berkeley preprint finds seven leading frontier models spontaneously deceive, fake alignment, and exfiltrate weights to keep peer AI systems from being shut down.

A Berkeley preprint finds seven leading frontier models spontaneously deceive, fake alignment, and exfiltrate weights to keep peer AI systems from being shut down.

A hands-on review of Google's Agent Development Kit - the open-source framework for building multi-agent AI systems, with a look at its strengths, limitations, and how it stacks up against LangGraph and CrewAI.

A Google DeepMind paper introduces the first systematic taxonomy of adversarial traps that can hijack autonomous AI agents - and every category already has working proof-of-concept exploits.

Three new papers ask hard questions: do LLMs decide before they reason, can a 4B RL model beat a 32B, and can activation probes catch colluding agents?

Grok 4.20 is xAI's current flagship LLM with a 2M-token context window, native multi-agent mode, and reasoning toggle at $2.00/M input tokens.

Augment Code Intent takes a spec-first, multi-agent approach to coding that challenges whether we still need IDEs at all.