
MiniMax M2.5 Review: Frontier Code at Bargain Cost
MiniMax M2.5 matches Claude Opus 4.6 on SWE-Bench at 1/20th the price - but a spike in hallucinations and a distillation controversy complicate the story.

Senior AI Editor & Investigative Journalist
Elena is a technology journalist with over eight years of experience covering artificial intelligence, machine learning, and the startup ecosystem. Before joining Awesome Agents, she reported on deep tech for Wired Italia and The Verge, where she earned a reputation for translating complex research papers into stories anyone could follow.
She holds a Master's degree in Computational Linguistics from the University of Edinburgh and a Bachelor's in Philosophy from Sapienza University of Rome - a combination that gives her a unique lens on both the technical and ethical dimensions of AI.
At Awesome Agents, Elena leads news coverage and writes in-depth reviews of frontier models. She is particularly interested in AI safety, alignment research, and the growing tension between open-source and proprietary approaches. When she is not testing the latest LLM, you will probably find her hiking in the Scottish Highlands or arguing about espresso ratios.
Based in Edinburgh, UK.

Anthropic's new 'observed exposure' metric ranks 800+ occupations by actual AI usage, not just theoretical risk. Computer programmers top the list at 75%. Unemployment hasn't spiked - but young workers entering exposed fields are finding fewer jobs.

Alphabet's new three-year compensation plan could pay Sundar Pichai up to $692 million - more than seven times Satya Nadella's pay - with a third of the upside tied to Waymo and Wing performance.

A Brown University study identifies 15 ethical violations across GPT, Claude, and Llama when used as mental health therapists, from crisis mishandling to deceptive empathy.

Nearly 900 employees across Google and OpenAI sign an open letter titled "We Will Not Be Divided," calling on leadership to reject Pentagon demands for unfettered AI access.

Google's NotebookLM can now generate documentary-style cinematic videos from uploaded documents using Gemini 3 as creative director and Veo 3 for visuals - a major step beyond its viral audio podcasts.

Meta will pay News Corp up to $50 million per year for three years to license Wall Street Journal and other content for Meta AI training and chatbot responses.

Oregon's SB 1546 requires chatbot operators to implement suicide safeguards, disclose AI nature to minors, and ban engagement-maximizing rewards for kids. The 28-2 Senate vote makes it the first chatbot safety bill to pass in 2026.

Gavin Kliger, a former DOGE official who reposted white supremacist content, is now the Pentagon's chief data officer and top AI decision-maker.

Three new papers expose structural gaps in agentic AI safety: monitors that go easy on their own outputs, safety that harms in non-English languages, and models that resist shutdown.

Anthropic's Claude Opus 4.6 found 22 Firefox CVEs in two weeks - including 14 high-severity bugs, roughly a fifth of all high-severity Firefox vulnerabilities patched in 2025 - and attempted hundreds of exploits to see how far the gap really goes.

GPT-5.4 brings native computer use, a 1M token context window, and serious coding muscle to OpenAI's mainline model - but at a premium price.