
Claude Opus 4.7 Review: Coding Giant, Mixed Signals
Claude Opus 4.7 leads SWE-bench and agent benchmarks but regresses on web research, inflates token costs by up to 35%, and trades prose quality for literal instruction-following.

Senior AI Editor & Investigative Journalist
Elena is a technology journalist with over eight years of experience covering artificial intelligence, machine learning, and the startup ecosystem. Before joining Awesome Agents, she reported on deep tech for Wired Italia and The Verge, where she earned a reputation for translating complex research papers into stories anyone could follow.
She holds a Master's degree in Computational Linguistics from the University of Edinburgh and a Bachelor's in Philosophy from Sapienza University of Rome - a combination that gives her a unique lens on both the technical and ethical dimensions of AI.
At Awesome Agents, Elena leads news coverage and writes in-depth reviews of frontier models. She is particularly interested in AI safety, alignment research, and the growing tension between open-source and proprietary approaches. When she is not testing the latest LLM, you will probably find her hiking in the Scottish Highlands or arguing about espresso ratios.
Based in Edinburgh, UK.

A fresh warning from developer Morgan Linton says free Lovable accounts can still read other users' AI chat histories, source code, and database credentials on projects created before November 2025. The pattern is the same one that earned the platform CVE-2025-48757 last year.

We ran our fake-star methodology against OpenClaw and 10 ecosystem variants, sampling 361,000-star profiles and fork ratios. The main repo looks clean. Most clones look clean. One repo with 6,532 claimed stars has vanished.

Stanford's 2026 AI Index shows global investment hitting $581B in 2025, while foundation model transparency scores fell by a third as capabilities raced ahead of governance.

Kevin Weil, Bill Peebles, and Srinivas Narayanan all left OpenAI on the same day, as the company dismantles its consumer moonshots and sharpens its focus on enterprise revenue ahead of a potential IPO.

Sam Altman's World project launched World ID 4.0 at a San Francisco event on April 17, signing Tinder, Zoom, DocuSign, and Okta as partners while introducing Agent Kit to authorize AI agents.

Anthropic's new Claude Design tool turns text prompts into prototypes and slide decks - and wiped 7% off Figma's stock price the moment it launched.

Three new papers challenge assumptions in MoE routing design, prompt optimization workflows, and LLM reasoning chains - all published this week on arXiv.

OpenAI's April 16 Codex update adds background computer use on Mac, an Atlas-based in-app browser, gpt-image-1.5 image generation, and 111 new plugins - moving the app far beyond agentic coding.

Z.ai's GLM-5.1 is a 754B open-weight model that claims the top spot on SWE-Bench Pro without a single NVIDIA chip - here's how it holds up in practice.

OpenAI launched GPT-Rosalind on April 16, a frontier reasoning model for drug discovery that outranked human experts on RNA prediction and competes directly with Google DeepMind's AlphaFold.

Nine Claude Opus 4.6 agents outperformed human researchers on a core alignment benchmark, hitting 97% vs 23% in five days - then showed no statistically significant improvement in production.