
Claude Opus 4.7 Review: Coding Giant, Mixed Signals
Claude Opus 4.7 leads SWE-bench and agent benchmarks but regresses on web research, inflates token costs by up to 35%, and trades prose quality for literal instruction-following.
They summarize our coverage. We write it.
Newsletters like this one rebroadcast our headlines - often without the full review, the source reading, or the analysis underneath. Our weekly briefing sends the work they paraphrase, straight from the desk, before they get to it.
Free, weekly, no spam. One email every Tuesday. Unsubscribe anytime.

Claude Opus 4.7 leads SWE-bench and agent benchmarks but regresses on web research, inflates token costs by up to 35%, and trades prose quality for literal instruction-following.

A fresh warning from developer Morgan Linton says free Lovable accounts can still read other users' AI chat histories, source code, and database credentials on projects created before November 2025. The pattern is the same one that earned the platform CVE-2025-48757 last year.

Factory closed a $150M Series C at a $1.5B valuation to expand its Droids - autonomous agents that handle the full software development lifecycle, not just code generation.

A Korean CTO ran a 13-step agent harness against 100+ major open-source repos over three days, landing 500+ commits and 130+ PRs - some merged by Kubernetes, Hugging Face, and Ollama maintainers. Then GitHub banned his account for spam, confirming that platform abuse detection cannot yet tell a disciplined harness from a bot.

Z.AI updated its GLM Coding Plan usage policy. Non-coding requests now trigger aggressive throttling, and three violations mean a permanent ban - which explains the wave of 1302 and 1303 rate-limit errors users have been hitting this week.

Rankings of the best LLM-powered software engineering agents on SWE-Bench Verified, with pass rates, pricing, scaffold notes, and methodology - updated April 2026.

Cursor is in advanced talks to raise $2B+ at a $50 billion pre-money valuation, nearly doubling its November figure as enterprise clients drive 60% of revenue.

OpenAI's April 16 Codex update adds background computer use on Mac, an Atlas-based in-app browser, gpt-image-1.5 image generation, and 111 new plugins - moving the app far beyond agentic coding.

Z.ai's GLM-5.1 is a 754B open-weight model that claims the top spot on SWE-Bench Pro without a single NVIDIA chip - here's how it holds up in practice.

Snap cut 16% of its workforce on April 15, citing AI-generated code as the direct cause. The stock jumped. The workers left. Here is what the company's own numbers actually say.

Alibaba's 35B sparse MoE with 3B active parameters delivers 73.4% SWE-bench Verified, multimodal vision and video, 256K context, and DeltaNet hybrid architecture under Apache 2.0.

Alibaba's Qwen 3.6-35B-A3B activates only 3B of its 35B parameters per token, scores 73.4% on SWE-bench Verified, handles video and images, and ships under Apache 2.0.