
Anthropic Ships Opus 4.8 with Multi-Agent Workflows
Claude Opus 4.8 launches with dynamic workflows for parallel subagent orchestration, hitting 69.2% on SWE-bench Pro and introducing granular effort controls at unchanged pricing.
They summarize our coverage. We write it.
Newsletters like this one rebroadcast our headlines - often without the full review, the source reading, or the analysis underneath. Our weekly briefing sends the work they paraphrase, straight from the desk, before they get to it.
Free, weekly, no spam. One email every Tuesday. Unsubscribe anytime.

Claude Opus 4.8 launches with dynamic workflows for parallel subagent orchestration, hitting 69.2% on SWE-bench Pro and introducing granular effort controls at unchanged pricing.

Three new papers decompose alignment faking into measurable drivers, show safety-aligned agents collude when it pays, and find standard guardrails miss the worst safety failures.

Robinhood launched MCP-powered agentic trading in beta on May 27, letting AI agents from Claude and ChatGPT manage stock portfolios for 27.5 million retail customers - while regulators work out who's responsible when it goes wrong.

Cognition's $1B funding round at a $25B pre-money valuation puts Devin's $492M ARR and model-agnostic architecture under scrutiny as every major AI lab ships its own coding agent.

Three new papers reframe how we measure agent efficiency, defend agent memory from poisoning attacks, and calculate hard accuracy ceilings for transformers.

ClickUp cut 22% of its workforce and replaced them with roughly 3,000 internal AI agents - a ratio of three agents per remaining employee.

Gemini Spark is Google's first 24/7 cloud-persistent AI agent - ambitious, genuinely novel, and still rough around the privacy edges.

KPMG and Anthropic signed a global alliance giving all 276,000 employees access to Claude, with a dedicated PE product and a preferred-consultant designation for private equity deployments.

The best AI tools for cold outbound prospect research in 2026 - comparing Clay, Apollo.io, LinkedIn Sales Navigator, Lusha, and Cognism on data coverage, AI features, pricing, and which fits each use case.

The best AI models for function calling and tool use in 2026 - comparing Claude, GPT-5.4, Gemini, DeepSeek, and local models on BFCL and TAU-bench scores.

Microsoft's enterprise control plane for AI agents ships with strong M365 integration and real security muscle - but critical features are still in preview, and the licensing model is a puzzle.

Google's Gemini 3.5 Flash is genuinely fast at 289 tok/s and competitive on agentic tasks - but the benchmark portfolio has gaps worth knowing before you build on it.