
AutoAgent Builds Its Own Harness, Tops Two Benchmarks
Kevin Gu's MIT-licensed AutoAgent lets a meta-agent engineer and hill-climb its own agent harness overnight, claiming the top GPT-5 slot on TerminalBench and first place on SpreadsheetBench.

Cursor's ground-up IDE rebuild ships parallel agent orchestration, Design Mode for frontend work, and cloud-to-local session handoff, all in a single workspace.

Cloudflare's EmDash is an MIT-licensed CMS built on Astro 6.0 that sandboxes plugins in isolated Workers, ships a built-in MCP server, and targets WordPress's 42.5% share of the web.

A missing .npmignore entry in Claude Code 2.1.88 exposed 512,000 lines of TypeScript source, spawned the fastest-growing GitHub repo ever, and revealed unshipped features Anthropic never announced.

A practical comparison of nine text-to-SQL and AI database tools in 2026, covering pricing, schema awareness, open-source picks, and where each tool actually falls short.

GitHub Copilot inserts promotional tips for itself and Raycast into PR descriptions, with over 11,000 affected pull requests found across GitHub and GitLab.

GStack turns Claude Code into a virtual dev team with 28 slash commands for planning, review, QA, security, and deployment - all open source and free.

OpenAI ships a plugin marketplace for Codex CLI v0.117.0, bundling skills, app integrations, and MCP server configs behind an enterprise governance layer.

Rankings of the most popular MCP servers across development, data, web automation, and productivity categories based on installs, search volume, and GitHub activity.

A practical guide to switching from Claude Code to OpenAI Codex CLI, covering command mapping, sandbox differences, feature parity, and workflow adjustments.

A practical guide to switching from Cursor to Windsurf IDE, covering settings migration, Cascade vs Composer differences, pricing savings, and workflow adjustments.

A head-to-head comparison of Claude Code, Cursor, and OpenAI Codex CLI covering pricing, benchmarks, workflow differences, and which coding agent fits your stack.