
OpenAI Gives Codex Desktop Control and 111 Plugins
OpenAI's April 16 Codex update adds background computer use on Mac, an Atlas-based in-app browser, gpt-image-1.5 image generation, and 111 new plugins - moving the app far beyond agentic coding.

A hands-on comparison of the best AI browser agents in 2026 - Perplexity Comet, Dia, Opera Neon, Chrome Gemini, Brave Leo, Fellou, and more - rated on agentic task depth, privacy, price, and platform support.

Claude Opus 4.6 leads SWE-bench Verified at 80.8% and OSWorld at 72.7% for agentic tasks, while GPT-5.4 ties for computer use; no single model dominates every workflow type.

GPT-5.4 leads OSWorld-Verified at 75.0% for desktop computer use while Claude Sonnet 4.6 matches human performance at 72.5% for half the price.

Anthropic's mid-tier model matches Opus 4.6 on computer use, leads all models on office productivity tasks, and costs one-fifth as much as the flagship, at $3/$15 per million tokens.

Rankings of the best AI models and agent frameworks on computer use benchmarks - OSWorld, OSWorld-Verified, and ScreenSpot-Pro - updated March 2026.

GPT-5.4 brings native computer use, a 1M token context window, and serious coding muscle to OpenAI's mainline model - but at a premium price.

GPT-5.4 leads on computer use and enterprise productivity. Gemini 3.1 Pro leads on science reasoning and math at 20% lower cost. A benchmark-by-benchmark comparison.

GPT-5.4 leads on computer use and enterprise productivity at half the price. Claude Opus 4.6 leads on coding, agent teams, and long-context retrieval. Here is where each model wins.

OpenAI's most capable frontier model combines native computer use, 1M-token context, and three variants at $2.50/$15 per million tokens.

OpenAI ships GPT-5.4 with built-in computer use that beats human desktop performance, a 1 million token context window, and native Excel and Google Sheets integrations.

Perplexity's new Computer product breaks tasks into sub-agents routed across Claude, Gemini, GPT-5.2, and Grok, running autonomously for days or months in isolated cloud sandboxes. Available now for Max subscribers at $200/month.