Claude Sonnet 5

Anthropic's latest Sonnet-class model brings near-Opus coding performance to mid-tier pricing, with major agentic search and computer use gains over Sonnet 4.6.

Claude Sonnet 5

Overview

Claude Sonnet 5, released June 30, 2026, is Anthropic's most capable Sonnet-class model. It sits below Opus 4.8 on complex multi-step reasoning tasks, but the gap has narrowed considerably - especially on coding, agentic search, and computer use, where Sonnet 5 lands within a few percentage points of the Opus tier at one-third the per-token cost.

TL;DR

  • SWE-bench Verified: 85.2%; BrowseComp: 84.7% single-agent (best among non-Opus models)
  • 1M context, 128k max output, adaptive thinking, $2/$10 per million tokens intro pricing through Aug 31, 2026
  • Substantially better agentic search and computer use than Claude Sonnet 4.6 at the same standard price point

The previous Sonnet, 4.6, made headlines by matching its Opus counterpart on office productivity tasks. Sonnet 5 extends that pattern into more demanding territory: agentic search, long-horizon coding, and professional task automation. On BrowseComp - the benchmark measuring a model's ability to find hard-to-find information through autonomous web research - Sonnet 5 scores 84.7% (single-agent), trailing only GPT-5.5 (84.4%) among publicly reported results and clearly ahead of Sonnet 4.6 (76.2%).

Anthropic is positioning this as the default model for developers who want near-flagship performance without flagship pricing. It's available immediately on all plans - Free, Pro, Max, Team, and Enterprise - and carries introductory pricing of $2 per million input tokens and $10 per million output tokens through August 31, 2026, reverting to the standard $3/$15 afterward.

Key Specifications

SpecificationDetails
ProviderAnthropic
Model FamilyClaude
ParametersNot disclosed
Context Window1,000,000 tokens
Max Output128,000 tokens (up to 300k via Batch API beta)
Input Price$2.00/M tokens intro (through Aug 31, 2026); $3.00/M standard
Output Price$10.00/M tokens intro (through Aug 31, 2026); $15.00/M standard
Release DateJune 30, 2026
Training CutoffJanuary 2026
LicenseProprietary
Model IDclaude-sonnet-5
Adaptive ThinkingYes (defaults to high effort on API and Claude Code)
Input ModalitiesText, images

Benchmark Performance

All Sonnet 5 results below use adaptive thinking at max effort unless noted, averaged over 5 trials, from the official system card published June 30, 2026.

BenchmarkSonnet 5Sonnet 4.6Opus 4.8GPT-5.5
SWE-bench Verified85.2%79.6%--
SWE-bench Pro63.2%58.1%-58.6%
Terminal-Bench 2.180.4%67.0%-83.4% (Codex CLI)
BrowseComp (single agent)84.7%76.2%-84.4%
HLE (with tools)57.4%46.8%-52.2%
OSWorld-Verified81.2%78.5%-78.7%
FrontierCode v138.8%15.1%-25.5%
GDPval-AA v2 (Elo)1,6091,381-1,492
CursorBench61.2%49.0%63.8%-
USAMO 202679.5%55.0%96.7%-
ArXivMath (with tools)72.2%-71.0%72.2%

Several numbers stand out. The FrontierCode v1 jump from 15.1% to 38.8% is the largest single-benchmark gain in the table - a 2.6x improvement on an agentic coding benchmark created by Cognition, where tasks are derived from real pull requests in open-source repos with no human intervention allowed. The USAMO 2026 score of 79.5% (mathematical olympiad proofs, judged by a panel of frontier models) is strong for a Sonnet-class model, though it trails Opus 4.8 at 96.7%. On GDPval-AA, the office productivity Elo leaderboard, Sonnet 5 (1,609) beats Sonnet 4.6 (1,381) and GPT-5.5 (1,492) - continuing the pattern from its predecessor of leading on knowledge-work automation tasks.

The Terminal-Bench 2.1 result (80.4%) is where Sonnet 5 most clearly closes the gap with Codex CLI (83.4%). Prior Sonnet versions trailed the OpenAI coding tools by a wider margin on terminal-based multi-language workflows; a 3-point gap at this level is within practical parity for most deployments.

Key Capabilities

Agentic Coding and Long-Horizon Tasks

At 85.2% on SWE-bench Verified, Sonnet 5 is Anthropic's highest-scoring Sonnet on that benchmark - 5.6 points above Sonnet 4.6. The SWE-bench Pro result (63.2% vs. 58.1%) reflects a harder suite of problems drawn from actively maintained repositories with multi-file diffs and reduced ground-truth leakage. FrontierCode, which gives agents a binary and asks them to reconstruct the source without decompilation tools, jumped from 15.1% to 38.8% - the kind of gain that matters if you're running Claude Code against unfamiliar codebases or large-scale refactors. CursorBench scores were measured independently by Cursor (61.2% for Sonnet 5 vs. 63.8% for Opus 4.8), confirming that the model is competitive in production IDE workflows with the Opus tier. For a broader view of where Sonnet 5 fits in the coding rankings, see the coding benchmarks leaderboard.

ProgramBench - where models rebuild entire programs from a binary - shows Sonnet 5 scoring 76-86% across episodes, versus 52-74% for Sonnet 4.6 and 80-90% for Opus 4.8. That's a meaningful narrowing on a benchmark specifically designed to stress long-context reasoning over full software architecture.

Agentic Search and Computer Use

BrowseComp measures a model's ability to answer hard research questions through autonomous web browsing. Sonnet 5's single-agent score of 84.7% is effectively tied with GPT-5.5 (84.4%) and ahead of the previous Sonnet by 8.5 points. On OSWorld-Verified, which tests autonomous computer use across desktop tasks, Sonnet 5 scores 81.2% - up from 78.5% on Sonnet 4.6 and ahead of GPT-5.5 (78.7%). These two results together make the case that Sonnet 5 is now a credible choice for production computer use workflows, not just a stepping stone to Opus. The computer use leaderboard tracks this category in detail.

The system card highlights improved prompt injection robustness as part of the agentic safety work. Sonnet 5 is better than Sonnet 4.6 at identifying and resisting injected instructions in web content and tool outputs - an important property for any model being used in browser automation.

Professional and Knowledge Work

GDPval-AA Elo of 1,609 leads all models in the benchmark table, ahead of GPT-5.5 (1,492) and Sonnet 4.6 (1,381). HealthBench Professional at 57.8% (vs. 44.2% for Sonnet 4.6 and 51.8% for GPT-5.5) shows meaningful improvement on clinical and professional healthcare tasks. Legal Agent Benchmark scores 8.9 on the full public set (vs. 8.0 for Sonnet 4.6), with the harder Harvey held-out set at 5.8 vs. 5.4. These are niche but important enterprise benchmarks; the gains are consistent rather than dramatic. For current model rankings across professional domains, see the overall LLM rankings for June 2026.

Pricing and Availability

TierInputOutput
Intro pricing (through Aug 31, 2026)$2.00/M$10.00/M
Standard pricing$3.00/M$15.00/M
Batch API (50% off standard)$1.50/M$7.50/M

Prompt caching saves up to 90% on repeated context. US-only inference is available at 1.1x standard pricing. The model is the default on Free and Pro tiers of claude.ai, accessible on Max, Team, and Enterprise, and available through the Anthropic API, Amazon Bedrock (anthropic.claude-sonnet-5), Google Cloud (claude-sonnet-5), and Microsoft Foundry.

Adaptive thinking defaults to high effort on the API and Claude Code. Setting effort explicitly is recommended for cost-sensitive workloads; low and medium effort reduce token consumption meaningfully at the cost of some performance on harder tasks.

For developers comparing options, Claude Opus 4.8 costs $5/$25 per million tokens and remains the better choice for deep scientific reasoning and tasks where Sonnet 5 clearly trails. Claude Haiku 4.5 at $1/$5 is the latency-first option when task complexity doesn't justify Sonnet-class cost.

Strengths

  • SWE-bench Verified at 85.2% - highest score for any Sonnet-class model
  • BrowseComp 84.7% single-agent, effectively tied with GPT-5.5 for agentic search
  • OSWorld-Verified 81.2%, ahead of GPT-5.5 on autonomous computer use
  • FrontierCode v1 gain from 15.1% to 38.8% - a 2.6x improvement in one generation
  • GDPval-AA Elo 1,609 leads the benchmark table on office productivity
  • Introductory pricing of $2/$10 through Aug 31, 2026 makes cost comparisons favorable vs. prior flagship tiers
  • Improved prompt injection resistance over Sonnet 4.6 (critical for agentic deployments)

Weaknesses

  • Trails Opus 4.8 on USAMO (79.5% vs. 96.7%) and CursorBench (61.2% vs. 63.8%)
  • Terminal-Bench 2.1 at 80.4% is still below Codex CLI (83.4%) for terminal-heavy workflows
  • Cybersecurity capabilities are intentionally reduced vs. Opus tier - not appropriate for offensive security research
  • Parameters not disclosed; no open-weight or self-hosted option
  • Standard pricing ($3/$15) reverts in September 2026 to the same level as Sonnet 4.6 - the intro period cost advantage is temporary
  • Wet-blanket response rate is slightly elevated vs. prior models per the system card

FAQ

How does Claude Sonnet 5 compare to Claude Opus 4.8?

Sonnet 5 closes the gap on coding (SWE-bench Verified: 85.2% vs. not published for Opus 4.8) and computer use (OSWorld: 81.2% vs. a higher Opus result), but Opus 4.8 leads by a wide margin on mathematical reasoning (USAMO 2026: 96.7% vs. 79.5%). Cost is 1/2.5x: Sonnet 5 at $3/$15 vs. Opus at $5/$25.

What is the model ID for Claude Sonnet 5?

The API model ID is claude-sonnet-5. On Amazon Bedrock: anthropic.claude-sonnet-5. On Google Cloud: claude-sonnet-5. On Microsoft Foundry, check the model catalog for the versioned ID.

When does introductory pricing end?

Introductory pricing of $2 per million input tokens and $10 per million output tokens applies through August 31, 2026. Standard pricing of $3/$15 per million tokens applies from September 1, 2026.

Does Claude Sonnet 5 support computer use?

Yes. OSWorld-Verified score is 81.2%, ahead of Sonnet 4.6 (78.5%) and GPT-5.5 (78.7%). Computer use is available via the Anthropic API and Claude.ai. The model also shows improved prompt injection resistance, which is important for browser automation tasks.

What is the context window for Claude Sonnet 5?

1 million tokens, matching Sonnet 4.6. Max output is 128k tokens (or up to 300k via the Batch API using the output-300k-2026-03-24 beta header). Training knowledge cutoff is January 2026.

Is Claude Sonnet 5 available for free?

Free-tier users on claude.ai get access with usage limits. Sonnet 5 is the default model on Free and Pro plans.

Sources

✓ Last verified June 30, 2026

James Kowalski
About the author AI Benchmarks & Tools Analyst

James is a software engineer turned tech writer who spent six years building backend systems at a fintech startup in Chicago before pivoting to full-time analysis of AI tools and infrastructure.