Claude Sonnet 5

Sources#

Introducing Claude Sonnet 5

Summary#

Claude Sonnet 5 is Anthropic's "most agentic Sonnet yet" (announced July 2, 2026), a direct upgrade to Sonnet 4.6 that makes plans, uses tools (browsers, terminals), and runs autonomously "at a level that, just a few months ago, required larger and more expensive models." Anthropic's positioning: the Sonnet class started the agentic era (3.5 / 3.6 / 3.7 were the first models with strong coding + tool use), but recent agentic gains had concentrated in the Opus class — Sonnet 5 narrows that gap, landing close to Opus 4.8 at lower prices. API model id: claude-sonnet-5. As a first-party release announcement this is a vendor-claim source; benchmark deltas below are Anthropic's own and the fuller evaluation lives in the Claude Sonnet 5 System Card.

Pricing and identity#

Introductory pricing (through Aug 31, 2026): $2 / M input tokens, $10 / M output tokens.
Standard pricing (from Sep 1, 2026): $3 / M input, $15 / M output — vs Opus 4.8 at $5 / $25.
API model id claude-sonnet-5; the default model for Free and Pro plans, and available to Max, Team, Enterprise, in Claude Code, and on the Claude Platform.
Rate limits raised across Chat, Cowork, Claude Code, and the Claude Platform to accommodate higher-effort token usage.
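
The price points above can be turned into a quick per-job estimate. The following is a minimal sketch, not an official calculator: the `Price` class and the helper names are hypothetical, and only the dollar figures and the Sep 1, 2026 cutover come from the announcement.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Price:
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens


# Figures as quoted in the announcement; the names are illustrative.
SONNET_5_INTRO = Price(2.0, 10.0)     # through Aug 31, 2026
SONNET_5_STANDARD = Price(3.0, 15.0)  # from Sep 1, 2026
OPUS_4_8 = Price(5.0, 25.0)

STANDARD_START = date(2026, 9, 1)


def sonnet_5_price(on: date) -> Price:
    """Return the Sonnet 5 price that applies on a given calendar date."""
    return SONNET_5_STANDARD if on >= STANDARD_START else SONNET_5_INTRO


def cost_usd(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Cost of one job in USD, given token counts for that model's tokenizer."""
    total = input_tokens * price.input_per_m + output_tokens * price.output_per_m
    return total / 1_000_000
```

For a job of 2M input and 0.5M output tokens this gives $9.00 at introductory pricing, $13.50 at standard pricing, and $22.50 on Opus 4.8, before accounting for any tokenizer or effort-level differences in token counts.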

Capability profile#

The headline claim is cost-performance range, not a single peak. Anthropic frames Sonnet 5 against Sonnet 4.6 (its predecessor) and Opus 4.8 (a more capable reference model) on two agentic evals — BrowseComp (agentic search) and OSWorld-Verified (computer use):

Strict improvement over Sonnet 4.6 across reasoning, tool use, coding, and knowledge work.
Wider cost-performance range than Opus 4.8, tunable via the effort parameter (up to xhigh): substantially better cost-efficiency at medium effort, and higher-effort runs "can match Opus 4.8 on some tasks." The pitch is that between Sonnet 5 and Opus 4.8 a user dials effort to find the right cost/performance balance — a first-party instance of the effort/budget lever in Client-Side Agent Optimization.
Early-access partners reported it "finishes complex tasks where previous Sonnet models would stop short" and "checks its own output without explicitly being asked" — spontaneous self-verification, a capability the shrinking-harness thesis predicts (the verification scaffold the harness used to supply migrates into the model).
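
The effort-dialing pitch above amounts to a selection problem: among the (model, effort) configurations you have measured, pick the cheapest one that clears your quality bar. A minimal sketch, assuming you run your own eval and record a score and an average cost per configuration; `RunResult` and `cheapest_meeting_target` are hypothetical names, and nothing here calls the API or reproduces Anthropic's benchmark numbers.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class RunResult:
    model: str       # e.g. "claude-sonnet-5"
    effort: str      # e.g. "medium" or "xhigh"
    score: float     # success rate on your own eval, 0.0 to 1.0
    cost_usd: float  # measured average cost per task


def cheapest_meeting_target(
    results: Iterable[RunResult], target: float
) -> Optional[RunResult]:
    """Return the lowest-cost configuration whose score reaches the target.

    Returns None when no measured configuration is good enough.
    """
    eligible = [r for r in results if r.score >= target]
    return min(eligible, key=lambda r: r.cost_usd, default=None)
```

Feeding in measurements for both Sonnet 5 at several effort levels and Opus 4.8 would show where, if anywhere, the crossover between the two models falls for a given workload.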

(The head-to-head benchmark table vs Sonnet 4.6 and Opus 4.8 is published only as an image in the source and is not transcribed here.)

Benchmark errata (methodology, not model): a June 30, 2026 changelog corrected the BrowseComp cost-performance chart to use the standard agentic-search methodology (10M-token budget with compaction + programmatic tool calling); the earlier chart had underestimated Sonnet 5. Sonnet 4.6 baselines were also restated after eval-methodology changes: Humanity's Last Exam was re-graded to 34.6% (no tools) / 46.8% (tools), and OSWorld-Verified to 78.5%, so these numbers differ from the Sonnet 4.6 launch blog.

Token-economics (migration hazard)#

Sonnet 5 uses an updated tokenizer (the same kind of change Opus 4.7 introduced), so the same input maps to roughly 1.0–1.35× as many tokens, depending on content type. Anthropic set the introductory pricing so the transition from Sonnet 4.6 is "roughly cost-neutral." As with Opus 4.7, real-traffic token inflation is content-dependent and worth measuring rather than assuming (cross-reference the context-window-as-primary-constraint theme in Claude Code Best Practices).
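
One way to measure the multiplier rather than assume it is to count tokens for a sample of real traffic under both tokenizers and compare totals. The sketch below assumes you can supply two counting functions (for example, wrappers around a token-counting endpoint for each model); `inflation_ratio`, `effective_price`, and the `TokenCounter` type are hypothetical names introduced for illustration.

```python
from typing import Callable, Iterable

# A counter maps a text sample to its token count under one tokenizer.
TokenCounter = Callable[[str], int]


def inflation_ratio(
    samples: Iterable[str], count_old: TokenCounter, count_new: TokenCounter
) -> float:
    """Total new-tokenizer tokens divided by total old-tokenizer tokens."""
    old_total = 0
    new_total = 0
    for text in samples:
        old_total += count_old(text)
        new_total += count_new(text)
    if old_total == 0:
        raise ValueError("no tokens counted under the old tokenizer")
    return new_total / old_total


def effective_price(price_per_m: float, ratio: float) -> float:
    """Price per million old-tokenizer tokens once inflation is applied."""
    return price_per_m * ratio
```

At the documented upper bound of 1.35×, the $2 introductory input price works out to $2.70 per million old-tokenizer tokens, and the $3 standard price to $4.05. Whether that is cost-neutral for a given workload depends on the measured ratio for its content mix and on the Sonnet 4.6 price being compared against, which this page does not state.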

Safety and alignment profile#

Anthropic's pre-deployment evaluations report Sonnet 5 as an overall improvement on Sonnet 4.6:

Agentic safety: better at refusing malicious requests and resisting hijack attempts in prompt-injection attacks.
Honesty: lower rates of hallucination and sycophancy than Sonnet 4.6.
Automated behavioral audit (cooperation-with-misuse, deception, and other misaligned behaviors across many contexts): Sonnet 5 scored lower (i.e. safer) overall than Sonnet 4.6, but higher (worse) than the more capable Opus 4.8 and Claude Mythos Preview. This is the mirror image of the usual worry: here the more capable models are better aligned on the audit, which suggests Sonnet 5's residual misalignment is a mid-tier artifact rather than a frontier one, though the cause is unresolved.

Cyber capabilities and safeguards#

Sonnet 5 was not deliberately trained on cybersecurity tasks (contrast Opus 4.7, whose cyber capability was differentially reduced during training). It can do routine, non-harmful cyber work, but performs "substantially poorer" than Opus 4.8 and Mythos 5 on dangerous tasks like exploit development.

Firefox exploit eval (built with Mozilla; all vulnerabilities patched in Firefox 148): both Sonnet models scored 0.0% at developing a working exploit. Sonnet 5 showed a slightly higher partial-success rate than Sonnet 4.6 — which Anthropic attributes to general-intelligence gains, not cyber-specific training.
Default safeguards on. Because Sonnet 5 is somewhat stronger than 4.6 here, it launches with the same real-time cyber safeguards as Opus 4.7 and 4.8 (detect-and-block prohibited/high-risk cyber usage at inference). Since its cyber risk is judged low, these safeguards are less strict than those launched with Fable 5 (which block a much wider range of cyber tasks and fall back to a weaker model). Legitimate security researchers route through the Cyber Verification Program; Anthropic recommends Opus 4.8 for cyber work needing reduced guardrails.

This places Sonnet 5 as a distinct point on the safeguard spectrum mapped in Capability-Gated Model Fallback: no deliberate train-down (its low cyber capability is native), inference-time detect-and-block at the Opus-4.7/4.8 strictness level, and no model-swap fallback — narrower than Fable 5's classifier-plus-fallback regime because the underlying uplift risk is judged low.
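
The three axes in the paragraph above (train-down, inference-time blocking, model-swap fallback) can be written down as a small record per model. This is an illustrative encoding only: `SafeguardProfile` and its field names are hypothetical, and any axis this page does not state for a model is left as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SafeguardProfile:
    model: str
    trained_down: Optional[bool]        # cyber capability deliberately reduced in training
    inference_blocking: Optional[bool]  # real-time detect-and-block at inference
    model_fallback: Optional[bool]      # blocked requests fall back to a weaker model


# Values as described on this page; None means "not stated here".
SONNET_5 = SafeguardProfile("Claude Sonnet 5", False, True, False)
OPUS_4_7 = SafeguardProfile("Claude Opus 4.7", True, True, None)
FABLE_5 = SafeguardProfile("Claude Fable 5", None, True, True)


def unstated_axes(profile: SafeguardProfile) -> list:
    """List the axes this page leaves unspecified for a model."""
    axes = ("trained_down", "inference_blocking", "model_fallback")
    return [name for name in axes if getattr(profile, name) is None]
```

Keeping unknowns explicit makes it clear which comparisons the source supports: Sonnet 5 differs from Fable 5 on fallback and from Opus 4.7 on train-down, while the remaining cells would need the system cards to fill in.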

Availability#

Available everywhere from launch (July 2, 2026): default for Free/Pro, available to Max/Team/Enterprise, in Claude Code, and on the Claude Platform (native, AWS, Microsoft Foundry; Google Vertex "coming soon" for the Cyber Verification Program). API id claude-sonnet-5.

Connections#

Claude Opus 4.8 — the capability ceiling Sonnet 5 is measured against: "close to Opus 4.8 at lower prices," matching it at higher effort on some tasks, and safer-than-Sonnet-5 on the behavioral audit; the model Anthropic recommends over Sonnet 5 for reduced-guardrail cyber work
Claude Opus 4.7 — precedent for the two migration-relevant changes: the 1.0–1.35× tokenizer inflation and the default real-time cyber safeguards Sonnet 5 inherits
Claude Fable 5 — the stricter end of the safeguard spectrum; Sonnet 5's cyber safeguards are explicitly "less strict than those launched with Fable 5"
Claude Mythos 5 — cyber-capability reference point Sonnet 5 falls far short of
Mythos Model — Mythos Preview is the best-aligned reference on the behavioral audit that Sonnet 5 trails
Anthropic — vendor
Claude Code — primary agentic runtime; Sonnet 5 ships as an available model there at launch
Capability-Gated Model Fallback — the safeguard-spectrum framing Sonnet 5 adds a low-risk, no-fallback point to
Client-Side Agent Optimization — Sonnet 5's effort-level cost-performance tuning is a first-party instance of the model-per-role / budget / routing lever
Harness Shrinkage as Models Improve — "checks its own output without being asked" is verification scaffolding migrating from harness into model
Agentic Prompt Injection — improved hijack-resistance is a headline agentic-safety gain
Automated Behavioral Audit — the alignment evaluation Sonnet 5 is scored on (safer than 4.6, worse than Opus 4.8 / Mythos Preview)
LLM-Driven Vulnerability Research — the cyber-capability axis Sonnet 5 is weak on (natively, not through deliberate train-down); the Firefox exploit eval is the worked instance
Responsible Scaling Policy Evaluations — Sonnet 5's pre-deployment safety/capability evals and its low-risk cyber determination

Open questions#

The head-to-head benchmark numbers vs Sonnet 4.6 and Opus 4.8 are image-only in the source; the System Card has the full set.
What is the real-world token-inflation multiplier on typical Sonnet 5 traffic (1.0–1.35× is content-dependent), and does "roughly cost-neutral" hold once effort levels rise?
Why does a mid-tier model show higher behavioral-audit misalignment than the more capable Opus 4.8 and Mythos Preview — a capability-alignment coupling, or a training-recipe difference between the Sonnet and Opus/Mythos lines?
At what effort level does Sonnet 5 actually match Opus 4.8, and how does the crossover cost compare to just running Opus 4.8?

Sources#

Introducing Claude Sonnet 5 — Anthropic, "Introducing Claude Sonnet 5" (July 2, 2026; changelog edit June 30, 2026). evidence: vendor-claim