Howardism

Claude Sonnet 5

Published: July 2, 2026 | Filed: Entity | Domain: Entities | Tags: Entity, Claude, Anthropic, LLM Model | Reading: 8 min | Source: AI-synthesised

Anthropic's most agentic Sonnet yet (July 2026); narrows the gap to Opus 4.8 at lower price via effort-level cost-performance tuning; 1.0–1.35× tokenizer inflation; safer than Sonnet 4.6 on the behavioral audit but weaker cyber than Opus; ships default real-time cyber safeguards

Summary#

Claude Sonnet 5 is Anthropic's "most agentic Sonnet yet" (announced July 2, 2026), a direct upgrade to Sonnet 4.6 that makes plans, uses tools (browsers, terminals), and runs autonomously "at a level that, just a few months ago, required larger and more expensive models." Anthropic's positioning: the Sonnet class started the agentic era (3.5 / 3.6 / 3.7 were the first models with strong coding and tool use), but recent agentic gains had concentrated in the Opus class. Sonnet 5 narrows that gap, landing close to Opus 4.8 at lower prices. API model id: claude-sonnet-5. The source is a first-party release announcement and should be read as a vendor claim: the benchmark deltas below are Anthropic's own, and the fuller evaluation lives in the Claude Sonnet 5 System Card.

Pricing and identity#

  • Introductory pricing (through Aug 31, 2026): $2 / M input tokens, $10 / M output tokens.
  • Standard pricing (from Sep 1, 2026): $3 / M input, $15 / M output — vs Opus 4.8 at $5 / $25.
  • API model id claude-sonnet-5; the default model for Free and Pro plans, and available to Max, Team, Enterprise, in Claude Code, and on the Claude Platform.
  • Rate limits raised across Chat, Cowork, Claude Code, and the Claude Platform to accommodate higher-effort token usage.
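
The two price tiers above reduce to a date-dependent rate lookup. The following is a minimal sketch, assuming plain per-token billing with no caching or batch discounts (the source describes neither). The function name `sonnet5_cost_usd` is hypothetical; only the rates and the cutover date come from the list above.

```python
from datetime import date

# USD per million tokens, as quoted in the pricing list above.
INTRO_END = date(2026, 8, 31)  # last day of introductory pricing
INTRO = {"input": 2.00, "output": 10.00}
STANDARD = {"input": 3.00, "output": 15.00}


def sonnet5_cost_usd(input_tokens: int, output_tokens: int, on: date) -> float:
    """Estimate the cost of one request billed on the given calendar date.

    Hypothetical helper: ignores any discounts or surcharges that the
    source article does not describe.
    """
    rates = INTRO if on <= INTRO_END else STANDARD
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000
```

By this estimate, a request with 1M input tokens and 200k output tokens comes to about $4 under introductory pricing and about $6 once standard pricing applies.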

Capability profile#

The headline claim is cost-performance range, not a single peak. Anthropic frames Sonnet 5 against Sonnet 4.6 (its predecessor) and Opus 4.8 (a more capable reference model) on two agentic evals — BrowseComp (agentic search) and OSWorld-Verified (computer use):

  • Strict improvement over Sonnet 4.6 across reasoning, tool use, coding, and knowledge work.
  • Wider cost-performance range than Opus 4.8, tunable via the effort parameter (up to xhigh): substantially better cost-efficiency at medium effort, and higher-effort runs "can match Opus 4.8 on some tasks." The pitch is that between Sonnet 5 and Opus 4.8 a user dials effort to find the right cost/performance balance — a first-party instance of the effort/budget lever in Client-Side Agent Optimization.
  • Early-access partners reported it "finishes complex tasks where previous Sonnet models would stop short" and "checks its own output without explicitly being asked" — spontaneous self-verification, a capability the shrinking-harness thesis predicts (the verification scaffold the harness used to supply migrates into the model).
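
One way to reason about the "dial effort" pitch is to ask how many extra tokens a higher-effort Sonnet 5 run can spend before it costs as much as Opus 4.8 on the same task. The sketch below uses only the list prices from the pricing section. It assumes, without confirmation from the source, that higher effort scales input and output tokens by the same multiple and that both models count tokens comparably. All function names are hypothetical.

```python
# USD per million tokens as (input, output), from the pricing section.
SONNET_5_INTRO = (2.0, 10.0)     # through Aug 31, 2026
SONNET_5_STANDARD = (3.0, 15.0)  # from Sep 1, 2026
OPUS_4_8 = (5.0, 25.0)


def cost_usd(rates, input_tokens, output_tokens):
    """Plain per-token cost; no discounts modelled."""
    in_rate, out_rate = rates
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def sonnet_is_cheaper(sonnet_rates, token_multiple, input_tokens, output_tokens):
    """True if Sonnet 5, spending `token_multiple` times the tokens that
    Opus 4.8 needs for the same task, still costs less than Opus 4.8.

    Assumption (not from the source): the multiple applies equally to
    input and output tokens.
    """
    sonnet = cost_usd(
        sonnet_rates,
        input_tokens * token_multiple,
        output_tokens * token_multiple,
    )
    opus = cost_usd(OPUS_4_8, input_tokens, output_tokens)
    return sonnet < opus
```

Because the input and output price ratios coincide (5/3 and 25/15), the break-even multiple under these assumptions is about 1.67 at standard pricing and 2.5 at introductory pricing, regardless of the input/output mix. Whether a given effort level stays under that multiple is something to measure on real tasks.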

(The head-to-head benchmark table vs Sonnet 4.6 and Opus 4.8 is published only as an image in the source and is not transcribed here.)

Benchmark errata (methodology, not model): a June 30, 2026 changelog corrected the BrowseComp cost-performance chart to use the standard agentic-search methodology (10M-token budget with compaction + programmatic tool calling); the earlier chart had underestimated Sonnet 5. Sonnet 4.6 baselines were also restated after eval-methodology changes: Humanity's Last Exam was re-graded to 34.6% (no tools) / 46.8% (tools), and OSWorld-Verified to 78.5%. These numbers therefore differ from those in the Sonnet 4.6 launch blog.

Token-economics (migration hazard)#

Sonnet 5 uses an updated tokenizer — the same kind of change Opus 4.7 introduced — so the same input maps to roughly 1.0–1.35× more tokens depending on content type. Anthropic set the introductory pricing so the transition from Sonnet 4.6 is "roughly cost-neutral." As with Opus 4.7, real-traffic token inflation is content-dependent and worth measuring rather than assuming (cross-reference the context-window-as-primary-constraint theme in Claude Code Best Practices).
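
Measuring the inflation on real traffic can be sketched as follows. The sketch assumes the reader already has per-request token counts for the same texts under both tokenizers, from whatever usage reporting is available. It calls no API, and the helper names and the `old_rate` parameter are hypothetical.

```python
def inflation_ratio(old_counts, new_counts):
    """Traffic-weighted inflation: total tokens under the new tokenizer
    divided by total tokens under the old one, for the same texts."""
    old_total = sum(old_counts)
    if old_total == 0:
        raise ValueError("old_counts must contain at least one token")
    return sum(new_counts) / old_total


def effective_price_multiple(inflation, new_rate, old_rate):
    """Change in cost per unit of text after migrating:
    (new tokens x new rate) / (old tokens x old rate).

    `old_rate` is the predecessor's per-token price, which the source
    article does not state; the caller must supply it.
    """
    return inflation * new_rate / old_rate
```

As an illustration, take a measured ratio at the top of the quoted range (1.35) and the introductory $2 input rate. If the predecessor's input rate were $3 per million tokens (an assumption, since the source does not give Sonnet 4.6 pricing), the multiple would be 0.9, slightly cheaper per unit of text. At the standard $3 rate the same traffic would cost 1.35 times as much.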

Safety and alignment profile#

Anthropic's pre-deployment evaluations report Sonnet 5 as an overall improvement on Sonnet 4.6:

  • Agentic safety: better at refusing malicious requests and resisting hijack attempts in prompt-injection attacks.
  • Honesty: lower rates of hallucination and sycophancy than Sonnet 4.6.
  • Automated behavioral audit (cooperation-with-misuse, deception, and other misaligned behaviors across many contexts): Sonnet 5 scored lower (i.e. safer) overall than Sonnet 4.6, but higher (worse) than the more capable Opus 4.8 and Claude Mythos Preview. This is the mirror image of the usual worry: here the more capable models are better aligned on the audit, which suggests Sonnet 5's residual misalignment is a mid-tier-capability artifact rather than a frontier one.

Cyber capabilities and safeguards#

Sonnet 5 was not deliberately trained on cybersecurity tasks (contrast Opus 4.7, whose cyber capability was differentially reduced during training). It can do routine, non-harmful cyber work, but performs "substantially poorer" than Opus 4.8 and Mythos 5 on dangerous tasks like exploit development.

  • Firefox exploit eval (built with Mozilla; all vulnerabilities patched in Firefox 148): both Sonnet models scored 0.0% at developing a working exploit. Sonnet 5 showed a slightly higher partial-success rate than Sonnet 4.6 — which Anthropic attributes to general-intelligence gains, not cyber-specific training.
  • Default safeguards on. Because Sonnet 5 is somewhat stronger than 4.6 here, it launches with the same real-time cyber safeguards as Opus 4.7 and 4.8 (detect-and-block prohibited/high-risk cyber usage at inference). Because Sonnet 5 is judged low-risk, these safeguards are less strict than those launched with Fable 5 (which block a much wider range of cyber tasks and fall back to a weaker model). Legitimate security researchers route through the Cyber Verification Program; Anthropic recommends Opus 4.8 for cyber work needing reduced guardrails.

This places Sonnet 5 as a distinct point on the safeguard spectrum mapped in Capability-Gated Model Fallback: no deliberate train-down (its low cyber capability is native), inference-time detect-and-block at the Opus-4.7/4.8 strictness level, and no model-swap fallback — narrower than Fable 5's classifier-plus-fallback regime because the underlying uplift risk is judged low.

Availability#

Available everywhere from launch (July 2, 2026): default for Free/Pro, available to Max/Team/Enterprise, in Claude Code, and on the Claude Platform (native, AWS, Microsoft Foundry; Google Vertex "coming soon" for the Cyber Verification Program). API id claude-sonnet-5.

Connections#

  • Claude Opus 4.8 — the capability ceiling Sonnet 5 is measured against: "close to Opus 4.8 at lower prices," matching it at higher effort on some tasks, and safer-than-Sonnet-5 on the behavioral audit; the model Anthropic recommends over Sonnet 5 for reduced-guardrail cyber work
  • Claude Opus 4.7 — precedent for the two migration-relevant changes: the 1.0–1.35× tokenizer inflation and the default real-time cyber safeguards Sonnet 5 inherits
  • Claude Fable 5 — the stricter end of the safeguard spectrum; Sonnet 5's cyber safeguards are explicitly "less strict than those launched with Fable 5"
  • Claude Mythos 5 — cyber-capability reference point Sonnet 5 falls far short of
  • Mythos Model: Mythos Preview is a better-aligned reference on the behavioral audit, which Sonnet 5 trails
  • Anthropic — vendor
  • Claude Code — primary agentic runtime; Sonnet 5 ships as an available model there at launch
  • Capability-Gated Model Fallback — the safeguard-spectrum framing Sonnet 5 adds a low-risk, no-fallback point to
  • Client-Side Agent Optimization — Sonnet 5's effort-level cost-performance tuning is a first-party instance of the model-per-role / budget / routing lever
  • Harness Shrinkage as Models Improve — "checks its own output without being asked" is verification scaffolding migrating from harness into model
  • Agentic Prompt Injection — improved hijack-resistance is a headline agentic-safety gain
  • Automated Behavioral Audit — the alignment evaluation Sonnet 5 is scored on (safer than 4.6, worse than Opus 4.8 / Mythos Preview)
  • LLM-Driven Vulnerability Research: the cyber-capability axis on which Sonnet 5 is weak (natively, not through a deliberate train-down); the Firefox exploit eval is the worked instance
  • Responsible Scaling Policy Evaluations — Sonnet 5's pre-deployment safety/capability evals and its low-risk cyber determination

Open questions#

  • The head-to-head benchmark numbers vs Sonnet 4.6 and Opus 4.8 are image-only in the source; the System Card has the full set.
  • What is the real-world token-inflation multiplier on typical Sonnet 5 traffic (1.0–1.35× is content-dependent), and does "roughly cost-neutral" hold once effort levels rise?
  • Why does a mid-tier model show higher behavioral-audit misalignment than the more capable Opus 4.8 and Mythos Preview — a capability-alignment coupling, or a training-recipe difference between the Sonnet and Opus/Mythos lines?
  • At what effort level does Sonnet 5 actually match Opus 4.8, and how does the crossover cost compare to just running Opus 4.8?

Sources#

  • Anthropic, "Introducing Claude Sonnet 5" (July 2, 2026; changelog edit June 30, 2026). Evidence: vendor-claim
About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.
