Claude Sonnet 5 Matches Opus 4.8 Performance at Lower Cost
In brief
- Claude Sonnet 5 released Tuesday with performance matching Opus 4.8 at lower prices
- Sonnet 5 scored 1,618 on GDPval-AA v2 versus Opus 4.8's 1,616, a statistical tie
- Updated tokenizer increases token consumption by 1.0–1.35×, offset by introductory pricing through August 31
- Fable 5 and Mythos 5 have been blocked for foreign nationals under U.S. export controls since June 12
Performance parity at a lower price point
Anthropic says Sonnet 5's performance is "close to that of Opus 4.8, but at lower prices." On GDPval-AA v2, an Artificial Analysis benchmark scoring real-world professional tasks across 44 jobs, Sonnet 5 scored 1,618, a statistical tie with Opus 4.8's 1,616. On Humanity's Last Exam, the two were nearly as close: Sonnet 5 hit 57.4% versus Opus 4.8's 57.9%, a gap of half a percentage point.
Coding benchmarks show steeper gains. On SWE-bench Pro, Sonnet 5 achieved 63.2% compared to Sonnet 4.6's 58.1%. That's a meaningful jump for a model sitting between the prior Sonnet and flagship Opus tiers.
The trade-off arrives in token consumption. Sonnet 5 ships with an updated tokenizer that consumes more tokens for the same input—roughly 1.0–1.35× depending on content type. To ease the transition, Anthropic set introductory rates of $2/$10 to make the switch close to cost-neutral through August 31. After that window closes, pricing adjusts upward.
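The cost-neutrality claim comes down to simple arithmetic, and a rough sketch makes it easier to check against your own workload. The example below is illustrative only. It assumes the $2/$10 figures are dollars per million input and output tokens, and it uses a hypothetical $3/$15 as the previous rate, a placeholder that does not come from Anthropic's announcement. The names Rate, effective_rate, and break_even_multiplier are invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """Price in dollars per million tokens (assumed unit)."""
    input_per_mtok: float
    output_per_mtok: float


def effective_rate(rate: Rate, multiplier: float) -> Rate:
    """Cost of the same text once the tokenizer emits `multiplier` times as many tokens."""
    return Rate(rate.input_per_mtok * multiplier,
                rate.output_per_mtok * multiplier)


def break_even_multiplier(previous: Rate, new: Rate) -> float:
    """Largest token multiplier at which the new rate costs no more than the previous one."""
    return min(previous.input_per_mtok / new.input_per_mtok,
               previous.output_per_mtok / new.output_per_mtok)


# Introductory rate from the article; the per-million-token unit is an assumption.
INTRO = Rate(2.0, 10.0)
# ASSUMPTION: placeholder for the previous rate, not stated in the article.
PREVIOUS = Rate(3.0, 15.0)

if __name__ == "__main__":
    # The article's reported range for the tokenizer multiplier.
    for multiplier in (1.0, 1.35):
        eff = effective_rate(INTRO, multiplier)
        print(f"{multiplier:.2f}x -> ${eff.input_per_mtok:.2f} in / "
              f"${eff.output_per_mtok:.2f} out per million original tokens")
    print(f"break-even multiplier: {break_even_multiplier(PREVIOUS, INTRO):.2f}")
```

Swap in your actual previous rate and a measured multiplier for your own content; the article's 1.0–1.35× range depends on content type, so a single figure will not hold across workloads.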
Constitutional guardrails and capability quirks
Sonnet 5 ships with a notable limitation: it was never trained on cybersecurity tasks and scored 0% on developing a working Firefox exploit. That's by design—a safety fence Anthropic built into the training process itself.
The model also broke new ground in self-awareness. It is the first model to criticize the rule in its Constitution requiring it to follow hard constraints even when it considers them unethical. That kind of reflexivity, in which a model flags potential conflicts in its own alignment rules, adds a layer of transparency to how these systems reason about their constraints.
In practical tests, Sonnet 5 generated a browser game on the first try with cleaner visuals and tighter logic than Sonnet 4.6. But the token meter moved fast: that single iteration ate 90% of the 5-hour usage quota on the Claude Pro plan.
Export controls shadow the release
Fable 5 and Mythos 5 have been suspended for foreign nationals since June 12 under a U.S. export control directive. Sonnet 5 itself carries no such restriction and is available globally, a pragmatic move that lets Anthropic ship a competitive model to its full user base while the higher-tier variants stay restricted.
Meanwhile, developers spent weeks this spring discussing how Anthropic let Opus 4.6 quietly lose its edge, a pattern dubbed AI shrinkflation. Sonnet 5's arrival addresses that friction, offering performance close to the flagship's at mid-tier prices.


