Mystery solved: Anthropic reveals changes to Claude’s harnesses and operating instructions likely caused degradation

by News Feed Editor | Apr 23, 2026 | Technology

For several weeks, a growing chorus of developers and AI power users claimed that Anthropic’s flagship models were losing their edge. Users across GitHub, X, and Reddit reported a phenomenon they described as “AI shrinkflation”—a perceived degradation where Claude seemed less capable of sustained reasoning, more prone to hallucinations, and increasingly wasteful with tokens. Critics pointed to a measurable shift in behavior, alleging that the model had moved from a “research-first” approach to a lazier, “edit-first” style that could no longer be trusted for complex engineering. While the company initially pushed back against claims of “nerfing” the model to manage demand, the mounting evidence from high-profile users and third-party benchmarks created a significant trust gap. Today, Anthropic addressed these concerns directly, publishing a technical post-mortem that identified three separate product-layer changes responsible for the reported quality issues. “We take reports about degradation very seriously,” reads Anthropic’s blog post on the matter. “We never intentionally degrade our models, and we were able to immediately confirm that our API and inference layer were unaffected.”Anthropic claims it has resolved the issues by reverting the reasoning effort change and the verbosity prompt, while fixing the caching bug in version v2.1.116. The mounting evidence of degradationThe controversy gained momentum in early April 2026, fueled by detailed technical analyses from the developer community. Stella Laurenzo, a Senior Director in AMD’s AI group, published an exhaustive audit of 6,852 Claude Code session files and over 234,000 tool calls on Github showing performance falling from her usage before. Her findings suggested that Claude’s reasoning depth had fallen sharply, leading to reasoning loops and a tendency to choose the “simplest fix” rather than the correct one.This anecdotal frustration was seemingly validated by third-party benchmarks. BridgeMind reported that Claude Opus 4.6’s accuracy had dropped from 83.3% to 68.3% in their tests, causing its ranking to plummet from No. 2 to No. 10. Although so …

Article Attribution | Read More at Article Source

Mystery solved: Anthropic reveals changes to Claude’s harnesses and operating instructions likely caused degradation

About RN

Website Awards

More Info