Anthropic releases Claude Opus 4.7, narrowly retaking lead for most powerful generally available LLM

Apr 16, 2026 | Technology

Anthropic is publicly releasing its most powerful large language model yet, Claude Opus 4.7, today, even as it continues to restrict an even more powerful successor, Mythos, to a small number of external enterprise partners for cybersecurity testing. Mythos rapidly exposed vulnerabilities in the software those enterprises use, which are now being patched.

The big headlines are that Opus 4.7 exceeds its most direct rivals, OpenAI's GPT-5.4 (released in early March 2026, scarcely more than a month ago) and Google's latest flagship model, Gemini 3.1 Pro (from February), on key benchmarks including agentic coding, scaled tool use, agentic computer use, and financial analysis. But it is also notable how tight the race is getting: on directly comparable benchmarks, Opus 4.7 leads GPT-5.4 by only a 7-4 margin.

The model currently leads the market on the GDPVal-AA knowledge-work evaluation with an Elo score of 1753, surpassing both GPT-5.4 (1674) and Gemini 3.1 Pro (1314). Yet it does not represent a "clean sweep" across all categories. GPT-5.4 and Gemini 3.1 Pro still hold the lead in specific domains such as agentic search, where GPT-5.4 scores 89.3% to Opus 4.7's 79.3%, as well as in multilingual Q&A and raw terminal-based coding. This positions Opus 4.7 not as a unilateral victor across all AI tasks, but as a specialized powerhouse optimized for the reliability and long-horizon autonomy the burgeoning agentic economy requires.

Claude Opus 4.7 is available today across all major cloud platforms, including Amazon Bedrock, Google Cloud's Vertex AI, and Microsoft Foundry, with API pricing held steady at $5/$25 per million tokens.

Improvements in hard sciences and agentic workflows

Claude Opus 4.7 is a direct evolution of the Opus 4.6 architecture, but its performance delta is most visible in the "hard" sciences of agentic workflows: software engineering and complex document reasoning.
At its core, the model has been re-tuned to exhibit what Anthropic describes as “rigor”. This isn’t just marketing parlance; it refers to the model’s new ability to devise its own verification steps before reporting a task as complete. For e …
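For readers budgeting against the unchanged pricing, here is a minimal sketch of per-request cost at the listed rates of $5 per million input tokens and $25 per million output tokens. The function name and the example token counts are illustrative, not from Anthropic's documentation:

```python
# Illustrative cost estimate at the article's listed API pricing:
# $5 per million input tokens, $25 per million output tokens.
PRICE_IN_PER_MTOK = 5.00
PRICE_OUT_PER_MTOK = 25.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD list-price cost of a single request."""
    return (input_tokens / 1_000_000) * PRICE_IN_PER_MTOK + \
           (output_tokens / 1_000_000) * PRICE_OUT_PER_MTOK

# Example: a 20,000-token prompt that yields a 4,000-token answer.
print(f"${estimate_cost(20_000, 4_000):.2f}")  # $0.20
```

At these rates, output tokens cost five times as much as input tokens, so long agentic transcripts dominated by model output scale in price accordingly.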
