Salesforce takes aim at ‘jagged intelligence’ in push for more reliable AI

May 1, 2025 | Technology


Salesforce is tackling one of artificial intelligence’s most persistent challenges for business applications: the gap between an AI system’s raw intelligence and its ability to consistently perform in unpredictable enterprise environments — what the company calls “jagged intelligence.”

In a comprehensive research announcement today, Salesforce AI Research revealed several new benchmarks, models, and frameworks designed to make future AI agents more intelligent, trusted, and versatile for enterprise use. The innovations aim to improve both the capabilities and consistency of AI systems, particularly when deployed as autonomous agents in complex business settings.

“While LLMs may excel at standardized tests, plan intricate trips, and generate sophisticated poetry, their brilliance often stumbles when faced with the need for reliable and consistent task execution in dynamic, unpredictable enterprise environments,” said Silvio Savarese, Salesforce’s Chief Scientist and Head of AI Research, during a press conference preceding the announcement.

The initiative represents Salesforce’s push toward what Savarese calls “Enterprise General Intelligence” (EGI) — AI designed specifically for business complexity rather than the more theoretical pursuit of Artificial General Intelligence (AGI).

“We define EGI as purpose-built AI agents for business optimized not just for capability, but for consistency, too,” Savarese explained. “While AGI may conjure images of superintelligent machines surpassing human intelligence, businesses aren’t waiting for that distant, illusory future. They’re applying these foundational concepts now to solve real-world challenges at scale.”

How Salesforce is measuring and fixing AI’s inconsistency problem in enterprise settings

A central focus of the research is quantifying and addressing AI’s inconsistency in performance. Salesforce introduced the SIMPLE dataset, a public benchmark featuring 225 straightforward reasoning questions designed to measure how jagged an AI system’s capabilities really are.

“Today’s AI is jagged, so we need to work on that. But how can we work on something without measuring it first? That’s exactly what this SIMPLE benchmark is,” explained Shelby Heinecke, Senior Manager of Research at Salesforce, during the press conference.
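The announcement does not detail how SIMPLE scores responses, but the idea of measuring "jaggedness" can be illustrated with a short sketch. The snippet below, a rough assumption rather than Salesforce's published methodology, asks each benchmark question several times and tracks how often the answers agree; the question format and the model_answer helper are hypothetical placeholders.

```python
# Hedged sketch: one way to quantify "jaggedness" on a SIMPLE-style benchmark.
# The dataset format, model_answer() helper, and metrics are illustrative
# assumptions, not the benchmark's actual scoring procedure.

def model_answer(question: str) -> str:
    """Placeholder for a call to the model under test (assumption)."""
    raise NotImplementedError

def jaggedness_report(questions, n_trials=5):
    """Ask each question several times and track run-to-run agreement.

    High overall accuracy combined with low agreement across repeated
    runs is one signal of "jagged" behavior: the model can answer,
    but not consistently.
    """
    per_question = {}
    for q in questions:  # each q assumed to be {"id": ..., "text": ..., "answer": ...}
        answers = [model_answer(q["text"]) for _ in range(n_trials)]
        correct = sum(a.strip() == q["answer"].strip() for a in answers)
        per_question[q["id"]] = correct / n_trials

    accuracy = sum(per_question.values()) / len(per_question)
    # Fraction of questions answered inconsistently
    # (neither always right nor always wrong across trials).
    jagged = sum(0 < v < 1 for v in per_question.values()) / len(per_question)
    return {"mean_accuracy": accuracy, "jagged_fraction": jagged}
```

Under this framing, a model could post a respectable mean accuracy while still showing a high jagged fraction, which is the kind of inconsistency the benchmark is meant to expose.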

For enterprise applications, this inconsistency isn’t merely an academic concern. A single misstep from an AI agent could disrupt operations, erode customer trust, or inflict su …
