Rapt AI and AMD work to make GPU utilization more efficient

by | Mar 26, 2025 | Technology

Rapt AI, a provider of AI-powered workload automation for GPUs and AI accelerators, has teamed up with AMD to enhance AI infrastructure.

The long-term strategic collaboration aims to improve AI inference and training workload management and performance on AMD Instinct GPUs, offering customers a scalable and cost-effective solution for deploying AI applications.

As AI adoption accelerates, organizations are grappling with resource allocation, performance bottlenecks, and complex GPU management.

By integrating Rapt’s intelligent workload automation platform with AMD Instinct MI300X, MI325X and upcoming MI350 series GPUs, this collaboration delivers a scalable, high-performance, and cost-effective solution that enables customers to maximize AI inference and training efficiency across on-premises and multi-cloud infrastructures.

A more efficient solution

AMD Instinct MI325X GPU.

Charlie Leeming, CEO of Rapt AI, said in a press briefing, “The AI models we are seeing today are so large and, most importantly, so dynamic and unpredictable. The older tools for optimizing don’t really fit at all. We observed these dynamics. Enterprises are throwing lots of money at this, hiring a new set of talent in AI. It’s one of these disruptive technologies. We have a scenario where CFOs and CIOs are asking where the return is. In some cases, there are tens of millions, hundreds of millions, or billions of dollars spent on GPU-related infrastructure.”

Leeming credited Anil Ravindranath, CTO of Rapt AI, with seeing the solution: deploying monitors to enable observation of the infrastructure.

“We feel we have the right solution at the right time. We came out of stealth last fall. We are in a growing number of Fortune 100 companies. Two are running the code among cloud service providers,” Leeming said.

He added, “We do have strategic partners, but our conversations with AMD went extremely well. They are building tremendous GPUs and AI accelerators. We are known for putting the maximum amount of workload on GPUs. Inference is taking off; it’s in the production stage now. AI workloads are exploding. Their data scientists are running as fast as they can. They are panicking; they need tools, they need efficiency, they need automation. It’s screaming for the right solution. Inefficiencies run to 30% GPU underutilization. Customers do want flexibility. Large customers are asking if you support AMD.”

Improvements that once took nine hours can now be done in three minutes, he said. Ravindranath said in a press briefing that the Rapt AI platform enables up to 10 times the model run capacity at the same AI compute spend, up to 90% cost savings, zero humans in the loop, and no code changes. For productivity, this means no more waiting for compute or time spent tuning infrastructure.

Leeming said other techniques have been around for a while and haven’t cut it. Run AI, a rival, overlaps somewhat competitively. He said his company observes in minutes instead of hours and then optimizes the infrastructure. Ravind …
