FOMO is why enterprises pay for GPUs they don’t use — and why prices keep climbing

Apr 29, 2026 | Technology

Enterprises can’t fix their GPU waste problem because the fix makes the problem worse. Releasing idle capacity would improve utilization, but the same shortage driving GPU prices up is exactly why no team will give capacity back. So the fleet sits at roughly 5%, billed by the hour, and the cycle tightens.

That pressure, repeated across thousands of enterprises over the past two years, is the reason most companies are now running their GPU fleets at roughly 5% utilization, according to Cast AI’s 2026 State of Kubernetes Optimization Report, which measured actual production clusters rather than surveying them. It’s also the reason nobody releases the idle capacity. Cast AI co-founder and President Laurent Gil has been tracking the dynamic for two years. “Many of the neoclouds are not cloud,” he told VentureBeat. “They are neo-real estate.”

Five percent is about six times worse than a no-effort baseline. Gil puts a reasonable human-managed target at around 30% once you factor in day cycles, weekends and normal business patterns. Five percent means enterprises are running their most expensive infrastructure line at a fraction of what doing nothing intentional would yield.

And it lands at the same moment cloud compute pricing has broken its 20-year pattern. AWS quietly raised its reserved H200 GPU prices by roughly 15% on a Saturday in January, with no formal announcement. Memory suppliers pushed HBM3e prices up 20% for 2026. It is the first time since AWS launched EC2 in 2006 that a hyperscaler has meaningfully raised reserved GPU pricing rather than cut it. For now, the assumption under most enterprise AI budgets (that cloud compute gets cheaper every year) no longer holds at the top of the stack.

The cloud market has split in two

The pricing move matters less for what it is than for what it signals about where the shortage a …
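The utilization arithmetic earlier in the piece (roughly 5% measured against a target of around 30%) can be made concrete with a short sketch. The two utilization figures come from the article; the hourly rate is a made-up round number used only for illustration, and the function names are hypothetical rather than anything from Cast AI's report.

```python
def utilization_gap(actual: float, target: float) -> float:
    """Return how many times lower actual utilization is than the target."""
    if not (0 < actual <= 1 and 0 < target <= 1):
        raise ValueError("utilization must be in (0, 1]")
    return target / actual


def cost_per_useful_hour(hourly_rate: float, utilization: float) -> float:
    """Effective price of one GPU-hour of real work when every hour is billed."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_rate / utilization


# Figures quoted in the article.
MEASURED_UTILIZATION = 0.05   # ~5% measured across production clusters
TARGET_UTILIZATION = 0.30     # ~30% human-managed target cited by Gil

# ASSUMPTION: a hypothetical hourly price, not a real list price.
ASSUMED_HOURLY_RATE = 10.0

gap = utilization_gap(MEASURED_UTILIZATION, TARGET_UTILIZATION)
at_measured = cost_per_useful_hour(ASSUMED_HOURLY_RATE, MEASURED_UTILIZATION)
at_target = cost_per_useful_hour(ASSUMED_HOURLY_RATE, TARGET_UTILIZATION)

print(f"Gap versus target: {gap:.1f}x")
print(f"Cost per useful GPU-hour at measured utilization: {at_measured:.2f}")
print(f"Cost per useful GPU-hour at target utilization: {at_target:.2f}")
```

Under these assumptions, every hour of real GPU work costs 20 times the billed hourly rate at 5% utilization, so a price increase on the billed rate is multiplied by the same factor per useful hour.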
