The big news this week from Nvidia, splashed in headlines across all forms of media, was the company's announcement of its Vera Rubin GPU.

Nvidia CEO Jensen Huang used his CES keynote to highlight performance metrics for the new chip. According to Huang, the Rubin GPU is capable of 50 PFLOPs of NVFP4 inference and 35 PFLOPs of NVFP4 training performance, representing 5x and 3.5x the performance of Blackwell, respectively.

But it won't be available until the second half of 2026. So what should enterprises be doing now?

Blackwell keeps on getting better

The current, shipping Nvidia GPU architecture is Blackwell, which was announced in 2024 as the successor to Hopper. Alongside that release, Nvidia emphasized that its product engineering path also included squeezing as much performance as possible out of the prior Grace Hopper architecture. The same direction will hold true for Blackwell, with Vera Rubin coming later this year.

"We continue to optimize our inference and training stacks for the Blackwell architecture," Dave Salvator, director of accelerated computing products at Nvidia, told VentureBeat.

In the same week that Nvidia's CEO touted Vera Rubin as the company's most powerful GPU ever, the company published new research showing improved Blackwell performance.

How Blackwell inference performance has improved by 2.8x

Nvidia has been able to increase Blackwell GPU performance by up to 2.8x per GPU in just three months.

The performance gains come from a series of innovations added to the Nvidia TensorRT-LLM inference engine. Because the optimizations are delivered in software, existing Blackwell deployments can achieve higher throughput without any hardware changes.

The performance gains are measured on DeepSeek-R1, a 671-billion-parameter mixture-of-experts (MoE) model that activates 37 billion parameters per token. Among the technical innovations that provide the performan …
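For teams that want a sense of what picking up these software-side gains looks like in practice, here is a minimal, illustrative sketch of serving a large MoE model through TensorRT-LLM's Python LLM API. The model identifier, tensor-parallel setting, and sampling values below are assumptions chosen for illustration, not the configuration Nvidia used in its benchmarks.

```python
# Illustrative sketch only: the model ID, tensor_parallel_size, and sampling
# values are assumptions, not Nvidia's benchmark configuration.
from tensorrt_llm import LLM, SamplingParams


def main():
    # Load the model through TensorRT-LLM's high-level LLM API.
    # A 671B-parameter MoE model such as DeepSeek-R1 needs a multi-GPU node;
    # tensor_parallel_size=8 is a placeholder for an 8-GPU Blackwell system.
    llm = LLM(
        model="deepseek-ai/DeepSeek-R1",  # placeholder model ID for illustration
        tensor_parallel_size=8,
    )

    # Basic sampling settings; tune for your workload.
    sampling = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=256)

    prompts = [
        "Explain why a mixture-of-experts model activates only a fraction "
        "of its parameters per token."
    ]
    outputs = llm.generate(prompts, sampling)

    for output in outputs:
        print(output.outputs[0].text)


if __name__ == "__main__":
    main()
```

The point is simply that the optimizations ship inside the TensorRT-LLM library itself, so updating the software stack, rather than the hardware, is how existing Blackwell deployments pick up the reported throughput gains.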