Nvidia debuts Nemotron 3 with hybrid MoE and Mamba-Transformer to drive efficient agentic AI

by | Dec 15, 2025 | Technology

Nvidia launched the new version of its frontier models, Nemotron 3, leaning on a model architecture that the world’s most valuable company said offers more accuracy and reliability for agents.

Nemotron 3 will be available in three sizes: Nemotron 3 Nano, with 30B parameters, mainly for targeted, highly efficient tasks; Nemotron 3 Super, a 100B-parameter model for multi-agent applications and high-accuracy reasoning; and Nemotron 3 Ultra, with its large reasoning engine and around 500B parameters, for more complex applications.

To build the Nemotron 3 models, Nvidia said it leaned into a hybrid mixture-of-experts (MoE) architecture to improve scalability and efficiency. By using this architecture, Nvidia said in a press release, its new models also offer enterprises more openness and performance when building multi-agent autonomous systems.

Kari Briski, Nvidia vice president for generative AI software, told reporters in a briefing that the company wanted to demonstrate its commitment to learning from and improving on previous iterations of its models.

“We believe that we are uniquely positioned to serve a wide range of developers who want full flexibility to customize models for building specialized AI by combining that new hybrid mixture of our mixture of experts architecture with a 1 million token context length,” Briski said.

Nvidia said early adopters of the Nemotron 3 models include Accenture, CrowdStrike, Cursor, Deloitte, EY, Oracle Cloud Infrastructure, Palantir, Perplexity, ServiceNow, Siemens and Zoom.

Breakthrough architectures

Nvidia has been using the hybrid Mamba-Transformer mixture-of-experts architecture for many of its models, including Nemotron-Nano-9B-v2.

The architecture is based on research from Carnegie Mellon University and Princeton, which weaves in selective state-space models to handle long pieces of information while maintaining state. It can reduce compute costs even throu …
