It’s been a little more than a month since Chinese AI startup DeepSeek, an offshoot of Hangzhou-based High-Flyer Capital Management, released R1-0528, the latest version of its hit open source model DeepSeek-R1.
Like its predecessor, DeepSeek-R1 — which rocked the AI and global business communities with how cheaply it was trained and how well it performed on reasoning tasks, all available to developers and enterprises for free — R1-0528 is already being adapted and remixed by other AI labs and developers, thanks in large part to its permissive MIT license.
This week, the 24-year-old German firm TNG Technology Consulting GmbH released one such adaptation: DeepSeek-TNG R1T2 Chimera, the latest model in its Chimera large language model (LLM) family. R1T2 delivers a notable boost in efficiency and speed, scoring upwards of 90% of R1-0528’s intelligence benchmark scores while generating answers with less than 40% of R1-0528’s output token count.
That means it produces shorter responses, translating directly into faster inference and lower compute costs. On the model card TNG released for its new R1T2 on the AI code sharing community Hugging Face, the company states that it is “about 20% faster than the regular R1” (the one released back in January) “and more than twice as fast as R1-0528” (the May official update from DeepSeek).
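For a rough sense of where the “more than twice as fast” figure comes from: if per-token generation speed and per-token pricing are held roughly equal across models (an assumption; real serving stacks differ), cutting output length to under 40% cuts end-to-end latency and cost proportionally. A back-of-the-envelope sketch, using a hypothetical response length:

```python
# Back-of-the-envelope estimate: output token count drives end-to-end
# latency and cost when per-token speed and price are held constant
# (an assumption; real serving costs vary by provider and hardware).
R1_0528_OUTPUT_TOKENS = 10_000   # hypothetical length of one long reasoning answer
R1T2_TOKEN_RATIO = 0.40          # R1T2 emits <40% of R1-0528's output tokens (per TNG)

r1t2_tokens = R1_0528_OUTPUT_TOKENS * R1T2_TOKEN_RATIO
speedup = R1_0528_OUTPUT_TOKENS / r1t2_tokens

print(f"R1T2 output: ~{r1t2_tokens:.0f} tokens, end-to-end: ~{speedup:.1f}x faster")
# -> ~2.5x, in line with TNG's "more than twice as fast as R1-0528"
```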
Already, the response has been incredibly positive from the AI developer community. “DAMN! DeepSeek R1T2 – 200% faster than R1-0528 & 20% faster than R1,” wrote Vaibhav (VB) Srivastav, a senior leader at Hugging Face, on X. “Significantly better than R1 on GPQA & AIME 24, made via Assembly of Experts with DS V3, R1 & R1-0528 — and it’s MIT-licensed, available on Hugging Face.”
This gain is made possible by TNG’s Assembly-of-Experts (AoE) method — a technique for building LLMs by selectively merging the weight tensors (internal parameters) from multiple pre-trained models — which TNG described in a paper published in May on arXiv, the open-access preprint server.
A successor to the original R1T Chimera, R1T2 introduces a new “Tri-Mind” configuration that integrates three parent models: DeepSeek-R1-0528, DeepSeek-R1, and DeepSeek-V3.
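TNG’s exact merge recipe isn’t reproduced here, but the core idea of Assembly-of-Experts — a weighted, selective interpolation of matching weight tensors from several parent checkpoints — can be sketched in a few lines. Everything below (the function name, the mixing coefficients, the “experts” selection rule) is illustrative, not TNG’s actual configuration:

```python
import torch

def merge_state_dicts(parents, coeffs, select=lambda name: True):
    """Selectively interpolate weight tensors from several parent models.

    parents: list of state_dicts with identical keys and tensor shapes
    coeffs:  per-parent mixing coefficients (should sum to 1.0)
    select:  predicate choosing which tensors get blended; everything
             else is copied unchanged from the first parent. AoE-style
             merges are selective -- e.g., blending only the routed
             expert tensors rather than every parameter.
    """
    merged = {}
    for name, tensor in parents[0].items():
        if select(name):
            blended = sum(c * p[name].float() for c, p in zip(coeffs, parents))
            merged[name] = blended.to(tensor.dtype)
        else:
            merged[name] = tensor.clone()
    return merged

# Hypothetical three-parent ("Tri-Mind") blend; the coefficients are made up.
# sd_r1_0528, sd_r1, sd_v3 would be the state_dicts of the three parents.
# chimera = merge_state_dicts(
#     [sd_r1_0528, sd_r1, sd_v3],
#     coeffs=[0.5, 0.3, 0.2],
#     select=lambda n: "experts" in n,
# )
```

The design point worth noting is the `select` predicate: because a merge like this operates on existing weights rather than gradients, no additional training run is needed, which is what makes building a new model this way so much cheaper than pre-training one.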