Things are moving quickly in AI—and if you’re not keeping up, you’re falling behind.
Two recent developments are reshaping the landscape for developers and enterprises alike: DeepSeek’s R1 model release and OpenAI’s new Deep Research product. Together, they’re redefining the cost and accessibility of powerful reasoning models, a shift that has been widely reported. Less talked about, however, is how they’ll push companies to use techniques like distillation, supervised fine-tuning (SFT), reinforcement learning (RL), and retrieval-augmented generation (RAG) to build smarter, more specialized AI applications.
As the initial excitement around DeepSeek’s achievements settles, developers and enterprise decision-makers need to consider what it means for them. From pricing and performance to hallucination risks and the importance of clean data, here’s what these breakthroughs mean for anyone building AI today.
Cheaper, transparent, industry-leading reasoning models – but through distillation
The headline with DeepSeek-R1 is simple: It delivers an industry-leading reasoning model at a fraction of the cost of OpenAI’s o1. Specifically, it’s about 30 times cheaper to run, and unlike many closed models, DeepSeek offers full transparency around its reasoning steps. For developers, this means you can now build highly customized AI models without breaking the bank—whether through distillation, fine-tuning, or simple RAG implementations.
Distillation, in particular, is emerging as a powerful tool. By using DeepSeek-R1 as a “teacher model,” companies can create smaller, task-specific models that inherit R1’s superior reasoning capabilities. These smaller models, in fact, are the future for most enterprise companies. The full R1 reasoning model can be more than companies need — deliberating at length rather than taking the decisive action required for their specific domain applications. “One of the things that no one is really talking about, certainly in the mainstream media, is that actually the reasoning models are not working that well for things like agents,” said Sam Witteveen, an ML developer who works on AI agents, which are increasingly orchestrating enterprise applications.
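The teacher-student pattern described above can be sketched in miniature. The snippet below is a toy illustration using only the standard library: the `teacher` function stands in for an expensive reasoning model like R1, and the “student” is a tiny perceptron trained on the teacher’s labels. All names, data, and the training setup are illustrative assumptions, not a real R1 distillation pipeline.

```python
# Toy sketch of model distillation: a cheap student learns to mimic
# an expensive teacher by training on the teacher's outputs.
import random

random.seed(0)

def teacher(x):
    """Stand-in for a large, costly teacher model: labels a 2-D point."""
    return 1 if x[0] + x[1] > 1.0 else 0

# Step 1: use the teacher to label a pool of unlabeled task data.
unlabeled = [(random.random() * 2, random.random() * 2) for _ in range(500)]
distill_set = [(x, teacher(x)) for x in unlabeled]

# Step 2: train a much smaller student on the teacher's labels
# (supervised fine-tuning on distilled data, in miniature).
w, b, lr = [0.0, 0.0], 0.0, 0.1
for _ in range(20):
    for x, y in distill_set:
        pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
        err = y - pred
        w[0] += lr * err * x[0]
        w[1] += lr * err * x[1]
        b += lr * err

# Step 3: the student now approximates the teacher at a fraction of the cost.
agreement = sum(
    (1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0) == y
    for x, y in distill_set
) / len(distill_set)
print(f"student/teacher agreement: {agreement:.0%}")
```

In a real enterprise pipeline, the teacher’s labels would be reasoning traces generated by R1, and the student would be an open model such as a small Llama or Qwen variant fine-tuned with SFT — the structure, though, is the same: label with the big model, train the small one.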
As part of its release, DeepSeek distilled its own reasoning capabilities onto a number of smaller models, including open-source models from Meta’s Llama family and Alibaba’s Qwen family, as described in its paper. It’s these smaller models that can then be optimized for specific tasks. This trend towa …