DeepSeek’s first reasoning model R1-Lite-Preview turns heads, beating OpenAI o1 performance

Nov 20, 2024 | Technology


DeepSeek, an AI offshoot of Chinese quantitative hedge fund High-Flyer Capital Management focused on releasing high-performance open-source tech, has unveiled the R1-Lite-Preview, its latest reasoning-focused large language model, available for now exclusively through DeepSeek Chat, its web-based AI chatbot.

Known for its innovative contributions to the open-source AI ecosystem, DeepSeek’s new release aims to bring high-level reasoning capabilities to the public while maintaining its commitment to accessible and transparent AI.

And the R1-Lite-Preview, despite only being available through the chat application for now, is already turning heads by offering performance that nears, and in some cases exceeds, that of OpenAI’s vaunted o1-preview model.

Like that model, released in September 2024, DeepSeek-R1-Lite-Preview exhibits “chain-of-thought” reasoning: it shows the user the chains of “thought” it follows in responding to their queries and inputs, documenting the process by explaining what it is doing and why.

While some of these chains of thought may appear nonsensical or even erroneous to humans, DeepSeek-R1-Lite-Preview appears on the whole to be strikingly accurate, even answering “trick” questions that have tripped up other powerful but older AI models such as GPT-4o and Anthropic’s Claude family, including “how many letter Rs are in the word Strawberry?” and “which is larger, 9.11 or 9.9?” See screenshots below of my tests of these prompts on DeepSeek Chat:
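For reference, both of these trick questions have unambiguous answers that can be checked in a couple of lines of Python, which is what makes them handy sanity checks for a model’s reasoning:

```python
# "How many letter Rs are in the word Strawberry?"
# Counting the character directly gives the ground-truth answer.
word = "Strawberry"
r_count = word.lower().count("r")
print(r_count)  # 3

# "Which is larger, 9.11 or 9.9?"
# Compared as numbers (not version strings), 9.9 > 9.11.
larger = max(9.11, 9.9)
print(larger)  # 9.9
```

Models often stumble here because tokenization obscures individual letters, and because “9.11” looks larger than “9.9” when read as a version number rather than a decimal.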

A New Approach to AI Reasoning

DeepSeek-R1-Lite-Preview is designed to excel in tasks requiring logical inference, mathematical reasoning, and real-time problem-solving.

According to DeepSeek, the model exceeds OpenAI o1-preview-level performance on established benchmarks such as AIME (American Invitational Mathematics Examination) and MATH.

DeepSeek-R1-Lite-Preview benchmark results posted on X.

Its reasoning capabilities are enhanced by its transparent thought process, allowing users to follow along as the model tackles complex challenges step by step.

DeepSeek has also published scaling data, showcasing steady accuracy improvements when the model is given more time or “thought tokens” to solve problems. Performance graphs highlight its profici …
