January 20th, 2025

DeepSeek R1

DeepSeek-R1 is a new series of reasoning models utilizing large-scale reinforcement learning, featuring distilled models that outperform benchmarks. They are open-sourced, available for local use, and licensed under MIT.

Read original articleLink Icon
DeepSeek R1

DeepSeek-R1 is a new series of reasoning models developed by DeepSeek, including the first-generation model DeepSeek-R1-Zero, which was trained using large-scale reinforcement learning (RL) without prior supervised fine-tuning (SFT). This innovative approach has led to impressive reasoning capabilities, although it also faced challenges like repetition and readability. To enhance performance, DeepSeek-R1 was introduced, incorporating cold-start data before RL training. The models have shown competitive performance against OpenAI's models in various tasks, including math and coding. The research community benefits from the open-sourcing of DeepSeek-R1-Zero, DeepSeek-R1, and several distilled models derived from them, which demonstrate that smaller models can achieve high performance by leveraging the reasoning patterns of larger models. The evaluation results indicate that the distilled models outperform many existing benchmarks, establishing new state-of-the-art results. The models are available for download and can be run locally or accessed via an API. The project is licensed under the MIT License, allowing for commercial use and modifications.

- DeepSeek-R1 models utilize large-scale reinforcement learning for enhanced reasoning capabilities.

- The series includes distilled models that outperform existing benchmarks, demonstrating the effectiveness of smaller models.

- DeepSeek-R1 and its variants are open-sourced to support the research community.

- The models are available for local deployment and through an API platform.

- The project is licensed under MIT, permitting commercial use and modifications.

Link Icon 2 comments
By @chvid - about 1 month
Not much info here but this is big news - an actually published open source model that matches O1 from OpenAI - the model has been available behind an api for a few months.

Here is an article with someone playing with it:

https://www.datacamp.com/blog/deepseek-r1-lite-preview

By @deyiao - about 1 month
It’s been reported that DeepSeek R1’s coding capabilities exceed GPT-o1-low and nearly match GPT-o1-meduim, quite astonishing.