Open-R1: an open reproduction of DeepSeek-R1
Open-R1 is an initiative to replicate and enhance the DeepSeek-R1 reasoning model, focusing on transparent data collection and training while encouraging community collaboration for future research.
Open-R1 is an initiative aimed at reconstructing the DeepSeek-R1 model, which has drawn attention for its advanced reasoning capabilities. DeepSeek-R1, developed by DeepSeek, builds on the DeepSeek-V3 base model and uses pure reinforcement learning (RL) to develop reasoning without supervised fine-tuning. The model performs impressively on reasoning tasks, matching or surpassing OpenAI's o1 on several benchmarks. However, the release of DeepSeek-R1 left several questions unanswered, particularly around data collection, model training, and scaling laws. Open-R1 seeks to close these gaps by replicating DeepSeek-R1's data and training pipeline, validating its claims, and sharing insights with the open-source community. The project plans to distill high-quality reasoning datasets, replicate the RL training pipeline, and document the findings to aid future research. The initiative emphasizes collaboration and aims to extend reasoning applications beyond mathematics to fields such as medicine. Open-R1 encourages community involvement in advancing the development of reasoning models.
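The distillation step mentioned above amounts to collecting reasoning traces from a strong teacher model and packing them into supervised fine-tuning records. A minimal sketch of that packing, where the chat schema and the `<think>`/`<answer>` tags are illustrative assumptions rather than Open-R1's actual format:

```python
# Sketch of turning teacher reasoning traces into SFT records for
# distillation. Field names and tag format are illustrative
# assumptions, not Open-R1's actual schema.
def build_sft_example(problem: str, trace: str, answer: str) -> dict:
    """Pack a problem and a teacher's reasoning trace into a
    chat-style supervised fine-tuning record."""
    return {
        "messages": [
            {"role": "user", "content": problem},
            {
                "role": "assistant",
                "content": f"<think>{trace}</think>\n<answer>{answer}</answer>",
            },
        ]
    }
```

Records in this shape can be fed directly to most chat-template fine-tuning tooling, which is presumably why distilled reasoning datasets tend to converge on a messages-list layout.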
- Open-R1 aims to replicate and enhance the DeepSeek-R1 reasoning model.
- DeepSeek-R1 utilizes pure reinforcement learning for improved reasoning capabilities.
- The initiative seeks to fill gaps in data collection and model training transparency.
- Open-R1 plans to create high-quality datasets and document training processes.
- Community collaboration is encouraged to advance research in reasoning models.
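The pure-RL recipe summarized above relies on verifiable, rule-based rewards rather than a learned reward model. A minimal sketch of such a reward function, assuming a `<think>…</think><answer>…</answer>` output format; the tag names and the 0.5/0.5 weighting are assumptions for illustration:

```python
import re

# Rule-based reward in the spirit of DeepSeek-R1-Zero's RL stage:
# a format reward for wrapping reasoning in tags, plus an accuracy
# reward when the final answer matches a verifiable gold answer.
# Tag names and weights are illustrative assumptions.
FORMAT_RE = re.compile(
    r"^<think>.*?</think>\s*<answer>(.*?)</answer>$", re.DOTALL
)

def reasoning_reward(completion: str, gold_answer: str) -> float:
    """Return 0.0-1.0: 0.5 for well-formed output, +0.5 for a correct answer."""
    match = FORMAT_RE.match(completion.strip())
    if match is None:
        return 0.0  # malformed output earns nothing
    reward = 0.5  # format reward
    if match.group(1).strip() == gold_answer.strip():
        reward += 0.5  # accuracy reward
    return reward
```

Because the reward is a deterministic check against a verifiable answer, it sidesteps the reward-hacking risks of learned reward models, which is presumably part of why the pure-RL approach works on math-style tasks.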
Related
DeepSeek R1
DeepSeek-R1 is a new series of reasoning models utilizing large-scale reinforcement learning, featuring distilled models that outperform benchmarks. They are open-sourced, available for local use, and licensed under MIT.
DeepSeek-R1-Distill-Qwen-1.5B Surpasses GPT-4o in certain benchmarks
DeepSeek launched its first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1, utilizing large-scale reinforcement learning. The models are open-sourced, with DeepSeek-R1-Distill-Qwen-32B achieving state-of-the-art results.
Notes on the New Deepseek R1
DeepSeek launched the DeepSeek-R1 model, an open-source AI trained with pure reinforcement learning that is cheaper and faster than OpenAI's o1, showing strong performance but trailing slightly on complex reasoning tasks.
How DeepSeek-R1 Was Built, for Dummies
DeepSeek launched DeepSeek-R1, a reasoning model trained with pure reinforcement learning, achieving performance comparable to OpenAI's o1. It features a cost-effective API and highlights open-source potential in AI.
The Illustrated DeepSeek-R1
DeepSeek-R1 is a new language model emphasizing reasoning, utilizing a three-step training process and a unique architecture. It faces challenges in readability and language mixing while enhancing reasoning capabilities.
- Several users express interest in contributing to the project through crowdsourcing and collaboration.
- There are concerns about the transparency of the model's training data and code, questioning the appropriateness of the "open source" label.
- Some commenters draw parallels between the current AI landscape and the early days of the internet, highlighting security concerns.
- Users are eager to understand the timeline for reproducing the model and the resources required, such as access to GPUs.
- There is a general appreciation for the rapid advancements in open-source AI and the collaborative spirit it fosters.
Is that true about Meta Llama as well? Specifically, the code used to train the model is not open? (I know no one releases datasets). If so the label "open source" is inappropriate. "Open weights" would be more appropriate.
I didn't find much, starting with llama.cpp, which just reminds you to sandbox and isolate everything when running untrusted models.
I feel we are back in the Windows 95 / early Internet era when people would just run anything without caring about security.
This is exactly why it is not “US vs China”, the battle is between heavily-capitalized Silicon Valley companies versus open source.
Every believer in this tech owes DeepSeek some gratitude, but even they stand on shoulders of giants in the form of everyone else who pushed the frontier forward and chose to publish, rather than exploit, what they learned.
I do like the idea of making these reasoning techniques accessible to everyone. If they really manage to replicate the results of DeepSeek-R1, especially on a smaller budget, that’s a huge win for open-source AI.
I’m all for projects that push innovation and share the process with others, even if it’s messy.
But yeah—lots of hurdles. They might hit a wall because they don’t have DeepSeek’s original datasets.