China's DeepSeek just dropped a free challenger to OpenAI's o1
Chinese AI startup DeepSeek has launched the R1 reasoning model, claiming it rivals OpenAI's o1. Trained on 14.8 trillion tokens, it offers free access but faces censorship and privacy concerns.
Chinese AI startup DeepSeek has launched a new series of large language models (LLMs), including the R1 reasoning model, which it claims rivals OpenAI's o1 on several benchmarks. Founded in 2023 by Liang Wenfeng and backed by the hedge fund High-Flyer, DeepSeek has developed these models at a significantly lower cost than Western counterparts. The R1 model, which uses chain-of-thought (CoT) reasoning, was trained on 14.8 trillion tokens using 2,048 Nvidia H800 GPUs, at a cost of approximately $5.58 million. It has 671 billion parameters and is designed to give more logical and accurate responses by breaking each query down into a series of thoughts. Initial tests indicate that R1 performs well on tasks like counting and basic arithmetic, often surpassing the smaller distilled models, but it still struggles with consistency on complex reasoning tasks. The model is also subject to censorship and avoids sensitive topics related to the Chinese Communist Party. DeepSeek's models are available for free on platforms like Hugging Face, and users can run them locally or through a cloud API, although the company's data storage policies raise privacy concerns.
- DeepSeek claims its R1 model rivals OpenAI's o1 in reasoning capabilities.
- The model was trained at a fraction of the cost of Western AI models.
- R1 utilizes chain-of-thought reasoning for improved accuracy in responses.
- The model is available for free, but user data may be stored in China.
- Censorship limits the model's ability to discuss sensitive political topics.
Related
Notes on the New Deepseek R1
DeepSeek launched the DeepSeek-R1 model, an open-source AI using pure reinforcement learning, which is cheaper and faster than OpenAI's o1 and performs strongly, though slightly worse on complex reasoning tasks.
Tech Things: Inference Time Compute, Deepseek R1, and the Arrival of the Chinese
OpenAI is improving LLM reasoning with "inference time compute." DeepSeek's R1 model outperforms established models and is open-source, intensifying competition and challenging assumptions about Chinese AI capabilities.
Why everyone in AI is freaking out about DeepSeek
DeepSeek, a Chinese AI firm, launched the open-source DeepSeek-R1 model, outperforming OpenAI's o1 at lower costs, raising concerns about U.S.-China competition and potential market disruption in AI technology.
China's AI Earthquake: How DeepSeek's Surprise Model R1 Shook Silicon Valley
DeepSeek, a Chinese AI lab, developed its R1 model with minimal funding, outperforming competitors and raising concerns about censorship and a China-centric worldview in AI, prompting reassessment of U.S. dominance.
How a top Chinese AI model overcame US sanctions
DeepSeek, a Chinese AI startup, launched DeepSeek R1, an open-source model matching ChatGPT's performance, developed under US sanctions, emphasizing efficiency and collaboration, with smaller versions for local use.