January 28th, 2025

China's DeepSeek just dropped a free challenger to OpenAI's o1

Chinese AI startup DeepSeek has launched the R1 reasoning model, claiming it rivals OpenAI's o1. Built on a base model trained on 14.8 trillion tokens, it is free to use but faces censorship and privacy concerns.

Chinese AI startup DeepSeek has launched a new series of large language models (LLMs), including the R1 reasoning model, which it claims rivals OpenAI's o1 on various benchmarks. Founded in 2023 by Liang Wenfeng and backed by the High-Flyer hedge fund, DeepSeek developed these models at a significantly lower cost than Western counterparts: the V3 base model underlying R1 was trained on 14.8 trillion tokens using 2,048 Nvidia H800 GPUs, for approximately $5.58 million in compute. R1 has 671 billion parameters and uses chain-of-thought (CoT) reasoning, breaking a query into a series of intermediate steps to produce more logical and accurate responses.

Initial tests indicate that R1 performs well on tasks like counting and basic arithmetic, often surpassing smaller distilled models, though it still struggles with consistency on complex reasoning tasks. The model is also subject to censorship, avoiding topics sensitive to the Chinese Communist Party. DeepSeek's models are available for free on platforms like Hugging Face, and users can run them locally or through a cloud API, although the company's data-storage policies raise privacy concerns.
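For readers curious about the cloud-API route mentioned above, here is a minimal sketch of what a request to DeepSeek's hosted service might look like. It assumes an OpenAI-compatible chat-completions endpoint and the model identifier `deepseek-reasoner` for R1; both are assumptions based on DeepSeek's public documentation at the time of writing, so check the current docs before relying on them.

```python
import json

# Assumed endpoint for DeepSeek's OpenAI-compatible API (verify against
# current DeepSeek documentation before use).
API_URL = "https://api.deepseek.com/chat/completions"

def build_r1_request(prompt: str) -> dict:
    """Build a chat-completions payload addressed to the R1 model.

    "deepseek-reasoner" is the assumed model identifier for R1.
    """
    return {
        "model": "deepseek-reasoner",
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_r1_request("How many 'r's are in 'strawberry'?")
print(json.dumps(payload, indent=2))
# To send it, POST this JSON to API_URL with an
# "Authorization: Bearer <your API key>" header.
```

Because the API follows the OpenAI wire format, existing OpenAI client libraries can typically be pointed at it by overriding the base URL, which is why switching between providers requires little code change.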

- DeepSeek's R1 model claims to rival OpenAI's o1 in reasoning capabilities.

- The model was trained at a fraction of the cost of Western AI models.

- R1 utilizes chain-of-thought reasoning for improved accuracy in responses.

- The model is available for free, but user data may be stored in China.

- Censorship limits the model's ability to discuss sensitive political topics.
