Run DeepSeek R1 Dynamic 1.58-bit
DeepSeek-R1 is an open-source alternative to OpenAI's O1, reduced from 720GB to 131GB via quantization. It runs on a variety of systems, and benchmarks indicate it produces valid outputs with only minor errors.
DeepSeek-R1 has emerged as a competitive open-source alternative to OpenAI's O1 reasoning model, achieving a significant size reduction through quantization. The model, originally 720GB, has been compressed to 131GB while remaining functional. This was accomplished by selectively quantizing certain layers at higher bit rates while reducing the rest to very low precision, preserving output quality. The 1.58-bit version can run with 160GB of VRAM for fast inference, or with as little as 20GB of RAM on a CPU, albeit at slower speeds. Several dynamic quantized versions have been released, and benchmarks indicate that the 1.58-bit model produces valid outputs, although some incorrect tokens may occur.

DeepSeek R1 uses a mixture-of-experts (MoE) architecture, which increases the parameter count without a corresponding increase in per-token computational cost. The model's performance was evaluated through a Flappy Bird game-generation task, on which it scored highly across several criteria. The dynamic quantization code is available on GitHub, and the model can run on a range of systems, including machines without GPUs. The blog post provides detailed instructions for downloading and running the model, emphasizing the importance of proper hardware configuration for optimal performance.
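The MoE idea described above can be sketched in a few lines: a router scores the experts for each token, and only the top-k experts actually run, so total parameters grow with the expert count while per-token compute stays roughly flat. The expert functions, scores, and averaging below are toy placeholders, not DeepSeek R1's actual router (which uses learned gating weights).

```python
# Minimal sketch of mixture-of-experts (MoE) routing. Only the top-k experts
# by router score run for a given token; the rest do no work. Toy example,
# not DeepSeek R1's real gating logic.
def top_k_experts(router_scores, k=2):
    """Return indices of the k highest-scoring experts for one token."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    return ranked[:k]

def moe_layer(x, experts, router_scores, k=2):
    """Run only the selected experts and combine their outputs."""
    chosen = top_k_experts(router_scores, k)
    outputs = [experts[i](x) for i in chosen]   # unselected experts cost nothing
    return sum(outputs) / len(outputs), chosen

# Four toy "experts" (each just scales its input by a different factor).
experts = [lambda x, s=s: x * s for s in (1.0, 2.0, 3.0, 4.0)]
out, chosen = moe_layer(10.0, experts, router_scores=[0.1, 0.7, 0.05, 0.9])
print(chosen, out)  # experts 3 and 1 win the routing; only they execute
```

A real MoE layer weights each chosen expert's output by its (softmax-normalized) router score rather than a plain average, but the compute-saving structure is the same.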
- DeepSeek-R1 is an open-source model rivaling OpenAI's O1.
- The model size was reduced from 720GB to 131GB through selective quantization.
- The 1.58-bit version can run on 160GB VRAM or 20GB RAM, with varying performance.
- Performance benchmarks show the model generates valid outputs, with some minor errors.
- Dynamic quantization code is available on GitHub for user implementation.
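The low-bit quantization behind these numbers can be illustrated with a toy sketch: most weights collapse to ternary values {-1, 0, +1} (log2(3) ≈ 1.58 bits of information per weight) plus one floating-point scale per tensor, while sensitive layers are kept at higher precision. The function names and the threshold heuristic below are illustrative assumptions, not DeepSeek's or Unsloth's actual quantization code.

```python
# Hedged sketch of 1.58-bit (ternary) weight quantization: each weight maps to
# {-1, 0, +1} with a single per-tensor scale. Illustrative only.
import math

def ternary_quantize(weights, threshold_ratio=0.7):
    """Map each weight to {-1, 0, +1}; return codes plus one float scale."""
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    threshold = threshold_ratio * mean_abs        # small weights snap to zero
    codes = [0 if abs(w) < threshold else (1 if w > 0 else -1) for w in weights]
    nonzero = [abs(w) for w, c in zip(weights, codes) if c != 0]
    scale = sum(nonzero) / len(nonzero) if nonzero else 0.0
    return codes, scale

def dequantize(codes, scale):
    """Reconstruct approximate weights from codes and the shared scale."""
    return [c * scale for c in codes]

weights = [0.9, -1.1, 0.05, 0.0, -0.4, 1.2]
codes, scale = ternary_quantize(weights)
print(codes, round(scale, 3))
print(round(math.log2(3), 3))  # ≈ 1.585 bits per ternary weight
```

"Dynamic" quantization in the blog post's sense means applying aggressive low-bit schemes like this only to layers that tolerate it, while keeping the more sensitive layers (e.g. attention or router weights) at higher bit widths.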
Related
Official DeepSeek R1 Now on Ollama
DeepSeek has launched its first generation of reasoning models, matching OpenAI's performance across tasks. Available in sizes from 1.5B to 70B parameters, they are MIT licensed for free use.
DeepSeek-R1 and Exploring DeepSeek-R1-Distill-Llama-8B
DeepSeek, a Chinese AI lab, has launched its R1 model and derived models for tasks like math and coding, open-sourced under MIT, with some licensing concerns and known limitations.
Notes on the New Deepseek R1
DeepSeek launched the DeepSeek-R1 model, an open-source AI trained with pure reinforcement learning. It is cheaper and faster than OpenAI's o1, with strong performance overall but slightly weaker results on complex reasoning tasks.
DeepSeek R1 Runs at 200 Tokens per Second on Raspberry Pi
The open-source DeepSeek R1 model runs at 200 tokens per second on a Raspberry Pi, outperforming some leading models and raising concerns among major AI companies; it is available for local applications.
DeepSeek Outpaced OpenAI at 3% of the Cost
DeepSeek R1 offers performance similar to OpenAI's models at 3%-5% of the cost, using reinforcement learning. Its success may shift enterprises away from reliance on proprietary AI, while raising concerns about ethical bias.
Instructions for llama.cpp: https://huggingface.co/unsloth/DeepSeek-R1-GGUF#instructions...