DeepSeek-R1

DeepSeek-R1 is a text generation model developed by DeepSeek, a Chinese AI company. It is a reasoning-focused model that generates a Chain of Thought (CoT) before producing a final answer, a technique designed to improve accuracy on multi-step problems. The model was trained through late 2024 and supports a context window of 64,000 tokens. DeepSeek released the model weights publicly, making it available for local deployment and research use. DeepSeek-R1 is well suited for tasks that benefit from structured reasoning, such as mathematics, logic puzzles, coding challenges, and scientific problem-solving. Because the model externalizes its reasoning steps before answering, users can inspect the thought process that led to a given response. DeepSeek also released a series of distilled versions of R1 based on smaller base models, broadening its accessibility across different hardware configurations.

Released Jan 22, 2025; 64,000-token context; 8,000-token maximum output

Chain-of-Thought Reasoning, Math & Logic, Code Generation, Long-Context Processing, Open Weights Access

Model Overview

High-signal model metadata in a structured two-column overview table.

Provider

The entity that provides this model.

DeepSeek

Input Context Window

The number of tokens supported by the input context window.

64,000 tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

8,000 tokens

Open Source

Whether the model's code is available for public use.

Yes (model weights are publicly released)

Release Date

When the model was first released.

Jan 22, 2025

Knowledge Cut-off Date

When the model's knowledge was last updated.

2024

API Providers

The providers that offer this model. This is not an exhaustive list.

DeepSeek API, OpenAI API, Anthropic API, Hugging Face

Modalities

Types of data this model can process.

Text

Capabilities

What DeepSeek-R1 supports

Chain-of-Thought Reasoning

Generates an explicit reasoning trace before producing a final answer, allowing multi-step problems to be broken down systematically. This CoT process is visible in the model's output.
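
When the open weights are run locally, the reasoning trace is typically wrapped in `<think>...</think>` tags ahead of the final answer. The sketch below separates the two parts. The function name `split_reasoning` and the sample string are illustrative, and a hosted API may return the trace in a separate response field instead.

```python
def split_reasoning(text: str) -> tuple[str, str]:
    """Split raw model output into (reasoning, answer).

    Assumes the reasoning trace ends with a closing </think> tag; the
    opening <think> tag is optional because some chat templates place it
    in the prompt rather than in the generated text.
    """
    reasoning, sep, answer = text.partition("</think>")
    if not sep:
        # No closing tag found: treat the whole output as the answer.
        return "", text.strip()
    reasoning = reasoning.strip()
    if reasoning.startswith("<think>"):
        reasoning = reasoning[len("<think>"):].strip()
    return reasoning, answer.strip()


sample = "<think>\n1 + 1 is 2.\n</think>\n\nThe answer is 2."
reasoning, answer = split_reasoning(sample)
print(reasoning)  # 1 + 1 is 2.
print(answer)     # The answer is 2.
```

Keeping the trace separate makes it easy to log the reasoning for inspection while showing only the final answer to end users.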

Math & Logic

Applies step-by-step reasoning to solve mathematical and logical problems, including proofs, equations, and structured inference tasks.

Code Generation

Produces and debugs code across common programming languages, using its reasoning process to work through algorithmic problems before outputting a solution.

Long-Context Processing

Handles input and output sequences within a 64,000-token context window, supporting analysis of lengthy documents or extended multi-turn conversations.

Open Weights Access

Model weights are publicly released by DeepSeek, enabling local deployment and fine-tuning without relying solely on the hosted API.

Pricing for DeepSeek-R1

Primary API pricing for DeepSeek-R1.

Input tokens $0.55 Per million tokens

Output tokens N/A Per million tokens
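
As a rough illustration of the arithmetic, the sketch below estimates input cost from the listed input price. Only input tokens are covered because the output price is listed as N/A; the function name is illustrative, and the price is assumed to be in US dollars.

```python
INPUT_PRICE_PER_MILLION_USD = 0.55  # input price listed above (USD assumed)


def input_cost_usd(input_tokens: int) -> float:
    """Return the estimated input-token cost in US dollars."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION_USD


# A prompt that fills the whole 64,000-token context window:
print(f"${input_cost_usd(64_000):.4f}")  # $0.0352
```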

Price Comparison

Additional usage-cost dimensions synced into the project for this model.

Maximum temperature: 2

Maximum response size: 8,000 tokens

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

DeepSeek API, OpenAI API, Anthropic API, Hugging Face

Model Performance

Benchmark scores synced from the current model source and normalized into the local catalog.

Benchmark	Description	Score
AIME 2024	American math olympiad problems	89.3%
GPQA Diamond	PhD-level science questions (biology, physics, chemistry)	81.3%
HLE	Questions that challenge frontier models across many domains	14.9%
LiveCodeBench	Real-world coding tasks from recent competitions	77.0%
MATH-500	Undergraduate and competition-level math problems	98.3%
MMLU-Pro	Expert knowledge across 14 academic disciplines	84.9%
SciCode	Scientific research coding and numerical methods	40.3%

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

Official Website (Other)

Research Paper on arXiv (Research)

Model on Hugging Face (Open Source)

GitHub Repository (Open Source)

DeepSeek API Docs (Documentation)

AI tools related to DeepSeek-R1

These tools are strongly connected to DeepSeek-R1 through direct product references, provider mentions, or explicit model mappings.

AI Assistant

DeepSeek

DeepSeek is an AI research company established in 2023 that specializes in developing advanced general artificial intelligence foundation models. The company has released and open-sourced several large-scale models, such as DeepSeek-LLM, DeepSeek-Coder, and DeepSeek-MoE. Additionally, DeepSeek offers API access to these models, enabling developers to integrate their AI capabilities into various applications.

Free 411 visits 44 saves

AI Chatbot

DeepSeek R1 Online

DeepSeek R1 Online provides direct access to the DeepSeek R1 AI model, an open-source solution built for advanced reasoning. The platform offers free, no-login access to the model, which is engineered for complex problem-solving, multilingual tasks, and production-grade code generation. By leveraging a Mixture of Experts (MoE) architecture and advanced reinforcement learning, the model delivers high performance across mathematics, coding, and general reasoning. The platform also hosts distilled versions of the model for various specialized use cases.

Free 36 visits 2 saves

AI Chatbot

GlobalGPT

GlobalGPT is an all-in-one AI platform providing access to a diverse suite of models, including GPT-4o, GPT-4.5, Claude 3.7, Midjourney, and Runway. Through a single subscription, users can perform writing, research, image and video generation, and task automation.

Free 856 visits 4 saves

AI Assistant

Private LLM

Private LLM is a local AI chatbot for iOS and macOS that operates entirely offline, ensuring your data remains secure and private on your device. By functioning without an internet connection, it guarantees that your information never leaves your hardware. The app is available as a one-time purchase with no subscription fees, providing access across all your Apple devices. It offers user-friendly features for text generation, language assistance, and more.

Free 0 visits 4 saves

Related Daily Briefs

Recent daily stories tied to DeepSeek-R1 through direct model mentions or provider-level coverage.

Frontier Models

Samsung Deploys ChatGPT Enterprise as Small Models Outperform Frontier LLMs and MiniMax M3 Challenges DeepSeek

MiniMax and OpenAI are raising the stakes for enterprise adoption.

2026-06-21 AI Models AI API

Community discussion

What people think about DeepSeek-R1

DeepSeek-R1 discussions are most active in r/LocalLLaMA, r/selfhosted, r/singularity.

Top Reddit threads cluster around benchmark and model-comparison threads, safety and censorship questions, and coding workflow discussions. The strongest match in this snapshot has 2,136 upvotes and 684 comments.

r/selfhosted 2,136 upvotes 684 comments January 28, 2025

Yes, you can run DeepSeek-R1 locally on your device (20GB RAM min.)

I've recently seen some misconceptions that you can't run DeepSeek-R1 locally on your own device. Last weekend, we were busy trying to make you guys have the ability to run the actual R1 (non-distilled) model with just an RTX 4090 (24GB VRAM) which gives at least 2-3 tokens/second.

Over the weekend, we at Unsloth (currently a team of just 2 brothers) studied R1's architecture, then selectively quantized layers to 1.58-bit, 2-bit etc. which vastly outperforms basic versions with minimal compute.

1. We shrank R1, the 671B parameter model, from 720GB to just 131GB (an 80% size reduction) whilst keeping it fully functional and great
2. No, the dynamic GGUFs do not work directly with Ollama, but they do work on llama.cpp, which supports sharded GGUFs and disk mmap offloading. For Ollama, you will need to merge the GGUFs manually using llama.cpp.
3. Minimum requirements: a CPU with 20GB of RAM (but it will be very slow) and 140GB of disk space (to download the model weights)
4. Optimal requirements: sum of your VRAM+RAM = 80GB+ (this will be somewhat OK)
5. No, you do not need hundreds of GB of RAM+VRAM, but if you have it, you can get **140 tokens per second** for throughput & 14 tokens/s for single-user inference with 2xH100
6. Our open-source GitHub repo: [github.com/unslothai/unsloth](http://github.com/unslothai/unsloth)
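
The numbers in the list above can be checked with a few lines of arithmetic. The sketch below computes the exact size reduction (about 81.8%, which the post rounds to 80%) and classifies a machine against the stated thresholds. The function name `hardware_tier` and the tier labels are illustrative and do not come from the post.

```python
ORIGINAL_SIZE_GB = 720   # full R1 size quoted in the post
QUANTIZED_SIZE_GB = 131  # 1.58-bit dynamic GGUF size quoted in the post

reduction = 1 - QUANTIZED_SIZE_GB / ORIGINAL_SIZE_GB
print(f"Size reduction: {reduction:.1%}")  # Size reduction: 81.8%


def hardware_tier(ram_gb: float, vram_gb: float, disk_gb: float) -> str:
    """Classify a machine against the thresholds listed in the post."""
    # Minimum: 20GB of RAM and 140GB of disk space for the weights.
    if disk_gb < 140 or ram_gb < 20:
        return "below minimum"
    # Optimal: combined VRAM + RAM of 80GB or more.
    if ram_gb + vram_gb >= 80:
        return "optimal"
    return "minimum (expect very slow generation)"


print(hardware_tier(ram_gb=64, vram_gb=24, disk_gb=500))  # optimal
print(hardware_tier(ram_gb=32, vram_gb=0, disk_gb=200))   # minimum (expect very slow generation)
```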

Many people have tried running the dynamic GGUFs on their potato devices and it works very well (including mine).

R1 GGUFs uploaded to Hugging Face: [huggingface.co/unsloth/DeepSeek-R1-GGUF](http://huggingface.co/unsloth/DeepSeek-R1-GGUF)

To run your own R1 locally we have instructions + details: [unsloth.ai/blog/deepseekr1-dynamic](http://unsloth.ai/blog/deepseekr1-dynamic)

r/developersIndia 1,753 upvotes 145 comments February 18, 2026

Sarvam AI unveils 30B and 105B models, says 105B outperforms DeepSeek R1 and Gemini Flash on key benchmarks

Source: Moneycontrol \[[Article Link](https://www.moneycontrol.com/news/business/startup/sarvam-ai-launches-30b-and-105b-models-says-105b-outperforms-deepseek-r1-and-gemini-flash-on-key-benchmarks-13834399.html)\]

>Bengaluru-based AI startup just announced the launch of two new large language models, a 30-billion-parameter model and a 105-billion-parameter model, both trained from scratch.

>“At 105 billion parameters, on most benchmarks this model beats DeepSeek R1 released a year ago, which was a 600-billion-parameter model.”

>“It is cheaper than something like a Gemini Flash, but outperforms it in many benchmarks,” Kumar said.

>On Indian language benchmarks, Kumar said the model delivers stronger performance than several larger competitors.

>“Even with something like Gemini 2.5 Flash, which is a bigger and more expensive model, we find that the Indian language performance of this model is even better.”

Sarvam was earlier announced as the first startup selected to build India’s foundational AI model under the mission.

r/LocalLLaMA 1,691 upvotes 598 comments January 27, 2025

1.58bit DeepSeek R1 - 131GB Dynamic GGUF

Hey r/LocalLLaMA! I managed to **dynamically quantize** the full DeepSeek R1 671B MoE to 1.58bits in GGUF format. The trick is **not to quantize all layers**, but quantize only the MoE layers to 1.5bit, and leave attention and other layers in 4 or 6bit.

|MoE Bits|Type|Disk Size|Accuracy|HF Link|
|:-|:-|:-|:-|:-|
|1.58bit|IQ1\_S|**131GB**|Fair|[Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ1_S)|
|1.73bit|IQ1\_M|**158GB**|Good|[Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ1_M)|
|2.22bit|IQ2\_XXS|**183GB**|Better|[Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ2_XXS)|
|2.51bit|Q2\_K\_XL|**212GB**|Best|[Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-Q2_K_XL)|

You can get **140 tokens / s** for throughput and 14 tokens /s for single user inference on 2x H100 80GB GPUs with all layers offloaded. A 24GB GPU like RTX 4090 should be able to get at least 1 to 3 tokens / s.

If we naively quantize all layers to 1.5bit (-1, 0, 1), the model will fail dramatically, since it'll produce **gibberish** and **infinite repetitions**. I selectively leave all attention layers in 4/6bit, and leave the first 3 transformer dense layers in 4/6bit. The MoE layers take up 88% of all space, so we can leave them in 1.5bit. We get in total a weighted sum of 1.58bits!
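
As a rough sanity check on the quoted figure, the average number of stored bits per parameter can be estimated from the file size alone. The sketch below assumes 1 GB = 10^9 bytes and the 671B parameter count from the post, and it ignores file metadata, so the results are approximate. For the 131GB file it gives about 1.56 bits per weight, close to the quoted 1.58. The other rows differ more from the table's "MoE Bits" column, which describes the MoE layers rather than the whole-file average.

```python
PARAMS = 671e9  # parameter count quoted in the post


def avg_bits_per_weight(file_size_gb: float, params: float = PARAMS) -> float:
    """Average stored bits per parameter, assuming 1 GB = 1e9 bytes."""
    return file_size_gb * 1e9 * 8 / params


# File sizes taken from the table in the post.
for name, size_gb in [("IQ1_S", 131), ("IQ1_M", 158), ("IQ2_XXS", 183), ("Q2_K_XL", 212)]:
    print(f"{name}: {avg_bits_per_weight(size_gb):.2f} bits/weight")
```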

I asked the 1.58bit model to create Flappy Bird with 10 conditions (like random colors, a best score etc.), and it did pretty well! A generic, non-dynamically quantized model will fail miserably - there will be no output at all!

[Flappy Bird game made by 1.58bit R1](https://i.redd.it/k8nfun2ezjfe1.gif)

There's more details in the blog here: [https://unsloth.ai/blog/deepseekr1-dynamic](https://unsloth.ai/blog/deepseekr1-dynamic) The link to the 1.58bit GGUF is here: [https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ1\_S](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ1_S) You should be able to run it in your favorite inference tool if it supports i matrix quants. No need to re-update llama.cpp.

A reminder on DeepSeek's chat template (for distilled versions as well) - it auto adds a BOS - do not add it manually!

`<｜begin▁of▁sentence｜><｜User｜>What is 1+1?<｜Assistant｜>It's 2.<｜end▁of▁sentence｜><｜User｜>Explain more!<｜Assistant｜>`
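
A minimal sketch of building a prompt in this format is shown below. The function name `build_prompt` and the message-dictionary layout are illustrative. BOS is left off by default because, per the post, it is added automatically.

```python
BOS = "<｜begin▁of▁sentence｜>"
EOS = "<｜end▁of▁sentence｜>"
USER = "<｜User｜>"
ASSISTANT = "<｜Assistant｜>"


def build_prompt(messages: list[dict[str, str]], add_bos: bool = False) -> str:
    """Render alternating user/assistant turns in the format shown above.

    add_bos defaults to False because, per the post, the BOS token is
    added automatically and should not be added by hand.
    """
    parts = [BOS] if add_bos else []
    for message in messages:
        if message["role"] == "user":
            parts.append(USER + message["content"])
        elif message["role"] == "assistant":
            parts.append(ASSISTANT + message["content"] + EOS)
        else:
            raise ValueError(f"unsupported role: {message['role']}")
    # End with the assistant tag so the model generates the next reply.
    parts.append(ASSISTANT)
    return "".join(parts)


messages = [
    {"role": "user", "content": "What is 1+1?"},
    {"role": "assistant", "content": "It's 2."},
    {"role": "user", "content": "Explain more!"},
]
print(build_prompt(messages))
```

Calling `build_prompt(messages, add_bos=True)` reproduces the full string from the post, which is useful only when the inference tool does not prepend BOS itself.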

To know how many layers to offload to the GPU, I approximately calculated it as below:

|Quant|File Size|24GB GPU|80GB GPU|2x80GB GPU|
|:-|:-|:-|:-|:-|
|1.58bit|131GB|7|33|All layers 61|
|1.73bit|158GB|5|26|57|
|2.22bit|183GB|4|22|49|
|2.51bit|212GB|2|19|32|
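
The post does not spell out the calculation. One rule of thumb that reproduces most of the table is to scale the layer count by the ratio of VRAM to file size, round down, and hold back 4 layers as headroom. This formula is an assumption here, not something stated in the post. It matches every cell except 2.51bit on 2x80GB, where it yields 42 rather than the listed 32.

```python
N_LAYERS = 61  # total layers, per the "All layers 61" table entry


def layers_to_offload(vram_gb: int, file_size_gb: int,
                      n_layers: int = N_LAYERS, reserve: int = 4) -> int:
    """Estimate how many layers fit on the GPU.

    Assumed rule of thumb (not stated in the post): scale the layer count
    by the VRAM-to-file-size ratio, round down, then hold back `reserve`
    layers as headroom. The result is clamped to [0, n_layers].
    """
    estimate = (vram_gb * n_layers) // file_size_gb - reserve
    return max(0, min(n_layers, estimate))


# File sizes and VRAM configurations taken from the table in the post.
for size_gb in (131, 158, 183, 212):
    print(size_gb, [layers_to_offload(vram, size_gb) for vram in (24, 80, 160)])
```

Integer arithmetic is used so that exact ratios, such as 24GB against the 183GB file, do not suffer floating-point rounding.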

All other GGUFs for R1 are here: [https://huggingface.co/unsloth/DeepSeek-R1-GGUF](https://huggingface.co/unsloth/DeepSeek-R1-GGUF) There's also GGUFs and dynamic 4bit bitsandbytes quants and others for all other distilled versions (Qwen, Llama etc) at [https://huggingface.co/collections/unsloth/deepseek-r1-all-versions-678e1c48f5d2fce87892ace5](https://huggingface.co/collections/unsloth/deepseek-r1-all-versions-678e1c48f5d2fce87892ace5)

r/LocalLLaMA 1,559 upvotes 487 comments February 18, 2025

PerplexityAI releases R1-1776, a DeepSeek-R1 finetune that removes Chinese censorship while maintaining reasoning capabilities

r/LocalLLaMA 1,540 upvotes 508 comments February 2, 2025

DeepSeek-R1 fails every safety test. It exhibits a 100% attack success rate, meaning it failed to block a single harmful prompt.

We knew R1 was good, but not that good. All the cries of CCP censorship are meaningless when it's trivial to bypass its guard rails.

FAQ

Common questions about DeepSeek-R1

What is the context window for DeepSeek-R1?

DeepSeek-R1 supports a context window of 64,000 tokens, which covers both input and output combined.
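
Because input and output share the same window, the room left for a response shrinks as the prompt grows. The sketch below shows one way to budget this, using the 64,000-token window and 8,000-token output limit listed on this page. The function name `max_new_tokens` is illustrative, and token counts are assumed to come from the model's own tokenizer.

```python
CONTEXT_WINDOW = 64_000  # combined input + output limit listed on this page
MAX_OUTPUT = 8_000       # maximum output tokens listed on this page


def max_new_tokens(prompt_tokens: int, requested: int = MAX_OUTPUT) -> int:
    """Return how many output tokens can be requested for a given prompt."""
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    return max(0, min(requested, MAX_OUTPUT, remaining))


print(max_new_tokens(10_000))  # 8000
print(max_new_tokens(60_000))  # 4000
print(max_new_tokens(70_000))  # 0
```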

What makes DeepSeek-R1 different from a standard text generation model?

DeepSeek-R1 generates a Chain of Thought (CoT) before delivering its final answer. This means the model works through reasoning steps explicitly, which is intended to improve accuracy on complex or multi-step tasks.

What is the training data cutoff for DeepSeek-R1?

Based on the available metadata, DeepSeek-R1 was trained through late 2024. It does not have knowledge of events after that period.

Is DeepSeek-R1 available as open weights?

Yes. DeepSeek released the model weights for DeepSeek-R1 publicly on Hugging Face, allowing users to run the model locally or fine-tune it independently of the hosted API.

What types of tasks is DeepSeek-R1 best suited for?

DeepSeek-R1 is designed for tasks that benefit from structured reasoning, including mathematics, logic, coding, and scientific problem-solving. Its CoT approach makes it particularly useful when intermediate reasoning steps matter.
