DeepSeek V4 Flash

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

Released Apr 24, 2026; 1.0M-token context; up to 384,000 output tokens
Capabilities: Text, Tools, Structured Output, Reasoning

Model Overview

High-signal model metadata in a structured two-column overview table.

Provider

The entity that provides this model.

DeepSeek

Model ID

The routed model identifier exposed by upstream providers.

deepseek/deepseek-v4-flash:free

Input Context Window

The number of tokens supported by the input context window.

1.0M tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

384,000 tokens

Open Source

Whether the model's code is available for public use.

Yes

Release Date

When the model was first released.

Apr 24, 2026 (27 days ago)

Knowledge Cut-off Date

When the model's knowledge was last updated.

Unknown

API Providers

The providers that offer this model. This is not an exhaustive list.

Crucible

Modalities

Types of data this model can process.

Text

What is DeepSeek V4 Flash

A fuller summary of positioning, capabilities, and source-specific details for DeepSeek V4 Flash.

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

Capabilities

What DeepSeek V4 Flash supports

Reasoning Controls

OpenRouter lists DeepSeek V4 Flash with reasoning support and explicit reasoning-related request parameters.

Structured Outputs

Structured output settings are exposed through OpenRouter for schema-driven or format-controlled responses.

Tool Calling

Tool invocation and tool selection are supported in the routed OpenRouter interface for this model.

Multimodal I/O

This model accepts text input and returns text output.

Large Context Window

OpenRouter currently lists a context window of 1.0M tokens with up to 384,000 maximum output tokens.

Pricing for DeepSeek V4 Flash

Primary API pricing shown in the same “quick compare” spirit as the reference page.

Price Comparison

Additional usage-cost dimensions synced into the project for this model.

Cache read: $0.02
Max temperature: 1
Max response size: 384,000 tokens

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

Crucible

Provider Endpoints

Endpoint-level provider data currently available for this model.

Crucible

Max output: 384,000 tokens
1d uptime: 99.8%
Supported params: 4
Implicit caching: No

Configuration & Parameters

The configurable options currently documented for this model.

Reasoning Effort

Select

Non-think for fast responses, High for complex problem-solving, Max to push reasoning to its fullest extent.

Default: high
Options: Non-think, High, Max

Top P

Number

Nucleus sampling. Samples only from the smallest set of most likely tokens whose cumulative probability reaches this threshold.

Default: 0.95 Range: 0 - 1 (step 0.01)
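
As an illustration of what this setting does, here is a minimal sketch of nucleus filtering over a toy probability table. The function name `top_p_filter` and the dictionary representation are assumptions made for this example; the page does not describe how the provider implements sampling.

```python
def top_p_filter(probs, top_p=0.95):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalize.

    Assumes `probs` is a non-empty dict of token -> probability.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break  # the nucleus is complete
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


toy = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
nucleus = top_p_filter(toy, top_p=0.9)  # keeps "a", "b", "c"
```

Lower values shrink the candidate set and make output more conservative; a value of 1 leaves the distribution untouched.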

Top K

Number

Limits sampling to the K most likely tokens at each step. Set to 0 to disable.

Default: 20 Range: 0 - 100
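
A minimal sketch of the same idea for Top K, again over a toy probability table. `top_k_filter` is a hypothetical helper written for this example, not a provider API; it follows the description above, including the convention that 0 disables the filter.

```python
def top_k_filter(probs, top_k=20):
    """Keep only the top_k most likely tokens and renormalize.

    A top_k of 0 disables the filter, matching the setting described above.
    Assumes `probs` is a non-empty dict of token -> probability.
    """
    if top_k <= 0:
        return dict(probs)
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    total = sum(p for _, p in ranked)
    return {token: p / total for token, p in ranked}


toy = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
top_two = top_k_filter(toy, top_k=2)  # keeps "a" and "b"
```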

Min P

Number

Minimum probability threshold relative to the most likely token.

Range: 0 - 1 (step 0.01)
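
One common reading of this setting, sketched below under that assumption: a token survives only if its probability is at least `min_p` times the probability of the most likely token. `min_p_filter` is a hypothetical name used for illustration.

```python
def min_p_filter(probs, min_p=0.0):
    """Drop tokens whose probability falls below min_p * max probability,
    then renormalize. Assumes `probs` is a non-empty dict of
    token -> probability.
    """
    threshold = min_p * max(probs.values())
    kept = {token: p for token, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


toy = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
filtered = min_p_filter(toy, min_p=0.2)  # threshold 0.1, so "d" is dropped
```

Because the cutoff scales with the top token, the filter is strict when the model is confident and permissive when the distribution is flat.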

Presence Penalty

Number

Penalizes tokens that have already appeared in the output, encouraging new topics.

Frequency Penalty

Number

Penalizes tokens based on how often they have already appeared.
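
The difference between the presence and frequency penalties is easiest to see on logits. The sketch below uses a widely described additive formulation (a flat subtraction for any token already seen, plus a subtraction proportional to its count); whether this provider applies exactly this formula is an assumption, and `apply_penalties` is a hypothetical helper.

```python
from collections import Counter


def apply_penalties(logits, generated, presence_penalty=0.0, frequency_penalty=0.0):
    """Lower the logits of tokens that already appear in `generated`.

    presence_penalty is subtracted once for any token seen at least once;
    frequency_penalty is subtracted once per prior occurrence.
    """
    counts = Counter(generated)
    adjusted = dict(logits)
    for token, count in counts.items():
        if token in adjusted:
            adjusted[token] -= presence_penalty + frequency_penalty * count
    return adjusted


logits = {"a": 2.0, "b": 1.0, "c": 0.0}
adjusted = apply_penalties(
    logits, ["a", "a", "b"], presence_penalty=0.5, frequency_penalty=0.25
)
# "a" appeared twice: 2.0 - (0.5 + 0.25 * 2) = 1.0
# "b" appeared once:  1.0 - (0.5 + 0.25 * 1) = 0.25
# "c" never appeared and is unchanged
```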

Repetition Penalty

Number

Penalizes repeated tokens. Values above 1 discourage repetition.

Default: 1 Range: 0 - 2 (step 0.01)
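
Unlike the additive penalties, a repetition penalty is usually multiplicative. The sketch below follows a common open-source convention (divide positive logits by the penalty, multiply negative ones); that this provider uses the same rule is an assumption, and `apply_repetition_penalty` is a hypothetical name.

```python
def apply_repetition_penalty(logits, generated, penalty=1.0):
    """Scale down the logits of tokens that already appear in `generated`.

    Positive logits are divided by the penalty and negative logits are
    multiplied by it, so a penalty above 1 always makes a repeated token
    less likely. A penalty of 1 is a no-op.
    """
    if penalty <= 0:
        raise ValueError("penalty must be greater than 0")
    seen = set(generated)
    adjusted = {}
    for token, logit in logits.items():
        if token in seen:
            logit = logit / penalty if logit > 0 else logit * penalty
        adjusted[token] = logit
    return adjusted


logits = {"a": 2.0, "b": -1.0, "c": 0.5}
adjusted = apply_repetition_penalty(logits, ["a", "b"], penalty=2.0)
# "a": 2.0 / 2.0 = 1.0, "b": -1.0 * 2.0 = -2.0, "c" unchanged
```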

Seed

Supported Request Parameters

Parameters currently listed by OpenRouter or the local catalog for this model.

Reasoning Effort, Top P, Top K, Min P, Presence Penalty, Frequency Penalty, Repetition Penalty, Seed
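
The defaults and ranges documented on this page can be collected into a small client-side check before a request is sent. This is a sketch only: the snake_case key names, the lowercase effort strings, and the `build_params` helper are assumptions for illustration and may not match the field names the provider's API actually expects.

```python
# Defaults and ranges as listed on this page; key names are hypothetical.
DEFAULTS = {
    "reasoning_effort": "high",
    "top_p": 0.95,
    "top_k": 20,
    "repetition_penalty": 1.0,
}
RANGES = {
    "top_p": (0, 1),
    "top_k": (0, 100),
    "min_p": (0, 1),
    "repetition_penalty": (0, 2),
}
EFFORTS = {"non-think", "high", "max"}  # assumed spellings of the three options


def build_params(**overrides):
    """Merge overrides onto the documented defaults and validate ranges."""
    params = {**DEFAULTS, **overrides}
    if params["reasoning_effort"] not in EFFORTS:
        raise ValueError(f"unknown reasoning effort: {params['reasoning_effort']!r}")
    for name, (low, high) in RANGES.items():
        if name in params and not low <= params[name] <= high:
            raise ValueError(f"{name}={params[name]} is outside {low} - {high}")
    return params


params = build_params(reasoning_effort="max", top_k=0, seed=42)
```

Validating locally keeps out-of-range values from reaching the endpoint, where the resulting error may be less specific.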

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

Community discussion

What people think about DeepSeek V4 Flash

DeepSeek V4 Flash discussions are most active in r/opencodeCLI, r/hermesagent, and r/openclaw.

Top Reddit threads cluster around benchmarks and model comparisons, safety and censorship questions, and coding workflows. The strongest match in this snapshot has 426 upvotes and 98 comments.

r/opencodeCLI, 42 upvotes, 6 comments, May 11, 2026
DeepSeek V4 Flash Free in OpenCode Zen

I just discovered that DeepSeek V4 Flash is in free mode on OpenCode Zen 😳

|Model|Input|Output|Cached Read|My Test|
|:-|:-|:-|:-|:-|
|Big Pickle|Free|Free|Free|Good - Fast and smart 0.2 M context.|
|DeepSeek V4 Flash Free|Free|Free|Free|DS - Fast and smart - 1M context. The best.|
|MiniMax M2.5 Free|Free|Free|Free|It handles OpenCode quite poorly.|
|Ring 2.6 1T Free|Free|Free|Free|Not working now.|
|Nemotron 3 Super Free|Free|Free|Free|Working now|

Hit the free tier limit on both `big-pickle` and `deepseek-v4-flash-free` yesterday. Still getting `FreeUsageLimitError` today. I've since funded the account with $20 (paid models work fine), but the free model remains locked.

Does the free tier reset daily, weekly, or is it a one-time cap per account? Anyone know the actual reset window?

[Rate limit error](https://preview.redd.it/pdq8vmmf7t1h1.png?width=1568&format=png&auto=webp&s=4e1a43cfbb4357c1785968325bf5fb28f890024d)
