DeepSeek

DeepSeek V4 Flash

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and...

Apr 24, 2026 1.0M context 384,000 tokens output

Text Tools Structured Output Reasoning

Overview ↓ About ↓ Capabilities ↓ Pricing ↓ Price Comparison ↓ Parameters ↓ Compare ↓ Daily ↓ Resources ↓ Community ↓

Model Overview

High-signal model metadata in a structured two-column overview table.

Provider

The entity that provides this model.

DeepSeek

Model ID

The routed model identifier exposed by upstream providers.

deepseek/deepseek-v4-flash:free

Input Context Window

The number of tokens supported by the input context window.

1.0M tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

384,000 tokens tokens

Open Source

Whether the model's code is available for public use.

Yes

Release Date

When the model was first released.

Apr 24, 2026 2 months ago

Knowledge Cut-off Date

When the model's knowledge was last updated.

Unknown

API Providers

The providers that offer this model. This is not an exhaustive list.

DeepSeek

Modalities

Types of data this model can process.

Text

What is DeepSeek V4 Flash

A fuller summary of positioning, capabilities, and source-specific details for DeepSeek V4 Flash.

Capabilities

What DeepSeek V4 Flash supports

Reasoning Controls

OpenRouter lists GPT-5.5 with reasoning support and explicit reasoning-related request parameters.

JSON

Structured Outputs

Structured output settings are exposed through OpenRouter for schema-driven or format-controlled responses.

Tool Calling

Tool invocation and tool selection are supported in the routed OpenRouter interface for this model.

Multimodal I/O

This model accepts text input and returns text output.

CTX

Large Context Window

OpenRouter currently lists a context window of 1.0M with up to 384,000 tokens maximum output tokens.

Pricing for DeepSeek V4 Flash

Primary API pricing shown in the same “quick compare” spirit as the reference page.

Input tokens $0.14 Per million tokens

Output tokens $0.00 Per million tokens

Price Comparison

Additional usage-cost dimensions synced into the project for this model.

Cache read $0.02

maxTemperature 1

maxResponseSize 384,000 tokens

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

DeepSeek

Configuration & Parameters

The configurable options currently documented for this model.

Reasoning Effort

Select

Non-think for fast responses, High for complex problem-solving, Max to push reasoning to its fullest extent.

Default: high

Non-think High Max

Top P

Number

Nucleus sampling. Considers only tokens whose cumulative probability exceeds this threshold.

Default: 0.95 Range: 0 - 1 (step 0.01)

Top K

Number

Limits sampling to the K most likely tokens at each step. Set to 0 to disable.

Default: 20 Range: 0 - 100

Min P

Number

Minimum probability threshold relative to the most likely token.

Range: 0 - 1 (step 0.01)

Presence Penalty

Number

Penalizes tokens that have already appeared in the output, encouraging new topics.

Frequency Penalty

Number

Penalizes tokens based on how often they have already appeared.

Repetition Penalty

Number

Penalizes repeated tokens. Values above 1 discourage repetition.

Default: 1 Range: 0 - 2 (step 0.01)

Seed

Supported Request Parameters

Parameters currently listed by OpenRouter or the local catalog for this model.

Reasoning Effort Top P Top K Min P Presence Penalty Frequency Penalty Repetition Penalty Seed

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

Official Website

→

OpenRouter Model Page OpenRouter

→