Mistral

Mistral Medium 3

Mistral Medium 3 is a text generation model released on May 7, 2025 by Mistral, a French AI company. It is designed to balance performance with cost efficiency, priced at $0.40 per million input tokens and $2.00 per million output tokens. The model supports a 128,000-token context window and was trained on data through early 2025. It is available through Mistral La Plateforme and Amazon SageMaker, with additional platform support planned. Mistral Medium 3 is built with enterprise deployment in mind, supporting self-hosted setups with a minimum of four GPUs as well as any cloud environment. It can be customized through continuous pre-training, fine-tuning, and integration with enterprise knowledge bases, making it applicable to domain-specific workflows in sectors such as financial services, energy, and healthcare. The model is noted for its strengths in coding tasks and multimodal understanding, and is suited for use cases including customer service automation, business process personalization, and complex dataset analysis.

May 07, 2025 128,000 context 16,000 tokens output
Long Context Window Code Generation Multimodal Understanding Fine-Tuning Support Enterprise Deployment Cost-Efficient Pricing

Model Overview

High-signal model metadata in a structured two-column overview table.

Provider

The entity that provides this model.

Mistral

Model ID

The routed model identifier exposed by upstream providers.

mistralai/mistral-medium-3

Input Context Window

The number of tokens supported by the input context window.

128,000 tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

16,000 tokens tokens

Open Source

Whether the model's code is available for public use.

No

Release Date

When the model was first released.

May 07, 2025 1 year ago

Knowledge Cut-off Date

When the model's knowledge was last updated.

2025

API Providers

The providers that offer this model. This is not an exhaustive list.

Mistral

Modalities

Types of data this model can process.

Text Image File

What is Mistral Medium 3

A fuller summary of positioning, capabilities, and source-specific details for Mistral Medium 3.

Mistral Medium 3 is a text generation model released on May 7, 2025 by Mistral, a French AI company. It is designed to balance performance with cost efficiency, priced at $0.40 per million input tokens and $2.00 per million output tokens. The model supports a 128,000-token context window and was trained on data through early 2025. It is available through Mistral La Plateforme and Amazon SageMaker, with additional platform support planned.

Mistral Medium 3 is built with enterprise deployment in mind, supporting self-hosted setups with a minimum of four GPUs as well as any cloud environment. It can be customized through continuous pre-training, fine-tuning, and integration with enterprise knowledge bases, making it applicable to domain-specific workflows in sectors such as financial services, energy, and healthcare. The model is noted for its strengths in coding tasks and multimodal understanding, and is suited for use cases including customer service automation, business process personalization, and complex dataset analysis.

Capabilities

What Mistral Medium 3 supports

CTX

Long Context Window

Processes up to 128,000 tokens in a single request, enabling analysis of long documents, codebases, or extended conversations without truncation.

</>

Code Generation

Generates, explains, and debugs code across common programming languages, with coding identified as one of the model's primary strengths.

MM

Multimodal Understanding

Handles tasks requiring multimodal comprehension, supporting analysis that goes beyond plain text inputs as noted in the model's official overview.

AI

Fine-Tuning Support

Supports continuous pre-training and comprehensive fine-tuning, allowing organizations to adapt the model to domain-specific datasets and workflows.

AI

Enterprise Deployment

Can be deployed on any cloud environment or self-hosted on a minimum of four GPUs, with integration options for enterprise knowledge bases.

AI

Cost-Efficient Pricing

Priced at $0.40 per million input tokens and $2.00 per million output tokens, positioning it as an accessible option for organizations managing AI inference costs.

Pricing for Mistral Medium 3

Primary API pricing shown in the same “quick compare” spirit as the reference page.

Price Comparison

Additional usage-cost dimensions synced into the project for this model.

Cache read $0.04
maxTemperature 1
maxResponseSize 16,000 tokens

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

Mistral

Provider Endpoints

Endpoint-level provider data currently available for this model.

Mistral

1d uptime: 99.6% Supported params: 11 Implicit caching: No

Model Performance

Benchmark scores synced from the current model source and normalized into the local catalog.

Benchmark Score
AIME 2024
American math olympiad problems
44.0%
GPQA Diamond
PhD-level science questions (biology, physics, chemistry)
57.8%
HLE
Questions that challenge frontier models across many domains
4.3%
LiveCodeBench
Real-world coding tasks from recent competitions
40.0%
MATH-500
Undergraduate and competition-level math problems
90.7%
MMLU-Pro
Expert knowledge across 14 academic disciplines
76.0%
SciCode
Scientific research coding and numerical methods
33.1%

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

Compare Mistral Medium 3 with related models

Jump straight into the most relevant side-by-side comparison pages for this model.

GLM 5.1 vs Mistral Medium 3

Compare GLM 5.1 and Mistral Medium 3 across pricing, context window, capabilities, benchmarks, and API access to choose the better fit for reasoning-heavy tasks versus tool-augmented workflows.

DeepSeek V4 Pro vs Mistral Medium 3

Compare DeepSeek V4 Pro and Mistral Medium 3 across pricing, context window, capabilities, benchmarks, and API access to choose the better fit for long-context workloads versus tool-augmented workflows.

Mistral Medium 3 vs Mistral Small 3.1 (25.03)

Compare Mistral Medium 3 and Mistral Small 3.1 (25.03) across pricing, context window, capabilities, benchmarks, and API access to choose the better fit for tool-augmented workflows versus cost-efficient scale.

Amazon Nova Lite vs Mistral Medium 3

Compare Amazon Nova Lite and Mistral Medium 3 across pricing, context window, capabilities, benchmarks, and API access to choose the better fit for tool-augmented workflows versus tool-augmented workflows.

Kimi K2.6 vs Mistral Medium 3

Compare Kimi K2.6 and Mistral Medium 3 across pricing, context window, capabilities, benchmarks, and API access to choose the better fit for reasoning-heavy tasks versus tool-augmented workflows.

DeepSeek V4 Flash vs Mistral Medium 3

Compare DeepSeek V4 Flash and Mistral Medium 3 across pricing, context window, capabilities, benchmarks, and API access to choose the better fit for long-context workloads versus tool-augmented workflows.

Community discussion

What people think about Mistral Medium 3

Mistral Medium 3 discussions are most active in r/MistralAI, r/LocalLLaMA, r/singularity.

Top Reddit threads cluster around benchmark and model-comparison threads, safety and censorship questions, coding workflow discussions. The strongest match in this snapshot has 906 upvotes and 336 comments.

r/LocalLLaMA 544 upvotes 315 comments April 29, 2026
mistralai/Mistral-Medium-3.5-128B · Hugging Face

[https://huggingface.co/unsloth/Mistral-Medium-3.5-128B-GGUF](https://huggingface.co/unsloth/Mistral-Medium-3.5-128B-GGUF)

# Mistral Medium 3.5 128B

Mistral Medium 3.5 is our first flagship merged model. It is a dense 128B model with a 256k context window, handling instruction-following, reasoning, and coding in a single set of weights. Mistral Medium 3.5 replaces its predecessor Mistral Medium 3.1 and Magistral in Le Chat. It also replaces Devstral 2 in our coding agent Vibe. Concretely, expect better performance for instruct, reasoning and coding tasks in a new unified model in comparison with our previous released models.

Reasoning effort is configurable per request, so the same model can answer a quick chat reply or work through a complex agentic run. We trained the vision encoder from scratch to handle variable image sizes and aspect ratios.

Find more information on our [blog](https://mistral.ai/news/vibe-remote-agents-mistral-medium-3-5).

# Key Features

Mistral Medium 3.5 includes the following architectural choices:

* **Dense 128B parameters**.
* **256k context length**.
* **Multimodal input**: Accepts both text and image input, with text output.
* **Instruct and Reasoning functionalities** with function calls (reasoning effort configurable per request).

Mistral Medium 3.5 offers the following capabilities:

* **Reasoning Mode**: Toggle between fast instant reply mode and reasoning mode, boosting performance with test-time compute when requested.
* **Vision**: Analyzes images and provides insights based on visual content, in addition to text.
* **Multilingual**: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, and Arabic.
* **System Prompt**: Strong adherence and support for system prompts.
* **Agentic**: Best-in-class agentic capabilities with native function calling and JSON output.
* **Large Context Window**: Supports a 256k context window.

We release this model under a [**Modified MIT License**](https://huggingface.co/mistralai/Mistral-Medium-3.5-128B/blob/main/(https://huggingface.co/mistralai/mistralai/Mistral-Medium-3.5-128B/blob/main/LICENSE)): Open-source license for both commercial and non-commercial use with exceptions for companies with large revenue.

# Recommended Settings

* **Reasoning Effort**:
* `'none'` → Do not use reasoning
* `'high'` → Use reasoning (recommended for complex prompts and agentic usage) Use `reasoning_effort="high"` for complex tasks and agentic coding.
* **Temperature**: 0.7 for `reasoning_effort="high"`. Temp between 0.0 and 0.7 for `reasoning_effort="none"` depending on the task. Generally, lower means answer that are more to the point and higher allows the model to be more creative. It is a good practice to try different values in order to improve the model performance to meet your demands.

Open Reddit thread
View more discussions →
FAQ

Common questions about Mistral Medium 3

What is the context window size for Mistral Medium 3?

Mistral Medium 3 supports a context window of 128,000 tokens, allowing it to process long documents, extended conversations, or large codebases in a single request.

How is Mistral Medium 3 priced?

The model is priced at $0.40 per million input tokens and $2.00 per million output tokens when accessed via API.

What is the training data cutoff for Mistral Medium 3?

According to the available metadata, Mistral Medium 3 was trained on data through early 2025.

Where can Mistral Medium 3 be deployed?

Mistral Medium 3 is available through Mistral La Plateforme and Amazon SageMaker. It also supports self-hosted deployment on a minimum of four GPUs and can run on any cloud environment.

Can Mistral Medium 3 be fine-tuned for specific use cases?

Yes. The model supports continuous pre-training, comprehensive fine-tuning, and integration with enterprise knowledge bases, making it adaptable for domain-specific applications in industries such as healthcare, finance, and energy.

More models from Mistral

Continue browsing adjacent models from the same provider.

← All AI Models