OpenAI

GPT-5 mini

GPT-5 mini is a text generation model developed by OpenAI, designed as a faster and more cost-efficient variant of GPT-5. It supports a 400,000-token context window and has a training data cutoff of May 2024. The model is tagged as a latest release and supports tool use and MCP (Model Context Protocol) server integrations. GPT-5 mini is best suited for well-defined tasks where precise prompting is used and response speed or cost efficiency is a priority. It accepts structured inputs including tool calls and MCP server configurations, making it a practical choice for agentic workflows and automation pipelines. Developers working on tasks with clear, bounded requirements are the primary intended audience for this model.

Aug 07, 2025 400,000 context 128,000 tokens output
Large Context Window Tool Use MCP Server Support Text Generation Fast Inference

Model Overview

High-signal model metadata in a structured two-column overview table.

Provider

The entity that provides this model.

OpenAI

Model ID

The routed model identifier exposed by upstream providers.

openai/gpt-5-mini

Input Context Window

The number of tokens supported by the input context window.

400,000 tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

128,000 tokens tokens

Open Source

Whether the model's code is available for public use.

No

Release Date

When the model was first released.

Aug 07, 2025 9 months ago

Knowledge Cut-off Date

When the model's knowledge was last updated.

2024-05-31

API Providers

The providers that offer this model. This is not an exhaustive list.

OpenAI, Azure

Modalities

Types of data this model can process.

Text Image File

What is GPT-5 mini

A fuller summary of positioning, capabilities, and source-specific details for GPT-5 mini.

GPT-5 mini is a text generation model developed by OpenAI, designed as a faster and more cost-efficient variant of GPT-5. It supports a 400,000-token context window and has a training data cutoff of May 2024. The model is tagged as a latest release and supports tool use and MCP (Model Context Protocol) server integrations.

GPT-5 mini is best suited for well-defined tasks where precise prompting is used and response speed or cost efficiency is a priority. It accepts structured inputs including tool calls and MCP server configurations, making it a practical choice for agentic workflows and automation pipelines. Developers working on tasks with clear, bounded requirements are the primary intended audience for this model.

Capabilities

What GPT-5 mini supports

CTX

Large Context Window

Processes up to 400,000 tokens in a single context, enabling long documents, extended conversations, or large codebases to be handled in one request.

TL

Tool Use

Supports function calling and tool integrations, allowing the model to invoke external tools or APIs as part of a response.

MCP

MCP Server Support

Accepts MCP (Model Context Protocol) server configurations as inputs, enabling standardized integration with external context and data sources.

AI

Text Generation

Generates natural language text across a wide range of formats including summaries, instructions, and structured responses.

AI

Fast Inference

Optimized for lower latency compared to full GPT-5, making it suitable for applications where response speed is a priority.

Pricing for GPT-5 mini

Primary API pricing shown in the same “quick compare” spirit as the reference page.

Price Comparison

Additional usage-cost dimensions synced into the project for this model.

Web search $10000.00
Cache read $0.02
maxTemperature 1
maxResponseSize 128,000 tokens

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

OpenAI Azure

Provider Endpoints

Endpoint-level provider data currently available for this model.

OpenAI

Max prompt: 272,000 Max output: 128,000 1d uptime: 99.9% Supported params: 8 Implicit caching: Yes

Azure

1d uptime: 97.0% Supported params: 8 Implicit caching: No

Azure

1d uptime: 99.9% Supported params: 8 Implicit caching: No

Configuration & Parameters

The configurable options currently documented for this model.

Reasoning Effort

Select

Used to give the model guidance on how many reasoning tokens it should generate before creating a response to the prompt. Low will favor speed and economical token usage, and high will favor more complete reasoning at the cost of more tokens generated and slower responses. The default value is medium, which is a balance between speed and reasoning accuracy.

Default: medium
Low Medium High

Supported Request Parameters

Parameters currently listed by OpenRouter or the local catalog for this model.

Reasoning Effort

Model Performance

Benchmark scores synced from the current model source and normalized into the local catalog.

Benchmark Score
GPQA Diamond
PhD-level science questions (biology, physics, chemistry)
82.8%
HLE
Questions that challenge frontier models across many domains
19.7%
LiveCodeBench
Real-world coding tasks from recent competitions
83.8%
MMLU-Pro
Expert knowledge across 14 academic disciplines
83.7%
SciCode
Scientific research coding and numerical methods
39.2%

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

Community discussion

What people think about GPT-5 mini

GPT-5 mini discussions are most active in r/GithubCopilot, r/ChatGPT, r/OpenAI. Top Reddit threads cluster around benchmark and model-comparison threads, safety and censorship questions, coding workflow discussions.

The strongest match in this snapshot has 340 upvotes and 29 comments.

r/ChatGPT 2 upvotes 12 comments May 10, 2026
GPT-5.5 Thinking selected, but ChatGPT web replies as GPT-5 mini

I’m on ChatGPT Plus. Since May 8/9, ChatGPT web appears to route me to GPT-5 mini even when I manually select “Latest • 5.5 → Thinking”.

The UI still shows Thinking selected and does not show any usage-limit warning, but the assistant replies:

“I’m currently running on GPT-5 mini, so I cannot use GPT-5.5 Thinking.”

What makes it stranger:
- Web keeps giving me GPT-5 mini.
- Android mobile currently responds as GPT-5.5 Thinking.
- Yesterday mobile worked briefly too, then later started responding as mini.
- I tried Edge, other browsers, incognito, Windows app, Android app, sign out/in, “Log out all”, waiting, and brand-new chats.
- It has persisted across overnight gaps, so a simple 3-hour cap fallback doesn’t seem to explain it.
- As far as I understand, GPT-5.5 Instant should be unlimited or almost unlimited for my subscription tier, so being silently routed to mini even outside Thinking is especially confusing.
- OpenAI support escalated it, but I’m still waiting for a specialist response.

Has anyone else seen web and mobile route to different models despite the same manually selected model?

Open Reddit thread
r/GithubCopilot 10 upvotes 9 comments March 7, 2026
GPT-5 mini is very capable

there are only two free options in github copilot cli so i have been using GPT-5 mini for some tasks because i don't want to burn out my PR too quickly and to my surprise it is very capable with reasoning set to "high". since its free option, i always run plan mode first and after the task is done i run review command.

Open Reddit thread

https://preview.redd.it/zb1gzzm9ahlg1.png?width=3000&format=png&auto=webp&s=2fe11dfb13a252dacd0ae8c250f4ec17d1a51d93

Qwen3.5-122B-A10B generally comes out ahead of gpt-5-mini and gpt-oss-120b across most benchmarks.

**vs GPT-5-mini:** Qwen3.5 wins on knowledge (MMLU-Pro 86.7 vs 83.7), STEM reasoning (GPQA Diamond 86.6 vs 82.8), agentic tasks (BFCL-V4 72.2 vs 55.5), and vision tasks (MathVision 86.2 vs 71.9). GPT-5-mini is only competitive in a few coding benchmarks and translation.

**vs GPT-OSS-120B:** Qwen3.5 wins more decisively. GPT-OSS-120B holds its own in competitive coding (LiveCodeBench 82.7 vs 78.9) but falls behind significantly on knowledge, agents, vision, and multilingual tasks.

**TL;DR:** Qwen3.5-122B-A10B is the strongest of the three overall. GPT-5-mini is its closest rival in coding/translation. GPT-OSS-120B trails outside of coding.

Lets see if the quants hold up to the benchmarks

Open Reddit thread

Hi LocalLlama.

Here are the results from the March run of the GACL. A few observations from my side:

* **GPT-5.4** clearly leads among the major models at the moment.
* **Qwen3.5-27B** performed better than every other Qwen model except **397B**, trailing it by only **0.04 points**. In my opinion, it’s an outstanding model.
* **Kimi2.5** is currently the top **open-weight** model, ranking **#6 globally**, while **GLM-5** comes next at **#7 globally**.
* Significant difference between Opus and Sonnet, more than I expected.
* **GPT models dominate the Battleship game.** However, **Tic-Tac-Toe** didn’t work well as a benchmark since nearly all models performed similarly. I’m planning to replace it with another game next month. Suggestions are welcome.

For context, **GACL** is a league where models generate **agent code** to play **seven different games**. Each model produces **two agents**, and each agent competes against every other agent except its paired “friendly” agent from the same model. In other words, the models themselves don’t play the games but they generate the agents that do. Only the top-performing agent from each model is considered when creating the leaderboards.

All **game logs, scoreboards, and generated agent codes** are available on the league page.

[Github Link](https://github.com/summersonnn/Game-Agent-Coding-Benchmark)

[League Link](https://gameagentcodingleague.com/leaderboard.html)

Open Reddit thread
View more discussions →
FAQ

Common questions about GPT-5 mini

What is the context window for GPT-5 mini?

GPT-5 mini supports a context window of 400,000 tokens, allowing large volumes of text, documents, or conversation history to be included in a single request.

What is the knowledge cutoff date for GPT-5 mini?

GPT-5 mini has a training data cutoff of May 2024, meaning it does not have knowledge of events or information published after that date.

How does GPT-5 mini differ from GPT-5?

GPT-5 mini is described by OpenAI as a faster and more cost-efficient version of GPT-5, optimized for well-defined tasks and precise prompts rather than open-ended or highly complex reasoning.

Does GPT-5 mini support tool calling and MCP integrations?

Yes. GPT-5 mini supports tool use and MCP (Model Context Protocol) server inputs, making it compatible with agentic workflows and external integrations on platforms like MindStudio.

What types of tasks is GPT-5 mini best suited for?

According to OpenAI's overview, GPT-5 mini is best suited for well-defined tasks where precise prompts are used, such as structured data extraction, classification, summarization, and automation pipelines.

More models from OpenAI

Continue browsing adjacent models from the same provider.

← All AI Models