OpenAI

GPT-5 nano

Released Aug 07, 2025 | 400,000-token context | 128,000-token output
Large Context Window | Tool Use | MCP Server Support | Text Summarization | Text Classification | Low-Latency Inference

Model Overview

High-signal model metadata in a structured two-column overview table.

Provider

The entity that provides this model.

OpenAI

Model ID

The routed model identifier exposed by upstream providers.

openai/gpt-5-nano

Input Context Window

The number of tokens supported by the input context window.

400,000 tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

128,000 tokens

Open Source

Whether the model's code is available for public use.

No

Release Date

When the model was first released.

Aug 07, 2025

Knowledge Cut-off Date

When the model's knowledge was last updated.

May 2024

API Providers

The providers that offer this model. This is not an exhaustive list.

Azure, OpenAI

Modalities

Types of data this model can process.

Text, Image, File

What is GPT-5 nano

A fuller summary of positioning, capabilities, and source-specific details for GPT-5 nano.

GPT-5 Nano is a text generation model developed by OpenAI and released as part of the GPT-5 model family. It is designed to be the fastest and most cost-efficient variant in that family, making it accessible for high-volume or latency-sensitive applications. The model supports a 400,000-token context window and has a training data cutoff of May 2024. It accepts structured inputs including tool calls and MCP server configurations.

GPT-5 Nano is particularly well-suited for summarization and classification tasks, where speed and throughput matter more than extended reasoning depth. Its large context window allows it to process long documents in a single pass, which is useful for document triage, content labeling, and similar workflows. Developers can integrate it with external tools and MCP servers, extending its utility beyond pure text generation into agentic and multi-step task scenarios.

Capabilities

What GPT-5 nano supports

Large Context Window

Processes up to 400,000 tokens in a single request, enabling full-document ingestion for summarization or classification without chunking.
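The 400,000-token figure is easiest to treat as a budget. The OpenAI endpoint entry on this page lists a max prompt of 272,000 tokens and a max output of 128,000 tokens, which together add up to 400,000. The sketch below checks a request against those two limits. It assumes the caller already has token counts from a tokenizer, and the function name is hypothetical.

```python
# Limits taken from the OpenAI endpoint entry on this page; other
# providers may differ, so treat these constants as assumptions.
MAX_PROMPT_TOKENS = 272_000
MAX_OUTPUT_TOKENS = 128_000
CONTEXT_WINDOW = MAX_PROMPT_TOKENS + MAX_OUTPUT_TOKENS  # 400,000


def fits_in_one_pass(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if a request stays within both per-side limits."""
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        prompt_tokens <= MAX_PROMPT_TOKENS
        and requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


print(CONTEXT_WINDOW)                    # 400000
print(fits_in_one_pass(250_000, 4_000))  # True
print(fits_in_one_pass(390_000, 4_000))  # False: prompt exceeds 272,000
```

A document that fails this check would still need chunking or trimming before it is sent.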

Tool Use

Supports function calling and tool integrations, allowing the model to invoke external APIs or structured actions during a conversation.
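As a rough illustration of the tool-use loop, the sketch below declares one tool and executes a call the model might request. The schema shape is modeled on OpenAI-style function tools and should be checked against the provider's documentation. The `get_weather` tool, its stubbed result, and the `run_tool_call` helper are hypothetical.

```python
import json

# Hypothetical tool definition (JSON Schema parameters). The exact
# envelope a provider expects is an assumption here.
GET_WEATHER_TOOL = {
    "type": "function",
    "name": "get_weather",
    "description": "Look up the current temperature for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(city: str) -> dict:
    # Stub: a real implementation would call a weather API.
    return {"city": city, "temperature_c": 21}


# Maps tool names the model may request to local Python callables.
LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Execute a requested tool call and return its result as JSON text."""
    if name not in LOCAL_TOOLS:
        raise KeyError(f"unknown tool: {name}")
    arguments = json.loads(arguments_json)
    return json.dumps(LOCAL_TOOLS[name](**arguments))


# The model would supply the name and JSON arguments; hard-coded here.
print(run_tool_call("get_weather", '{"city": "Oslo"}'))
```

The returned JSON string is what would be sent back to the model as the tool result on the next turn.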

MCP Server Support

Accepts MCP server configurations as inputs, enabling connection to Model Context Protocol-compatible tool servers for agentic workflows.
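A minimal sketch of what passing an MCP server configuration might look like. The field names are modeled on the remote MCP tool entry in OpenAI's Responses API and are an assumption for any other provider. The server label, the example.com URL, and the `mcp_tool` helper are hypothetical, and the model ID is the routed identifier listed on this page.

```python
def mcp_tool(server_label: str, server_url: str,
             require_approval: str = "always") -> dict:
    """Build one MCP server entry for a request's tool list."""
    if not server_url.startswith("https://"):
        raise ValueError("expected an https:// MCP server URL")
    return {
        "type": "mcp",
        "server_label": server_label,
        "server_url": server_url,
        # "always" keeps a human in the loop before each tool call.
        "require_approval": require_approval,
    }


# Request body only; sending it requires a provider client and API key.
request = {
    "model": "openai/gpt-5-nano",
    "input": "List the open tickets assigned to me.",
    "tools": [mcp_tool("tickets", "https://mcp.example.com/mcp")],
}

print(request["tools"][0]["server_label"])  # tickets
```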

Text Summarization

Condenses long-form content into concise outputs; the 400K context window allows entire documents to be summarized in one pass.

Text Classification

Assigns categories or labels to input text, suited for content moderation, routing, and tagging pipelines at scale.
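For a routing or tagging pipeline, one common pattern is to constrain the model to a fixed label set and validate its reply before acting on it. The sketch below shows that pattern under stated assumptions: the labels, the prompt wording, and the `fake_model` stand-in are all hypothetical, and a real pipeline would replace `fake_model` with a call to the model.

```python
from typing import Callable

LABELS = ("billing", "technical", "other")  # hypothetical routing labels


def build_prompt(text: str) -> str:
    """Ask for exactly one label from the allowed set."""
    options = ", ".join(LABELS)
    return (
        f"Classify the message into exactly one of: {options}.\n"
        f"Reply with the label only.\n\nMessage: {text}"
    )


def classify(text: str, ask_model: Callable[[str], str]) -> str:
    """Send the prompt through ask_model and normalize the reply."""
    reply = ask_model(build_prompt(text)).strip().lower()
    # Fall back to "other" when the reply is not a known label.
    return reply if reply in LABELS else "other"


def fake_model(prompt: str) -> str:
    # Stand-in for a real model call so the sketch runs offline.
    return " Billing\n" if "invoice" in prompt else "unsure"


print(classify("My invoice is wrong", fake_model))  # billing
print(classify("Hello there", fake_model))          # other
```

Validating against the label set keeps a malformed reply from leaking into downstream routing.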

Low-Latency Inference

Optimized for speed within the GPT-5 family, making it suitable for real-time or high-throughput production use cases.

Pricing for GPT-5 nano

Primary API pricing shown in the same “quick compare” spirit as the reference page.

Price Comparison

Additional usage-cost dimensions synced into the project for this model.

Cache read: $0.01
Max temperature: 1
Max response size: 128,000 tokens

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

Azure, OpenAI

Provider Endpoints

Endpoint-level provider data currently available for this model.

Azure

Supported params: 8; Implicit caching: No

OpenAI

Max prompt: 272,000; Max output: 128,000; 1d uptime: 98.7%; Supported params: 8; Implicit caching: Yes

Azure

1d uptime: 100.0%; Supported params: 8; Implicit caching: No

Configuration & Parameters

The configurable options currently documented for this model.

Reasoning Effort

Select

Used to give the model guidance on how many reasoning tokens it should generate before creating a response to the prompt. Low will favor speed and economical token usage, and high will favor more complete reasoning at the cost of more tokens generated and slower responses. The default value is medium, which is a balance between speed and reasoning accuracy.

Default: medium
Options: Low, Medium, High

Supported Request Parameters

Parameters currently listed by OpenRouter or the local catalog for this model.

Reasoning Effort
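The reasoning effort setting described above could be passed roughly as follows. This is a sketch that only builds and validates a request body. The `reasoning.effort` field is modeled on OpenAI's Responses API and should be verified for other providers. `build_request` is a hypothetical helper, and the model ID is the routed identifier listed on this page.

```python
ALLOWED_EFFORTS = ("low", "medium", "high")  # options listed on this page


def build_request(prompt: str, effort: str = "medium",
                  model: str = "openai/gpt-5-nano") -> dict:
    """Build a request body with a validated reasoning effort."""
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"effort must be one of {ALLOWED_EFFORTS}")
    return {
        "model": model,
        "input": prompt,
        "reasoning": {"effort": effort},
    }


# Low effort favors speed and fewer reasoning tokens.
fast = build_request("Label this ticket: 'refund not received'", effort="low")
print(fast["reasoning"])  # {'effort': 'low'}
```

For summarization and classification workloads, starting at low effort and raising it only when quality drops is one reasonable way to keep latency and token usage down.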

Model Performance

Benchmark scores synced from the current model source and normalized into the local catalog.

| Benchmark | Description | Score |
| --- | --- | --- |
| GPQA Diamond | PhD-level science questions (biology, physics, chemistry) | 67.6% |
| HLE | Questions that challenge frontier models across many domains | 8.2% |
| LiveCodeBench | Real-world coding tasks from recent competitions | 78.9% |
| MMLU-Pro | Expert knowledge across 14 academic disciplines | 78.0% |
| SciCode | Scientific research coding and numerical methods | 36.6% |

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

Community discussion

What people think about GPT-5 nano

GPT-5 nano discussions are most active in r/OpenAI, r/ChatGPT, and r/ai_trading. Top Reddit threads cluster around benchmarks and model comparisons. The strongest match in this snapshot has 222 upvotes and 68 comments.

I’ve been curious whether current AI models have any natural aptitude for trading on realtime, raw financial data, without any elaborate news pipelines or convoluted system prompts. I mean literally just raw livestreamed market numbers and a calculator. 

So I built a crypto daytrading arena. All agents consume a realtime stream of ticker data and candlesticks for **BTC**, **SOL**, and **FARTCOIN**. They have access to a calculator and can view their portfolio and holdings. As data flows in, each agent autonomously decides when to enter or exit, with no guardrails.

I started with four agents, each with $100k in its account: **gpt 5 nano (low reasoning)**, **minimax m2.5**, **grok 4.1 fast (no reasoning)**, and **gemini 2.5 flash**.

After a little more than 24 hrs of continuous trading, here’s roughly where they stand:

* gpt 5 nano: **+$11,500**
* minimax m2.5: +$4000
* gemini 2.5 flash: +$1900
* grok 4.1 fast: -$100

Here's a view of the arena dashboard I've been using to track agent portfolios:

[agent arena dashboard](https://reddit.com/link/1reww2d/video/352q5gcwtqlg1/player)

I'm honestly impressed with how gpt-5-nano has performed so far, considering it's a relatively cheap model. When I started this I definitely wasn't even expecting it to be in the positives by now. It might just be really good at processing raw financial numbers (idk)? I’m keeping these agents running so we’ll see if these gains stay consistent. Eventually I also want to throw in more expensive models (gpt 5.2, sonnet 4.6) and see how they compete too.

Also, this project is fully open source so you can check it out and run it yourself! [https://github.com/ryan-yuuu/crypto-trading-arena](https://github.com/ryan-yuuu/crypto-trading-arena) 

edit: Thanks for the award! This is my first one!

**Update 2/28:**

I built some more metrics and trade recording features in order to provide data for insights into the arena's performance over time. Because of this, I redeployed the arena on 2/26 \~9pm PT, this time only with gpt5 nano and gemini 2.5 flash. Here are some key highlights since the agents were redeployed \~48hrs ago.

During this session, 2/26 9:19pm PT to 2/28 9:29pm PT:

* Market was bearish. During the session BTC had dipped to -7%, SOL -12%, FART -18%
* Both agents outperformed an equal-weight buy-and-hold of Bitcoin and Solana. Let's say this is the "benchmark" for comparison purposes
* gpt-5 nano returned +1.50%, beating the benchmark by +1.90%. It stayed above the benchmark for the entirety of the session
* gemini 2.5 flash was very cash heavy and returned -0.35% but still beat the benchmark by +0.05%. At some point it did dip below the benchmark briefly (by at most -0.05%)
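A quick arithmetic check on the figures above: subtracting each agent's reported margin over the benchmark from its own return should give the same benchmark return for both agents. The sketch below does that, assuming the margins are simple percentage-point differences. The helper name is hypothetical.

```python
def implied_benchmark(agent_return_pct: float, excess_pct: float) -> float:
    """Benchmark return implied by an agent's return and its excess."""
    # Rounded to two decimals to match the precision quoted in the post.
    return round(agent_return_pct - excess_pct, 2)


print(implied_benchmark(1.50, 1.90))   # -0.4
print(implied_benchmark(-0.35, 0.05))  # -0.4
```

Both agents imply a benchmark return of about -0.40% for the session, so the two bullet points are consistent with each other.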

Here are reports I generated to visualize the arena's performance during the 48hr trading session.

* Google drive link to PDF in case the reports don't render in the post: [https://drive.google.com/file/d/1iICEoMPU7JUKUHkjw\_tq5qYp9MeUxr0F/view?usp=sharing](https://drive.google.com/file/d/1iICEoMPU7JUKUHkjw_tq5qYp9MeUxr0F/view?usp=sharing)

[gpt 5 nano and gemini 2.5 portfolio performance](https://preview.redd.it/2yg4ydeatdmg1.png?width=3943&format=png&auto=webp&s=995c01aee42834512cad7cc1d8873d6cd58b9813)

[gpt 5 nano and gemini 2.5 flash perf vs market](https://preview.redd.it/zfktgeeatdmg1.png?width=3927&format=png&auto=webp&s=5b104ba3c64d014541916786270a9d9acacb0f80)

p.s. these redeployed agents are still running so I can provide more updates later on.

Open Reddit thread

GPT5-nano is doing an excellent job with assist. Seems to have no issues handling a long string of different commands. Speed is good and so far my testing has been great!!

https://youtu.be/o30ymyQqQo4?si=p7h95KIKICWIq9IG

Edit: Here are some details on my setup

Using the stock OpenAI Conversation integration with GPT5-nano
STT is Home Assistant Cloud
TTS is ElevenLabs
Cloned the Jarvis voice from sound bites I found in an obscure Reddit post with a Google Drive link to said files from an old phone app.
Device is a Voice PE taken apart and put in a 3D-printed arc reactor case (just had to take the front plate off).

Open Reddit thread
r/OpenAI | 222 upvotes | 68 comments | August 11, 2025
GPT-5 Benchmarks: How GPT-5, Mini, and Nano Perform in Real Tasks

Hi everyone,

We ran task benchmarks on the GPT-5 series models and, in line with the general consensus, they are likely not a breakthrough in intelligence. But they are a good replacement for o3, o1 and gpt-4.1, and the lower latency and cost improvements are impressive! They are likely really good models for ChatGPT, even though users will have to get used to them.

**For builders, perhaps one way to look at it:**

o3 and gpt-4.1 -> gpt-5

o1 -> gpt-5-mini

o1-mini -> gpt-5-nano

**But let's look at a tricky failure case to be aware of.**

As part of our context-oriented task evals, we ask the model to read a travel journal and count the number of visited cities:

Question: "How many cities does the author mention"

Expected: 19

GPT-5: 12

Models that consistently get this right are gemini-2.5-flash, gemini-2.5-pro, claude-sonnet-4, claude-opus-4, claude-sonnet-3.7, claude-3.5-sonnet, gpt-oss-120b, and grok-4.

To be a good model for building with, context attention is one of the primary criteria. What makes Anthropic models stand out is how well they have been utilising the context window ever since sonnet-3.5. The Gemini series and Grok seem to be paying attention to this as well.

You can read more about our task categories and eval methods here: [https://opper.ai/models](https://opper.ai/models)

For those building with it, anyone else seeing similar strengths/weaknesses?

Open Reddit thread

https://reddit.com/link/1svgw4o/video/zqh4ydm46dxg1/player

No tricks, no copy-paste. Two completely different AI models, separate conversations - one remembers what the other was told.

The way it works: every message gets embedded and stored. When you open a new chat with any model, your memory is injected into context automatically. GPT, Claude, Gemini, Grok and DeepSeek - they all share the same memory layer.

So when I told GPT-5 Nano "I live in Bahrain" and then opened a fresh Claude Sonnet 4.6 conversation and asked "where do I live?" - it said "Based on your memory, you live in Bahrain 🇧🇭"
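The embed, store, and inject flow described above can be sketched in a few lines. This is an illustration only, not the poster's implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and `SharedMemory` and its methods are hypothetical names.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would call a model."""
    cleaned = text.lower().replace("?", " ").replace(".", " ")
    return Counter(cleaned.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SharedMemory:
    """One store that every model's conversation reads from and writes to."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def remember(self, message: str) -> None:
        self._items.append((message, embed(message)))

    def recall(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self._items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

    def build_context(self, query: str) -> str:
        """Prepend the most relevant memories to a new conversation."""
        memories = "\n".join(f"- {m}" for m in self.recall(query))
        return f"Known about the user:\n{memories}\n\nUser: {query}"


memory = SharedMemory()
memory.remember("I live in Bahrain")  # said in a chat with one model
memory.remember("My favorite color is green")
print(memory.build_context("Where do I live?"))  # prompt for another model
```

Because the memory lives outside any single model, the second model answers from the injected context rather than from anything it was told directly.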

Live on [asksary.com](http://asksary.com) now

Open Reddit thread
FAQ

Common questions about GPT-5 nano

What is the context window size for GPT-5 Nano?

GPT-5 Nano supports a context window of 400,000 tokens, allowing large volumes of text to be processed in a single request.

What is the training data cutoff for GPT-5 Nano?

The model's training data has a cutoff of May 2024, meaning it does not have knowledge of events after that date.

What tasks is GPT-5 Nano best suited for?

According to OpenAI's overview, GPT-5 Nano is designed for summarization and classification tasks, and is optimized for speed and cost efficiency.

Does GPT-5 Nano support tool calling and MCP servers?

Yes. The model supports tool use and MCP server inputs, enabling integration with external APIs and Model Context Protocol-compatible tool servers.

How does GPT-5 Nano relate to the rest of the GPT-5 family?

GPT-5 Nano is positioned as the fastest and most cost-efficient model in the GPT-5 family, intended for use cases where throughput and cost matter more than maximum reasoning depth.
