X.ai

Grok 4 Fast

Grok 4 Fast is a text generation model developed by xAI, the AI division of X. It is built on learnings from Grok 4 and is designed to deliver high-quality reasoning at lower computational cost, using approximately 40% fewer thinking tokens on average compared to its full counterpart. The model features a 2 million token context window and supports both reasoning and non-reasoning modes within a single unified architecture. Grok 4 Fast is trained end-to-end with tool-use reinforcement learning, enabling it to handle agentic tasks such as web browsing, code execution, and real-time information synthesis. It accepts both text and image inputs and produces text output. The model is well-suited for developers and enterprises that need multi-step reasoning, long-context document processing, and real-time web research without the computational overhead of a full frontier model.

September 2025 N/A context 2,000,000 tokens output

Long Context Window Reasoning Modes Token Efficiency Agentic Tool Use Multimodal Input Web & Search Integration

Overview ↓ About ↓ Capabilities ↓ Pricing ↓ Price Comparison ↓ Benchmarks ↓ Tools ↓ Resources ↓ Community ↓ FAQ ↓

Model Overview

High-signal model metadata in a structured two-column overview table.

Provider

The entity that provides this model.

X.ai

Input Context Window

The number of tokens supported by the input context window.

N/A tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

2,000,000 tokens tokens

Open Source

Whether the model's code is available for public use.

Release Date

When the model was first released.

September 2025

Knowledge Cut-off Date

When the model's knowledge was last updated.

September 2025

API Providers

The providers that offer this model. This is not an exhaustive list.

xAI API, OpenAI API

Modalities

Types of data this model can process.

Text Image Code

What is Grok 4 Fast

A fuller summary of positioning, capabilities, and source-specific details for Grok 4 Fast.

Grok 4 Fast is a text generation model developed by xAI, the AI division of X. It is built on learnings from Grok 4 and is designed to deliver high-quality reasoning at lower computational cost, using approximately 40% fewer thinking tokens on average compared to its full counterpart. The model features a 2 million token context window and supports both reasoning and non-reasoning modes within a single unified architecture.

Grok 4 Fast is trained end-to-end with tool-use reinforcement learning, enabling it to handle agentic tasks such as web browsing, code execution, and real-time information synthesis. It accepts both text and image inputs and produces text output. The model is well-suited for developers and enterprises that need multi-step reasoning, long-context document processing, and real-time web research without the computational overhead of a full frontier model.

Capabilities

What Grok 4 Fast supports

CTX

Long Context Window

Supports a 2 million token context window, enabling processing of very long documents, codebases, or multi-turn conversations in a single request.

Reasoning Modes

Offers both reasoning and non-reasoning modes in one unified architecture, allowing developers to choose the appropriate inference style per task.

Token Efficiency

Uses approximately 40% fewer thinking tokens on average than Grok 4, achieved through large-scale reinforcement learning optimized for intelligence density.

Agentic Tool Use

Trained end-to-end with tool-use reinforcement learning, supporting web browsing, code execution, and real-time information synthesis across multi-step tasks.

Multimodal Input

Accepts both text and image inputs, producing text output, making it usable for tasks that involve visual content alongside natural language.

Web & Search Integration

The search-enabled variant (grok-4-fast-search) supports real-time web and X (Twitter) browsing, and ranked first on LMArena's Search Arena with an Elo score of 1163.

</>

Code Generation

Scored 80.0% on LiveCodeBench (January–May evaluation window), reflecting strong performance on competitive programming and code synthesis tasks.

Math & Science Reasoning

Achieved 92.0% on AIME 2025 and 93.3% on HMMT 2025 without tools, demonstrating strong performance on formal mathematical reasoning benchmarks.

Pricing for Grok 4 Fast

Primary API pricing shown in the same “quick compare” spirit as the reference page.

Input tokens $0.20 Per million tokens

Output tokens $2.50 Per million tokens

Price Comparison

Additional usage-cost dimensions synced into the project for this model.

maxTemperature 1

maxResponseSize 2,000,000 tokens

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

xAI API OpenAI API

Model Performance

Benchmark scores synced from the current model source and normalized into the local catalog.

Benchmark	Score
GPQA Diamond PhD-level science questions (biology, physics, chemistry)	60.6%
HLE Questions that challenge frontier models across many domains	5.0%
LiveCodeBench Real-world coding tasks from recent competitions	40.1%
MMLU-Pro Expert knowledge across 14 academic disciplines	73.0%
SciCode Scientific research coding and numerical methods	32.9%

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

Official Announcement Announcements

→

Independent Performance Analysis Other

→

xAI API Reference Documentation

→

xAI Developer Documentation Documentation

→

Community discussion

What people think about Grok 4 Fast

Grok 4 Fast discussions are most active in r/singularity, r/SillyTavernAI, r/grok. Top Reddit threads cluster around benchmark and model-comparison threads.

The strongest match in this snapshot has 508 upvotes and 97 comments.

r/grok 3 upvotes 7 comments May 7, 2026

Grok Model 4.1 and 4 retirement from API on May 15, 2026 12PM PT 🥺😢

As we continue advancing Grok, we are retiring several earlier models to focus fully on our newest generation. **Effective May 15, 2026 at 12:00pm PT**, the following models will be retired from the xAI API:

* `grok-4-1-fast-reasoning`
* `grok-4-1-fast-non-reasoning`
* `grok-4-fast-reasoning`
* `grok-4-fast-non-reasoning`
* `grok-4-0709`
* `grok-code-fast-1`
* `grok-3`
* `grok-imagine-image-pro`

[`https://docs.x.ai/developers/migration/may-15-retirement`](https://docs.x.ai/developers/migration/may-15-retirement)

Open Reddit thread

r/singularity 505 upvotes 137 comments September 20, 2025

Grok 4 fast with 2M context window is available!

Open Reddit thread

r/singularity 508 upvotes 97 comments September 21, 2025

There is a very real possibility that Google, OpenAI, Anthropic, etc. will release their own super cheap versions of Grok-4-fast!

It seems that Grok-4-fast was created based on Jet-Nemotron architecture.

[https://arxiv.org/abs/2508.15884v1](https://arxiv.org/abs/2508.15884v1)

It allows to massively decrease the amount of compute needed for inference without sacrificing the model performance. It also allows for a much bigger context window since the price no longer scales quadratically but linearly!

**So basically:** everyone can implement this architecture without retraining. The price of models can be **Drastically** reduced without sacrificing accuracy much.

XAI did it first, but others will definitely follow (if they haven't already).

>There is a high chance that OpenAI has already done it:

A sudden slash in prices on o3 by 80% and then GPT-5-thinking being even cheaper in a very short period of time.

Open Reddit thread

r/singularity 225 upvotes 164 comments September 30, 2025

Grok 4 Fast matches same high-level performance as Claude Opus 4.1, at less than 1% of the cost

How can xAI afford to run such a model for so little?

Open Reddit thread

r/singularity 225 upvotes 111 comments September 20, 2025

Grok 4 Fast Impressive performance - Gemini 2.5 pro level

Open Reddit thread

View more discussions →

FAQ

Common questions about Grok 4 Fast

What is the context window size for Grok 4 Fast?

Grok 4 Fast supports a 2 million token context window, which allows it to process very long documents, extended conversations, or large codebases within a single request.

How does Grok 4 Fast differ from Grok 4?

Grok 4 Fast is a cost-efficient variant built on learnings from Grok 4. It uses approximately 40% fewer thinking tokens on average, making it less computationally expensive while targeting comparable reasoning quality.

What input types does Grok 4 Fast support?

Grok 4 Fast accepts both text and image inputs and produces text output. It also supports tool use including web browsing, X (Twitter) browsing, and code execution.

What is the training data cutoff for Grok 4 Fast?

Based on the available metadata, the training date for Grok 4 Fast is listed as September 2025.

Where can I access the Grok 4 Fast API?

Grok 4 Fast is available through the xAI API. You can find API documentation and access details at x.ai/api. On MindStudio, no separate API key is required to use the model.

Does Grok 4 Fast support reasoning mode?

Yes. Grok 4 Fast supports both reasoning and non-reasoning modes within a single unified architecture, allowing developers to select the appropriate mode depending on the task.

More models from X.ai

Continue browsing adjacent models from the same provider.

← All AI Models

Grok 4 Fast

Model Overview

Provider

Input Context Window

Maximum Output Tokens

Open Source

Release Date

Knowledge Cut-off Date

API Providers

Modalities

What is Grok 4 Fast

What Grok 4 Fast supports

Long Context Window

Reasoning Modes

Token Efficiency

Agentic Tool Use

Multimodal Input

Web & Search Integration

Code Generation

Math & Science Reasoning

Pricing for Grok 4 Fast

Price Comparison

API Access & Providers

Model Performance

Resources & Documentation

AI tools related to Grok 4 Fast

XX.AI

Opnbx-ai

SliceX AI - Chrome Extension

What people think about Grok 4 Fast

Common questions about Grok 4 Fast

More models from X.ai