X.ai

Grok 4.20

Grok 4.20 is a text generation model developed by xAI, the AI division of X. This variant is specifically configured with reasoning disabled, meaning it skips the extended chain-of-thought process to deliver faster, lower-latency responses while still operating on the full Grok 4.20 architecture. It supports a context window of up to 2 million tokens, allowing it to ingest very long documents, large codebases, or extended conversation histories in a single pass. The model was made available via API in March 2026 as part of the Grok 4.20 Beta family, which also includes reasoning-enabled and multi-agent-tuned variants. This model is designed for agentic and tool-centric workflows where response speed is a priority over deep step-by-step reasoning. It is well-suited for automated pipelines, coding agents, data-processing tasks, and any application where the model needs to call external tools rapidly and reliably. Its instruction-following behavior is tuned for consistency, making outputs predictable across repeated or templated prompts. Developers building low-latency AI systems or integrating LLM capabilities into production pipelines are the primary intended audience.

Mar 31, 2026 2M context 2,000,000 tokens output
Massive Context Window Agentic Tool Calling Non-Reasoning Mode Instruction Following Multimodal Input Text Generation

Model Overview

High-signal model metadata in a structured two-column overview table.

Provider

The entity that provides this model.

X.ai

Model ID

The routed model identifier exposed by upstream providers.

x-ai/grok-4.20

Input Context Window

The number of tokens supported by the input context window.

2M tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

2,000,000 tokens tokens

Open Source

Whether the model's code is available for public use.

No

Release Date

When the model was first released.

Mar 31, 2026 2 months ago

Knowledge Cut-off Date

When the model's knowledge was last updated.

March 2026

API Providers

The providers that offer this model. This is not an exhaustive list.

xAI

Modalities

Types of data this model can process.

Text Image Code File

What is Grok 4.20

A fuller summary of positioning, capabilities, and source-specific details for Grok 4.20.

Grok 4.20 is a text generation model developed by xAI, the AI division of X. This variant is specifically configured with reasoning disabled, meaning it skips the extended chain-of-thought process to deliver faster, lower-latency responses while still operating on the full Grok 4.20 architecture. It supports a context window of up to 2 million tokens, allowing it to ingest very long documents, large codebases, or extended conversation histories in a single pass. The model was made available via API in March 2026 as part of the Grok 4.20 Beta family, which also includes reasoning-enabled and multi-agent-tuned variants.

This model is designed for agentic and tool-centric workflows where response speed is a priority over deep step-by-step reasoning. It is well-suited for automated pipelines, coding agents, data-processing tasks, and any application where the model needs to call external tools rapidly and reliably. Its instruction-following behavior is tuned for consistency, making outputs predictable across repeated or templated prompts. Developers building low-latency AI systems or integrating LLM capabilities into production pipelines are the primary intended audience.

Capabilities

What Grok 4.20 supports

CTX

Massive Context Window

Processes up to 2 million tokens in a single pass, enabling ingestion of entire codebases, lengthy documents, or extended conversation histories without truncation.

AG

Agentic Tool Calling

Optimized for rapid and reliable external tool invocation, making it suitable for automated agent frameworks and multi-step pipelines.

RN

Non-Reasoning Mode

Reasoning is disabled by design, reducing latency by skipping extended chain-of-thought processing while retaining the underlying model's generation capabilities.

AI

Instruction Following

Tuned for strong prompt adherence, producing consistent and predictable outputs across templated or repeated instructions.

MM

Multimodal Input

Accepts input types beyond plain text, supporting diverse real-world task formats within a single model interface.

AI

Text Generation

Generates coherent, contextually grounded text responses across a wide range of domains including coding, data processing, and conversational tasks.

Pricing for Grok 4.20

Primary API pricing shown in the same “quick compare” spirit as the reference page.

Price Comparison

Additional usage-cost dimensions synced into the project for this model.

Web search $5000.00
Cache read $0.20
maxTemperature 1
maxResponseSize 2,000,000 tokens

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

xAI

Provider Endpoints

Endpoint-level provider data currently available for this model.

xAI

1d uptime: 100.0% Supported params: 12 Implicit caching: No

Model Performance

Benchmark scores synced from the current model source and normalized into the local catalog.

Benchmark Score
GPQA Diamond
PhD-level science questions (biology, physics, chemistry)
88.5%
HLE
Questions that challenge frontier models across many domains
30.0%
SciCode
Scientific research coding and numerical methods
44.7%

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

Community discussion

What people think about Grok 4.20

Grok 4.20 discussions are most active in r/singularity, r/grok, r/LovingAI. Top Reddit threads cluster around benchmark and model-comparison threads, safety and censorship questions, coding workflow discussions.

The strongest match in this snapshot has 1779 upvotes and 451 comments.

r/SillyTavernAI 12 upvotes 17 comments March 14, 2026
What's the general opinion on the new grok 4.20?

I haven't tried it much at all but doing a little bit of testing so far it seems... Decent, maybe even pretty good though I'll need to test more.

So far I can definitely say that the multi agent version is definetly better in understanding everything that's going on in context and stuff but alot more costly, like a lot, and it kinda makes the characters sound like robots to be honest. It was also a lot more unhinged I feel compared to the normal one.

I also find that it has really good prompt adherence atleast in the following case as I have a small section in my prompt that basically says "Stop the roleplay or redirect it if you feel the characters are going ooc and address your concerns ooc" or whatever. And sometimes when I'm messing with bots I intentionally put them in a ooc scenario, more just for fun that legit roleplay and where every other model tends to just go with it forcing the character to be and act ooc so far the multi-agent version of grok actually either stops the roleplay completely or begins to push the roleplay in a more in character direction and informing me ooc, I think that could definitely be taken as a positive and a negative depending on your preference but I think it's cool that it actually acknowledges this, I'm hoping that means it's overall prompt adherence is quite good.

I'll probably do a bit more testing tonight but I'm just curious what's the general consensus so far?

Open Reddit thread
View more discussions →
FAQ

Common questions about Grok 4.20

What is the context window size for Grok 4.20?

Grok 4.20 supports a context window of up to 2 million tokens, allowing it to process very long inputs in a single request.

Why is reasoning disabled in this variant?

Reasoning is disabled to reduce response latency. This makes the model faster and more suitable for agentic or tool-calling workflows where speed is prioritized over extended step-by-step reasoning.

What is the training data cutoff for Grok 4.20?

According to the model metadata, the training date is listed as March 2026.

Who publishes Grok 4.20?

Grok 4.20 is published by xAI, the AI division associated with X (formerly Twitter).

What types of workloads is this model best suited for?

This model is best suited for low-latency agentic systems such as automated assistants, coding agents, and data-processing pipelines where fast tool-calling and instruction adherence are more important than deep reasoning.

Is Grok 4.20 available via API?

Yes, Grok 4.20 Beta models were released via API in March 2026, as reflected in the model's dateAdded metadata.

More models from X.ai

Continue browsing adjacent models from the same provider.

← All AI Models