X.ai

Grok 3 Mini

Grok 3 Mini Beta is a compact text generation model developed by xAI, the AI division of X. It is designed as a thinking model, meaning it reasons through problems step by step before producing a final answer, and it exposes that reasoning trace so users can follow the model's logic in full. The model supports adjustable reasoning effort, defaulting to a lower setting for speed but allowing a high-effort mode for more demanding problems. It has a 131,072-token context window and was trained with data up to April 2025. Grok 3 Mini is best suited for tasks that rely heavily on structured reasoning rather than broad world knowledge — including math problems, logic puzzles, coding challenges, and quantitative analysis. According to xAI's published benchmarks, it scores 95.8% on AIME 2024 and 80.4% on LiveCodeBench. It also supports function calling and web search, making it usable in agentic workflows. Epoch AI has noted that with high reasoning effort, Grok 3 Mini outperforms the larger Grok 3 model on math benchmarks.

Unknown 131,072 context 8,192 tokens output
Step-by-Step Reasoning Adjustable Reasoning Effort Math and Quantitative Reasoning Function Calling Web Search Long Context Window

Model Overview

High-signal model metadata in a structured two-column overview table.

Provider

The entity that provides this model.

X.ai

Input Context Window

The number of tokens supported by the input context window.

131,072 tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

8,192 tokens tokens

Open Source

Whether the model's code is available for public use.

No

Release Date

When the model was first released.

Unknown

Knowledge Cut-off Date

When the model's knowledge was last updated.

Unknown

API Providers

The providers that offer this model. This is not an exhaustive list.

xAI API, OpenAI API

Modalities

Types of data this model can process.

Text Code

What is Grok 3 Mini

A fuller summary of positioning, capabilities, and source-specific details for Grok 3 Mini.

Grok 3 Mini Beta is a compact text generation model developed by xAI, the AI division of X. It is designed as a thinking model, meaning it reasons through problems step by step before producing a final answer, and it exposes that reasoning trace so users can follow the model's logic in full. The model supports adjustable reasoning effort, defaulting to a lower setting for speed but allowing a high-effort mode for more demanding problems. It has a 131,072-token context window and was trained with data up to April 2025.

Grok 3 Mini is best suited for tasks that rely heavily on structured reasoning rather than broad world knowledge — including math problems, logic puzzles, coding challenges, and quantitative analysis. According to xAI's published benchmarks, it scores 95.8% on AIME 2024 and 80.4% on LiveCodeBench. It also supports function calling and web search, making it usable in agentic workflows. Epoch AI has noted that with high reasoning effort, Grok 3 Mini outperforms the larger Grok 3 model on math benchmarks.

Capabilities

What Grok 3 Mini supports

RN

Step-by-Step Reasoning

The model works through problems before responding and exposes its full thinking trace, letting users follow each reasoning step to the final answer.

RN

Adjustable Reasoning Effort

Reasoning depth can be set to low or high effort via a simple parameter, trading speed for thoroughness depending on problem complexity.

RN

Math and Quantitative Reasoning

Achieves 95.8% on AIME 2024 and 80.4% on LiveCodeBench, reflecting strong performance on structured mathematical and coding problems.

AI

Function Calling

Supports tool use via function calling, enabling integration into agentic workflows that require the model to invoke external functions.

AI

Web Search

Can perform web search as a tool, allowing the model to retrieve current information during a reasoning session.

CTX

Long Context Window

Supports a 131,072-token context window, accommodating long documents, multi-turn conversations, or extended reasoning chains in a single session.

Pricing for Grok 3 Mini

Primary API pricing shown in the same “quick compare” spirit as the reference page.

Price Comparison

Additional usage-cost dimensions synced into the project for this model.

maxTemperature 1
maxResponseSize 8,192 tokens

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

xAI API OpenAI API

Model Performance

Benchmark scores synced from the current model source and normalized into the local catalog.

Benchmark Score
AIME 2024
American math olympiad problems
93.3%
GPQA Diamond
PhD-level science questions (biology, physics, chemistry)
79.1%
HLE
Questions that challenge frontier models across many domains
11.1%
LiveCodeBench
Real-world coding tasks from recent competitions
69.6%
MATH-500
Undergraduate and competition-level math problems
99.2%
MMLU-Pro
Expert knowledge across 14 academic disciplines
82.8%
SciCode
Scientific research coding and numerical methods
40.6%

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

Community discussion

What people think about Grok 3 Mini

Grok 3 Mini discussions are most active in r/singularity, r/grok, r/accelerate. Top Reddit threads cluster around benchmark and model-comparison threads, coding workflow discussions.

The strongest match in this snapshot has 234 upvotes and 322 comments.

r/singularity 234 upvotes 322 comments February 18, 2025
Grok 3 mini for Free

Source : https://x.com/archanfel_anoth/status/1891742175009865811?t=NmJ66xTh2Sh4CxVqIutCvg&s=19

Open Reddit thread
r/accelerate 44 upvotes 153 comments February 18, 2025
People are seriously downplaying the performance of Grok 3

I know we all have ill feelings about Elon, but can we seriously not take one second to validates its performance objectively.

People are like "Well, it is still worse than o3", we do not have access to that yet, it uses insane amounts of compute, and the pre-training only stopped a month ago, there is still much much potential to train the thinking models to exceed o3. Then there is "Well, it uses 10-15x more compute, and it is barely an improvement, so it is actually not impressive at all". This is untrue for three reason.
Firstly Grok-3 is definitely a big step up from Grok 2.
Secondly scaling has always been very compute-intensive, there is a reason that intelligence had not been a winning evolutionary trait for a long time and still is. It is expensive. If we could predictably get performance improvements like this for every 10-15x scaling in compute, then we would have Superintelligence in no time, especially considering how now three scaling paradigms stack on top of each other: Pre-Training, Post-Training and RL, inference-time-compute.
Thirdly if you look at the LLaMA paper in 54 days of training with 16000 H100, they had 419 component failures, and the small XAI team is training on 100-200 thousands \~h100's for much longer. This is actually quite an achievement.

Then people are also like "Well, GPT-4.5 will easily destroy this any moment now". Maybe, but I would not be so sure. The base Grok 3 performance is honestly ludicrous and people are seriously downplaying it.

https://preview.redd.it/yja5hpeg0xje1.png?width=1080&format=png&auto=webp&s=5e61509ec39c45f7a8e6a610fd4724879d8825d8

When Grok 3 is compared to other base models, it is waay ahead of the pack. People got to remember the difference between the old and new Claude 3.5 sonnet was only 5 points in GPQA, and this is 10 points ahead of Claude 3.5 Sonnet New. You also got to consider the controversial maximum of GPQA Diamond is 80-85 percent, so a non-thinking model is getting close to saturation. Then there is Gemini-2 Pro. Google released this just recently, and they are seriously struggling getting any increase in frontier performance on base-models. Then Grok 3 just comes along and pushes the frontier ahead by many points.

I feel like a part of why the insane performance of Grok 3 is not validated more is because of thinking models. Before thinking models performance increases like this would be absolutely astonishing, but now everybody is just meh. I also would not count out Grok 3 thinking model getting ahead of o3, given its great performance gains, while still being in really early development.

https://preview.redd.it/ldo63vxx0xje1.png?width=2560&format=png&auto=webp&s=3d7961974a10243c1b218af6cc1262659ab30b7a

The grok 3 mini base model is approximately on par with all the other leading base-models, and you can see its reasoning version actually beating Grok-3, and more importantly the performance is actually not too far off o3. o3 still has a couple of months till it gets released, and in the mean time we can definitely expect grok-3 reasoning to improve a fair bit, possibly even beating it.

Maybe I'm just overestimating its performance, but I remember when I tried the new sonnet 3.5, and even though a lot of its performance gains where modest, it really made a difference, and was/is really good. Grok 3 is an even more substantial jump than that, and none of the other labs have created such a strong base-model, Google is especially struggling with further base-model performance gains. I honestly think this seems like a pretty big achievement.

Elon is a piece of shit, but I thought this at least deserved some recognition, not all people on the XAI team are necessarily bad people, even though it would be better if they moved to other companies. Nevertheless this should at least push the other labs forward in releasing there frontier-capabilities so it is gonna get really interesting!

Open Reddit thread

First independent evaluations of Grok 3 suggests it is a very good non-reasoner model, but behind the major reasoners. Grok 3 mini, which is a reasoner, is a solid competitor in the space.

That Google Gemini 2.5 benchmark, though.

link to the tweet [https://x.com/EpochAIResearch/status/1910685268157276631](https://x.com/EpochAIResearch/status/1910685268157276631)

Open Reddit thread
r/OpenAI 117 upvotes 93 comments April 19, 2025
Grok 3 mini Reasoning enters the room

It's a real model thunderstorm these days! Cheaper than DeepSeek. Smarter at coding and math than 3.7 Sonnet, only slightly behind Gemini 2.5 Pro and o4-mini (o3 evaluation not yet included).

Open Reddit thread
r/cursor 134 upvotes 67 comments April 10, 2025
Grok 3 and Grok 3 Mini now available

We've added Grok 3 and Grok 3 Mini to Cursor!

Both models support Agent mode:

* **Grok 3**: Premium model
* **Grok 3 Mini**: Currently free for all users (will announce pricing changes beforehand)

To enable them, go to **Cursor Settings → Models**.

* [Cursor docs: Models](https://docs.cursor.com/settings/models)
* [Grok 3 docs](https://x.ai/api#pricing)

Give them a try and let us know what you think!

Open Reddit thread
View more discussions →
FAQ

Common questions about Grok 3 Mini

What is the context window for Grok 3 Mini?

Grok 3 Mini Beta supports a context window of 131,072 tokens, which can accommodate long documents, extended conversations, or lengthy reasoning chains in a single request.

What is the knowledge cutoff date for Grok 3 Mini?

Based on the model metadata, Grok 3 Mini Beta has a training data cutoff of April 2025.

How does the reasoning effort setting work?

Grok 3 Mini defaults to a lower reasoning effort for faster responses. You can set it to high effort for more complex problems, which causes the model to spend more time working through its thinking trace before producing an answer.

Can Grok 3 Mini be used in agentic or tool-use workflows?

Yes. Grok 3 Mini supports function calling and web search, making it compatible with agentic workflows where the model needs to invoke external tools or retrieve live information.

What types of tasks is Grok 3 Mini best suited for?

Grok 3 Mini is designed for tasks that require structured reasoning, such as math problems, logic puzzles, coding challenges, and quantitative analysis. It is less optimized for tasks requiring broad real-world or factual knowledge.

More models from X.ai

Continue browsing adjacent models from the same provider.

← All AI Models