Perplexity

Sonar

Sonar is Perplexity AI's in-house text generation model, built on Meta's Llama 3.3 70B and optimized for web-grounded question answering. Released in January 2025, it retrieves live internet data at query time rather than relying solely on static training knowledge, and every response includes inline source citations for transparency. It supports a 128,000-token context window and runs at approximately 121 tokens per second using Cerebras wafer-scale inference. Sonar is designed for developers and businesses that need to embed fast, factual, and source-backed search capabilities into their own applications. It offers three search depth modes — High, Medium, and Low — allowing teams to balance thoroughness against response speed depending on their use case. On the SimpleQA benchmark, Sonar achieved an F-score of 0.773, reflecting its focus on factual accuracy. It is particularly well-suited for high-volume applications such as sales research tools, medical information platforms, and real-time in-meeting search features.

Jan 27, 2025 128,000 context 32,768 tokens output
Real-Time Web Search Inline Source Citations 128K Token Context High-Speed Inference Adjustable Search Depth API Integration

Model Overview

High-signal model metadata in a structured two-column overview table.

Provider

The entity that provides this model.

Perplexity

Model ID

The routed model identifier exposed by upstream providers.

perplexity/sonar

Input Context Window

The number of tokens supported by the input context window.

128,000 tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

32,768 tokens tokens

Open Source

Whether the model's code is available for public use.

No

Release Date

When the model was first released.

Jan 27, 2025 1 year ago

Knowledge Cut-off Date

When the model's knowledge was last updated.

Unknown

API Providers

The providers that offer this model. This is not an exhaustive list.

Perplexity

Modalities

Types of data this model can process.

Text Image

What is Sonar

A fuller summary of positioning, capabilities, and source-specific details for Sonar.

Sonar is Perplexity AI's in-house text generation model, built on Meta's Llama 3.3 70B and optimized for web-grounded question answering. Released in January 2025, it retrieves live internet data at query time rather than relying solely on static training knowledge, and every response includes inline source citations for transparency. It supports a 128,000-token context window and runs at approximately 121 tokens per second using Cerebras wafer-scale inference.

Sonar is designed for developers and businesses that need to embed fast, factual, and source-backed search capabilities into their own applications. It offers three search depth modes — High, Medium, and Low — allowing teams to balance thoroughness against response speed depending on their use case. On the SimpleQA benchmark, Sonar achieved an F-score of 0.773, reflecting its focus on factual accuracy. It is particularly well-suited for high-volume applications such as sales research tools, medical information platforms, and real-time in-meeting search features.

Capabilities

What Sonar supports

AI

Real-Time Web Search

Grounds every response in live internet data retrieved at query time, rather than relying on static training knowledge alone.

AI

Inline Source Citations

Automatically includes inline citations with each answer, linking responses directly to their source URLs for verifiability.

CTX

128K Token Context

Supports a 128,000-token context window, enabling extended conversations and analysis of long documents within a single request.

AI

High-Speed Inference

Achieves approximately 121 tokens per second using Cerebras wafer-scale inference, enabling sub-second response times for high-volume workloads.

AI

Adjustable Search Depth

Offers High, Medium, and Low search depth modes so developers can tune the balance between answer thoroughness and response latency.

API

API Integration

Available via the Sonar API, allowing developers to embed generative search directly into their own products without building retrieval infrastructure.

Pricing for Sonar

Primary API pricing shown in the same “quick compare” spirit as the reference page.

Price Comparison

Additional usage-cost dimensions synced into the project for this model.

Web search $5000.00
maxTemperature 1.9
maxResponseSize 32,768 tokens

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

Perplexity

Provider Endpoints

Endpoint-level provider data currently available for this model.

Perplexity

1d uptime: 100.0% Supported params: 7 Implicit caching: No

Configuration & Parameters

The configurable options currently documented for this model.

Return Citations

Select

Determines whether or not a request to an online model should return citations.

Default: false
No Yes

Return Images

Select

Determines whether or not a request to an online model should return images.

Default: false
No Yes

Search Context Size

Select

Controls how much web information is retrieved. Higher context provides more comprehensive results but costs more per request.

Default: low
Low (Fastest, cheapest) Medium (Balanced) High (Best for research)

Supported Request Parameters

Parameters currently listed by OpenRouter or the local catalog for this model.

Return Citations Return Images Search Context Size

Model Performance

Benchmark scores synced from the current model source and normalized into the local catalog.

Benchmark Score
AIME 2024
American math olympiad problems
48.7%
GPQA Diamond
PhD-level science questions (biology, physics, chemistry)
47.1%
HLE
Questions that challenge frontier models across many domains
7.3%
LiveCodeBench
Real-world coding tasks from recent competitions
29.5%
MATH-500
Undergraduate and competition-level math problems
81.7%
MMLU-Pro
Expert knowledge across 14 academic disciplines
68.9%
SciCode
Scientific research coding and numerical methods
22.9%

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

Community discussion

What people think about Sonar

Sonar discussions are most active in r/DispatchAdHoc, r/todayilearned, r/worldnews. Top Reddit threads cluster around benchmark and model-comparison threads.

The strongest match in this snapshot has 74036 upvotes and 375 comments.

r/DispatchAdHoc 5,915 upvotes 663 comments February 22, 2026
Hot take: Coupé is better to keep over Sonar

Here are my arguments:

* **Thematically, it makes sense:** Coupé was one of the first Z-Teamers you actually meet, and she's very friendly to Robert; it doesn't really make sense to me that Robert would fire one of the first people he's friendly towards
* **It makes more sense to me that Sonar would join the Red Ring:** He's a far-right nutjob who takes cult-like fandom seriously. I could see him falling in line with Shroud's manipulation and money over Coupé, who has a level of professionalism with how she does things
* **Fight scene looks cooler:** Giant bat boss, what's not to like about it?
* **Coupé scene mocking Blonde Blazer is weird:** IDK, hearing "Ok mom, she's bluffing" from Coupé feels awkward, she has goofy moments, but she's still a serious character. Then you have Sonar, a goofy character that blows a raspberry at Blazer, which I feel fits way better.

IDK, just my personal take.

Open Reddit thread
r/steelseries 20 upvotes 50 comments November 21, 2025
Sick of SteelSeries Sonar, what other brands have more stable SOFTWARE?

I just want a quality headset with quality SOFTWARE. Which company (Logitech, Razer, HyperX etc) have the best software? And by "best" I mean simple, RELIABLE. I just want sound and mic for gaming, not too much to ask! (when not on the headset I have AudioEngine A5+ speakers)

I have an Arctis 7 headset which worked perfectly for years with the SteelSeries 'Engine' software, until the Sonar software turned up. Now it just seems to be a nightmare - constantly having to go in to the settings because it's not detecting my headset, chatmix balance is bugged, and other miscellaneous stuff. I'm on a Windows 11 PC with modern hardware.

Open Reddit thread
r/DispatchAdHoc 22 upvotes 13 comments November 28, 2025
Sonar had a uniquely hard life compared to most of Z-Team

Think about growing up as a bat from the neck up. Yeah, people with powers seem to be generally accepted in the Dispatch world, but Sonar didn’t have privilege of looking like a normal human like most of the Z-team (sans Golem/Malevola). Even if he wasn’t born with his powers and got them around puberty, I can’t imagine he didn’t get shit from his peers over his appearance. Not to mention occasionally turning full bat monster. And maybe that led him to use his smarts as a defense mechanism and turn to drugs to cope.

Open Reddit thread
View more discussions →
FAQ

Common questions about Sonar

What is the context window size for Sonar?

Sonar supports a context window of 128,000 tokens, which allows for extended conversations and analysis of lengthy documents in a single request.

Does Sonar have a knowledge cutoff date?

Sonar retrieves live web data at query time, so its answers are not limited to a static training cutoff. The model itself was launched in January 2025, and its underlying Llama 3.3 70B base has its own training data cutoff, but real-time search supplements this with current information.

How is Sonar priced?

Pricing details for Sonar via the Sonar API are available on Perplexity's official API overview page at sonar.perplexity.ai. Sonar is also used to power Perplexity's free consumer tier.

What model is Sonar built on?

Sonar is built on Meta's Llama 3.3 70B and has been optimized by Perplexity AI for web-grounded, real-time question answering with citation support.

How accurate is Sonar on factual questions?

On the SimpleQA benchmark, which tests factual accuracy in language models, Sonar achieved an F-score of 0.773.

More models from Perplexity

Continue browsing adjacent models from the same provider.

← All AI Models