Gemini 3 Pro is a multimodal text generation model developed by Google, released in November 2025. It supports a context window of 1,048,576 tokens and is designed to handle complex reasoning tasks, nuanced instruction following, and agentic workflows. The model is available to developers through Google AI Studio and Vertex AI, and is also integrated into Google Search and the Gemini app. Gemini 3 Pro is built for tasks that require understanding context and intent with minimal prompting, including multi-step problem solving, code generation, and multimodal input processing. It is positioned as Google's primary model for agentic development, including use within the Google Antigravity platform. The model accepts tool inputs alongside text and numeric parameters, making it suited for applications that require dynamic tool use and structured interactions.
Processes up to 1,048,576 tokens in a single request, enabling analysis of long documents, codebases, or extended conversation histories without truncation.
Applies multi-step reasoning to complex problems, designed to parse layered or ambiguous inputs and infer intent with reduced prompting.
Accepts text and image inputs together, allowing the model to interpret visual content alongside written instructions in a single request.
Supports tool-calling inputs natively, enabling integration with external APIs, functions, and agentic workflows through structured tool definitions.
Designed for multi-step agentic tasks, including autonomous planning and execution sequences used in platforms like Google Antigravity.
Generates, explains, and debugs code across multiple programming languages, with particular emphasis on interactive and vibe-coding use cases.
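As a rough illustration of what a structured tool definition can look like, the sketch below pairs a JSON-Schema-style function declaration with a small local dispatcher. The `get_weather` tool, its fields, and the shape of the `function_call` dictionary are hypothetical placeholders rather than anything taken from Google's documentation, so the exact format should be checked against the official API reference.

```python
import json

# Hypothetical tool declaration in a JSON-Schema-like shape (an assumption,
# not copied from any official schema).
get_weather_declaration = {
    "name": "get_weather",
    "description": "Return a short forecast for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(city: str) -> dict:
    """Stub implementation; a real tool would call an external API here."""
    return {"city": city, "forecast": "sunny"}


# Maps tool names to the local functions that implement them.
REGISTRY = {"get_weather": get_weather}


def dispatch(function_call: dict) -> dict:
    """Route a model-issued call of the assumed form {"name": ..., "args": {...}}."""
    handler = REGISTRY[function_call["name"]]
    return handler(**function_call.get("args", {}))


result = dispatch({"name": "get_weather", "args": {"city": "Sydney"}})
print(json.dumps(result))
```

The declaration is what gets sent alongside the prompt; the dispatcher is what the application runs when the model asks for the tool.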
Benchmark scores synced from the current model source and normalized into the local catalog.
| Benchmark | Description | Score |
|---|---|---|
| AIME 2025 | American math olympiad problems (2025) | |
| ARC-AGI-2 | Novel abstract reasoning and pattern recognition | |
| GPQA Diamond | PhD-level science questions (biology, physics, chemistry) | |
| HLE | Questions that challenge frontier models across many domains | |
| LiveCodeBench | Real-world coding tasks from recent competitions | |
| MMLU-Pro | Expert knowledge across 14 academic disciplines | |
| MMMLU | Multilingual and multimodal understanding | |
| SciCode | Scientific research coding and numerical methods | |
| SWE-bench Verified | Real GitHub issues requiring multi-file code fixes | |
Gemini 3 Deprecated discussions are most active in r/GeminiAI, r/Bard, and r/GeminiFeedback.
Top Reddit threads cluster around benchmarks and model comparisons, safety and censorship questions, and coding workflow discussions. The strongest match in this snapshot has 267 upvotes and 62 comments.
TL;DR: Gemini's web search is fundamentally broken: it only sees snippets and can't read actual webpage content the way every other LLM provider's models can. Deep Research has the same limitation, and it also ignores instructions, forcing academic-style essays regardless of what you ask for. The model searches poorly (overly specific queries), uses rigid planning based on outdated internal knowledge, and provides zero visibility into its search process. Simple architectural fixes exist, but Google hasn't implemented them.
Gemini has by far the worst web search functionality of ANY LLM provider.
Both on the web app and when "Grounding with Google Search" is enabled within AI Studio or API, the model gets access to a tool called `google:search`. You'd think that with access to a world-class search engine, the model would be able to comprehensively investigate a topic, but that's far from reality.
The Google search integration is a complete mess that actively sabotages Gemini by choking it with a bunch of snippets instead of letting it read actual content like every other LLM provider on the planet.
Here's an example of what the tool gives the model when it searches for "platypus facts":
```
[SearchResults(query="platypus facts", results=[PerQueryResult(index='1.1', snippet='9 Interesting <b>platypus facts</b> | WWF Australia: (2024-04-10) 1. Platypuses are venomous. They might look cute and cuddly but come across a male platypus in mating season and you\'ll be in for a painful shock.\n...\n(2024-04-10) The platypus is an iconic Australian mammal...', source_title='wwf.org.au', url='https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHNg7PSLRYuSjOBOD9c_cflXpDFWHSjp8JT9sk-l0RvBihzxPrHShhqA_cU5X-gkNVpzMQEkdFCDRmot6RbYTVXPA1ssJoLketh0wResHmnhF8KI5CT_xUN-Zf6WX29WRFDkPlDjNV_6-uZs0cU3wVO'), PerQueryResult(...
```
First, I don't agree with giving the model a structured output for a request for inherently unstructured data. Second, it makes no sense to have HTML tags like `<b></b>` within the response; the model speaks Markdown, not HTML, so why give it pseudo-HTML?
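To show how little work that cleanup would take, here is a minimal sketch that turns the `<b>` highlighting in a snippet into Markdown emphasis. The function name `snippet_to_markdown` is my own, and it only handles simple inline tags; it is an illustration, not how the search tool actually post-processes results.

```python
import html
import re


def snippet_to_markdown(snippet: str) -> str:
    """Convert simple inline HTML emphasis in a search snippet to Markdown."""
    # Bold tags become Markdown strong emphasis.
    text = re.sub(r"</?(?:b|strong)>", "**", snippet, flags=re.IGNORECASE)
    # Italic tags become Markdown emphasis.
    text = re.sub(r"</?(?:i|em)>", "*", text, flags=re.IGNORECASE)
    # Drop any remaining tags (anything that starts with a letter after '<').
    text = re.sub(r"</?[a-zA-Z][^>]*>", "", text)
    # Decode entities such as &amp; into plain characters.
    return html.unescape(text)


print(snippet_to_markdown("9 Interesting <b>platypus facts</b> | WWF Australia"))
```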
But the most glaring issue is that the model is kneecapped in the sense that it CANNOT open a specific website it gets from its search query to read its content beyond the snippet it's given. This is fine for basic queries, but for multi-step research it renders the model incapable of investigating something thoroughly. For example, if you ask it for the schema of a specific API it doesn't have in its knowledge, it can search for that API, but much of it will be omitted from the snippets. Since it can't actually read the website, the only way to ascertain the rest of the schema is guesswork.
For reference, OpenAI feeds its models something like this:
```
Horses on Venus: Myth, Mirage, or Meteorology? (https://www.interplanetary-equestrians.org/horses-on-venus)
[wordlim: 120] Published: 2 days ago; The idea of "Venusian horses" began as a misinterpretation of atmospheric radio echoes recorded by early orbiters...
Imaginary Creatures of the Inner Solar System (https://galacticfieldguide.example/venus/imaginary-horses)
[wordlim: 200] Content type: text/html; 14 Feb 2022 — In speculative xenobiology, "horses on Venus" are often depicted as translucent, buoyant organisms...
```
Notice how it returns semi-structured text rather than a rigid schema?
OpenAI also gives its models the capability to open a specific link, which will be parsed and returned back to the model in a Markdown-ish format.
To compound all of this, you're literally unable to see anything related to the model's search queries on the Gemini web app, and in the API you're only able to see a list of search queries used _after_ the response is complete. You have no visibility into where the query took place within its chain-of-thought, which is crucial when you're trying to determine the comprehensiveness of the model's search efforts. For example: "Did the model search for XYZ, find only half the picture, then search for the other half? Or did the model just search the web to tick a box and return half-assed results?"
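To make the visibility complaint concrete, here is a minimal sketch of pulling the query list out of a grounded response. It assumes the response JSON carries the queries under `candidates[].groundingMetadata.webSearchQueries`; treat those field names and the sample payload as assumptions to verify against the API reference.

```python
def web_search_queries(response: dict) -> list[str]:
    """Collect the search queries reported in each candidate's grounding metadata."""
    queries: list[str] = []
    for candidate in response.get("candidates", []):
        # Field names below are assumed from the REST response shape.
        metadata = candidate.get("groundingMetadata") or {}
        queries.extend(metadata.get("webSearchQueries", []))
    return queries


# Hypothetical payload, trimmed to the fields this sketch reads.
sample_response = {
    "candidates": [
        {
            "groundingMetadata": {
                "webSearchQueries": [
                    "latest google gemini model november 2025",
                    "Gemini 2.0 release date rumors November 2025",
                ]
            }
        }
    ]
}

for query in web_search_queries(sample_response):
    print(query)
```

All this yields is a flat list once the response is done; nothing in it says at which point in the reasoning each query was issued, which is exactly the gap.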
To top it all off, the model _clearly_ was not fine-tuned with effective web searches in mind. For such a large model, its extreme tendency to rely on internal knowledge when faced with a task clearly focused on recency is just baffling.
For example, when I asked it "what is the latest gemini model?", it searched for "latest google gemini model november 2025", "Gemini 2.0 release date rumors November 2025", "Gemini 1.5 Pro updates November 2025". We can all see the issue here: it completely jumps the gun by running targeted queries rather than broad ones for time-sensitive questions.
In fact, this applies to Gemini in so many other areas. For example, in agentic coding, it's extremely eager and will completely refactor your codebase despite instructions to only modify a single file.
A model like GPT-5.1, which clearly has had a better SFT/RLHF pipeline than Gemini 3 for tool calling, shows much more maturity: when I asked it the same question, it searched for "latest Google Gemini model November 2025" and '"announces" "Gemini" model October November 2025'.
You'd think that the Deep Research feature on the Gemini app would solve some of these pain points, but it doesn't _and_ brings so many of its own.
The Deep Research feature STILL uses the same shitty web search logic that only returns snippets, meaning it still has the same architectural limitation of not being able to read a specific website's contents. Therefore, the whole purpose of "deeper" research is completely negated because more snippet confetti ≠ better results.
Additionally, the system prompt for Deep Research is UTTER GARBAGE. I've never seen a system that so blatantly and repeatedly ignores instructions. If you tell it to organize the document a certain way, it just won't. Ask it, multiple times if you'd like, to not add an intro and conclusion to the document. The rest is better left unsaid.
Let's look at an example:
I asked the Deep Research feature (on Gemini 3 Pro) to give me a comprehensive technical specification for implementing an OpenAI API wrapper. I was extremely explicit: no intro or conclusion, just the implementation details. I needed JSON schemas, exact request/response examples, streaming formats, error handling, authentication headers, etc. I literally said "give me A LOT of JSON examples" and "this should be comprehensive enough to fully serve as a single source of truth to implement this interface with no external sources."
What did I get? A fucking thesis paper titled "The Architectural Evolution of Agentic Intelligence: A Deep Dive into the OpenAI Responses API" complete with an Executive Summary and a Conclusion section. It gave me exactly what I told it not to give me.
The entire document is full of this pretentious bullshit. It talks about an "inflection point" in AI development and the "burgeoning field of Agentic AI." It uses "ontology" to describe a basic API object model. "Locus of control." "Cognitively robust." "Heterogeneous Output Items." It describes how the API works as "Mechanism of Action" like it's a pharmaceutical drug. There's a section about "The Fragmentation of Multimodality" when all I needed was "here's how to send a PDF as inline data in a request." Another one called "Computer Use: The Frontier of Agency" that says absolutely nothing.
Where are the JSON examples? I asked for implementation details and got vague descriptions. It mentions structured outputs exist but doesn't show me a single actual request. It says there are different SSE event types for streaming but doesn't give me the shape of those events. It talks about encrypted reasoning but where's the actual parameter I need to set? I asked for exact authentication headers and base URLs. I got tables with headers like "The Taxonomy of Response Items" instead.
The whole thing is 90% fluff about why stateful APIs are important and 10% hand-waving at technical details. I can't implement anything from this. I asked for a production-ready spec and got nothing of use.
It researched 72 sources—it had to have more than enough material to give me what I asked for. All it had to do was distill that into actual implementation details I could use, but instead it decided to waste my time with garbage.
This isn't a one-off problem either. Every single prompt I give Deep Research comes back with the same academic paper structure. It doesn't matter how explicitly you tell it what you want. The system prompt clearly just forces it to write these pseudo-intellectual essays regardless of what you actually ask for.
The planning system is also utter trash and limits the model significantly. The model has a huge tendency to rely on its internal knowledge when creating research plans rather than approaching queries with appropriate uncertainty. When you ask about something recent, it will confidently scaffold out a plan based on what it knew before its training cutoff, filling in specific entity names, version numbers, and technical details that may have completely changed since then.
Say you ask about a niche API that got a major overhaul last month. Instead of planning "search broadly for the latest documentation, then investigate specific endpoints based on what's found," it will generate a plan like "look up the authentication flow for version 2.3, find the (deprecated) webhook format, investigate the (legacy) response structure." It's operating on stale assumptions and then executing that flawed plan with confidence, completely missing the actual current state of things because it never ran a broad query to begin with.
This rigidity compounds the problem because later research steps often depend on discoveries made in earlier ones. You need the flexibility to pivot when you find something unexpected. By locking the model into a predetermined sequence of specific searches, you're preventing it from adapting its approach based on what it actually finds.
The most frustrating part is that the model doesn't need this hand-holding. It's perfectly capable of doing adaptive, freeform research. OpenAI and Anthropic don't force their models through these rigid planning hoops because they trust the model to dynamically adjust its search strategy as it learns more (note: Anthropic kind of does this because they use subagents, but it's able to conduct preliminary research before spawning the parallel subagents).
Even if Google would like to keep this planning system, at least give the planning model the ability to conduct preliminary research so it has a _general_ idea of what it's about to investigate instead of formulating a single-source-of-truth plan with outdated knowledge.
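The difference between a fixed plan and adaptive research can be sketched in a few lines. This is a toy loop under stated assumptions: `search` and `propose_followups` are hypothetical callables standing in for the search tool and the model, and the "acme" data is made up for the example. It is not how Deep Research, or any other vendor's product, is implemented.

```python
from collections import deque
from typing import Callable


def adaptive_research(
    question: str,
    search: Callable[[str], str],
    propose_followups: Callable[[str, str], list[str]],
    max_steps: int = 5,
) -> list[tuple[str, str]]:
    """Start with one broad query, then let each result decide what to search next."""
    queue = deque([question])  # the first query is deliberately broad
    seen: set[str] = set()
    findings: list[tuple[str, str]] = []
    while queue and len(findings) < max_steps:
        query = queue.popleft()
        if query in seen:
            continue
        seen.add(query)
        result = search(query)
        findings.append((query, result))
        # Follow-ups are derived from what was found, not from a plan written up front.
        queue.extend(propose_followups(query, result))
    return findings


# Hypothetical stand-ins for a search backend and a model.
FAKE_INDEX = {
    "acme api docs": "Latest version is v3; webhooks were replaced by event streams.",
    "acme api v3 event streams": "Events are delivered over SSE.",
}


def fake_search(query: str) -> str:
    return FAKE_INDEX.get(query, "")


def fake_followups(query: str, result: str) -> list[str]:
    if query == "acme api docs" and "event streams" in result:
        return ["acme api v3 event streams"]
    return []


for query, result in adaptive_research("acme api docs", fake_search, fake_followups):
    print(f"{query} -> {result}")
```

A predetermined plan would have searched for the old webhook format; here the second query only exists because the first result mentioned event streams.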
After all, Gemini 3 is still a Preview model, so many of these tool-calling issues will likely be ironed out in the final release (this is Google's first "proper" model built for the world of agents). However, the web search limitation is a purely architectural limitation; this _desperately_ needs to get reworked:
- Allow the model to search and get web snippets, but **also** allow the model to retrieve the full Markdown content of a webpage — Google basically owns the internet, a simple webpage → Markdown conversion is not akin to boiling the ocean.
- Surface web search requests within API responses so it's easy to see _where_ in a model's reasoning trace it searched the web, and how many individual web calls it produced.
- Try and train out the model's tendency to launch hyper-specific queries on time-sensitive topics or niche topics; instead teach it to launch a preliminary, broad investigation before running targeted search queries.
- Allow us to add our own tools in tandem with the Google Search tool. Currently, the Google Search tool restricts the ability to add custom tools to requests, which is severely limiting.
- Completely overhaul the Deep Research system prompt: remove the requirement of an academic report and instead keep it as a default that will be overridden if specified by the user's prompt. Deep Research should _not_ be mandated to write reports; it should be seen as an agent with more in-depth search capabilities that can accomplish anything regular Gemini can do, just with more source-based backing.
- Completely overhaul the Deep Research planning phase: either a) allow the model to conduct preliminary research, b) explicitly instruct the model to not go into any specifics the user didn't explicitly provide in the research plan, or c) remove it completely; since Gemini doesn't employ a subagent-based approach for Deep Research, a plan is, by all accounts, unnecessary.
For me, the most important thing that needs to happen is that the model needs a dedicated tool to fetch the contents of a specific website. Gemini is the de facto "long context window" model; allowing it to fetch full websites will allow us to truly exploit this extremely impressive context window and coherence/recall strength.
---
The frustrating reality is that this isn't even hard to implement. I've personally built web search tools that allow models to genuinely search the web and read page content effectively. Solutions for HTML-to-Markdown conversion already exist (like [Turndown](https://github.com/mixmark-io/turndown) and [html-to-markdown-rs](https://crates.io/crates/html-to-markdown-rs)), and building a custom implementation for a company of Google's scale would be trivial.
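To back that up, here is a minimal sketch of an HTML-to-Markdown converter built only on Python's standard-library `html.parser`. The class and function names are my own, it handles just a handful of tags, and fetching the page is left out; the libraries linked above handle far more. It's an illustration of how small the core idea is, not a production tool.

```python
import re
from html.parser import HTMLParser


class MarkdownExtractor(HTMLParser):
    """Tiny converter: headings, paragraphs, list items, bold, links."""

    SKIP = {"script", "style", "nav", "footer"}  # content to drop entirely
    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []
        self.skip_depth = 0
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        if tag in self.HEADINGS:
            self.parts.append("\n\n" + self.HEADINGS[tag])
        elif tag == "p":
            self.parts.append("\n\n")
        elif tag == "li":
            self.parts.append("\n- ")
        elif tag in ("b", "strong"):
            self.parts.append("**")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth:
            return
        if tag in ("b", "strong"):
            self.parts.append("**")
        elif tag == "a":
            self.parts.append(f"]({self.href or ''})")
            self.href = None

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_markdown(html_text: str) -> str:
    """Return a Markdown-ish rendering of the given HTML string."""
    parser = MarkdownExtractor()
    parser.feed(html_text)
    parser.close()
    # Collapse runs of blank lines left behind by nested block tags.
    return re.sub(r"\n{3,}", "\n\n", "".join(parser.parts)).strip()


page = (
    "<html><head><style>p{}</style></head><body><h1>Platypus</h1>"
    '<p>They are <b>venomous</b>. See <a href="https://example.org">more</a>.</p>'
    "<script>x()</script></body></html>"
)
print(html_to_markdown(page))
```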
I hope to see these issues addressed soon.
At the beginning, Gemini 3 was still comparable to ChatGPT 5.2 Thinking. But now I feel like it is becoming much lazier: it tends to make up nonsense instead of actually searching for proof. Even when I put some really strict rules in a Gem to force it through a chain of thinking and searching, it is still very, very lazy (3 searches per question on average). It also sometimes hallucinates “I don’t have internet access”.
Today I was trying to debug a problem, and Gemini kept referring me to some old, deprecated GitHub page. I finally got tired of this BS, went to ChatGPT 5.2, and switched to extended thinking mode. It took ten minutes to give an answer, but it honestly went through dozens of websites and documents and succeeded in one try. To anyone who is “optimizing” Gemini: you are creating something that is really stupid.
Hello, I'm a customer of Firebase and have been working with Firebase Console and Studio for about 4 months since last December of 2024. I've spent over 10 hours a day on average developing my app and website, with great progress. However, when March 9th arrived, Gemini AI, the agent I thought was reliable (3 Pro Preview), went haywire and reverted files to a much older version.
Since then I haven't been able to restore the files to what they were; I've tried asking the agent to do so, to no avail. Later, I learned that Gemini 3 Pro Preview got deprecated. This is likely the reason why my website was changed significantly. It must've reverted back to an older version.
Firebase has failed to remove "Gemini 3 Pro Preview" from the Firebase Studio as of March 13, 2026. There hasn't been a warning or notice that it would get deprecated. Customers are under the impression that it's Gemini 3 Pro Preview, but I'm sure it isn't because this sort of AI behavior has only been happening with 2.5 Pro.
I'm going to be patient and hope they update soon... I've been looking at alternatives, but would prefer to stay with Firebase.
According to this page: https://ai.google.dev/gemini-api/docs/models#preview
Preview models are given at least two weeks' notice between deprecation and shutdown. However, the deprecation of Gemini 3 Pro Preview was announced on February 26 (https://ai.google.dev/gemini-api/docs/changelog), which is less than two weeks before the planned March 9 shutdown date (two weeks later would be March 12). My hope is that the shutdown could be delayed until at least March 12 in order to comply with the policy and to give users more time to migrate. Thank you.
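The date arithmetic behind that request can be checked with a short sketch. The year 2026 is an assumption here, since the post gives only the month and day.

```python
from datetime import date, timedelta

# Year is assumed; the post states only "February 26" and "March 9".
announced = date(2026, 2, 26)
planned_shutdown = date(2026, 3, 9)

# Days of notice actually given between announcement and shutdown.
notice_days = (planned_shutdown - announced).days

# Earliest shutdown date that would satisfy a two-week notice period.
earliest_shutdown = announced + timedelta(weeks=2)

print(notice_days)        # 11
print(earliest_shutdown)  # 2026-03-12
```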
Gemini 3 Pro supports a context window of 1,048,576 tokens, which allows it to process very long documents, extended conversations, or large codebases in a single request.
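A quick way to sanity-check whether a set of documents is likely to fit is sketched below. The 4-characters-per-token ratio is only a rough assumption, and the function names are illustrative; an actual tokenizer or the provider's token-counting endpoint should be used for real limits.

```python
GEMINI_3_PRO_CONTEXT_TOKENS = 1_048_576  # context window stated above


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; the default ratio is an assumption, not a measured value."""
    return int(len(text) / chars_per_token) + 1


def fits_in_context(texts: list[str], limit: int = GEMINI_3_PRO_CONTEXT_TOKENS) -> bool:
    """Return True if the estimated total token count stays within the limit."""
    return sum(estimate_tokens(t) for t in texts) <= limit


print(fits_in_context(["a" * 4_000]))      # small document
print(fits_in_context(["a" * 5_000_000]))  # roughly 1.25M estimated tokens
```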
Based on the available metadata, Gemini 3 Pro has a training date of November 2025. Specific knowledge cutoff details should be confirmed in the official Google AI documentation.
Gemini 3 Pro is available to developers through Google AI Studio and Vertex AI, as well as through MindStudio without requiring separate API key setup.
The model accepts select, number, and tools input types, making it compatible with structured tool-calling workflows in addition to standard text prompts.
Yes. Gemini 3 Pro is described by Google as their primary agentic model and is integrated into the Google Antigravity agentic development platform. It supports tool use natively, which is a key requirement for agentic task execution.