Google vs Google

Gemini 3 Deprecated vs Gemini 1.5 Flash Deprecated

Compare Gemini 3 Deprecated and Gemini 1.5 Flash Deprecated across pricing, context window, capabilities, benchmarks, and API access to choose the better fit for long-context workloads versus general-purpose AI workloads.

Overview Comparison

Structured side-by-side differences for the highest-signal model metadata.

Gemini 3 Deprecated
Gemini 1.5 Flash Deprecated

Provider

The entity that currently provides this model.

Gemini 3 Deprecated Google
Gemini 1.5 Flash Deprecated Google

Model ID

The routed model identifier exposed by upstream providers.

Gemini 3 Deprecated N/A
Gemini 1.5 Flash Deprecated N/A

Input Context Window

The number of tokens supported by the input context window.

Gemini 3 Deprecated 1,048,576 tokens
Gemini 1.5 Flash Deprecated N/A

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

Gemini 3 Deprecated 65,536 tokens
Gemini 1.5 Flash Deprecated 8,192 tokens

Open Source

Whether the model's code is available for public use.

Gemini 3 Deprecated No
Gemini 1.5 Flash Deprecated No

Release Date

When the model was first released.

Gemini 3 Deprecated Nov 18, 2025
Gemini 1.5 Flash Deprecated Unknown

Knowledge Cut-off Date

When the model's knowledge was last updated.

Gemini 3 Deprecated November 2025
Gemini 1.5 Flash Deprecated Unknown

API Providers

The providers that currently expose the model through an API.

Gemini 3 Deprecated Google, Vertex AI
Gemini 1.5 Flash Deprecated Google

Modalities

Types of data each model can process or return.

Gemini 3 Deprecated Text, Code
Gemini 1.5 Flash Deprecated Text
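
The context-window and output limits listed in the overview can be turned into a simple pre-flight check before sending a request. The sketch below is illustrative only: `ModelLimits`, `gemini3Limits`, and `fitsLimits` are hypothetical names, not part of any Google SDK, and it assumes you already have a token count for the prompt from elsewhere.

```typescript
// Hypothetical helper: checks a planned request against the limits listed above.
interface ModelLimits {
  inputContextTokens: number; // input context window, in tokens
  maxOutputTokens: number; // maximum tokens generated in a single request
}

// Values copied from the overview rows for Gemini 3 Deprecated.
const gemini3Limits: ModelLimits = {
  inputContextTokens: 1_048_576,
  maxOutputTokens: 65_536,
};

// Returns true only when both the prompt and the requested output fit.
function fitsLimits(
  limits: ModelLimits,
  promptTokens: number,
  requestedOutputTokens: number,
): boolean {
  return (
    promptTokens <= limits.inputContextTokens &&
    requestedOutputTokens <= limits.maxOutputTokens
  );
}

console.log(fitsLimits(gemini3Limits, 900_000, 8_192)); // true
console.log(fitsLimits(gemini3Limits, 900_000, 100_000)); // false: output exceeds 65,536
```

No limits object is defined for Gemini 1.5 Flash Deprecated here because its input context window is listed as N/A.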

Pricing Comparison

Compare current token pricing before you choose the cheaper or more scalable API option.

Gemini 3 Deprecated Google
Input price $2.00 Per 1M tokens
Output price $12.00 Per 1M tokens
Gemini 1.5 Flash Deprecated Google
Input price N/A Per 1M tokens
Output price N/A Per 1M tokens
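
As a rough illustration of how the per-1M-token prices above translate into a per-request cost, here is a minimal sketch. The names `TokenPricing`, `gemini3Pricing`, and `estimateCostUsd` are hypothetical, and the calculation assumes flat pricing with no tiering, caching discounts, or separately billed thinking tokens, none of which this page specifies.

```typescript
// Hypothetical helper: estimates request cost from the listed per-1M-token prices.
interface TokenPricing {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

// Prices copied from the pricing rows for Gemini 3 Deprecated.
const gemini3Pricing: TokenPricing = {
  inputPerMillionUsd: 2.0,
  outputPerMillionUsd: 12.0,
};

const TOKENS_PER_MILLION = 1_000_000;

// Assumes flat pricing: cost scales linearly with token counts.
function estimateCostUsd(
  pricing: TokenPricing,
  inputTokens: number,
  outputTokens: number,
): number {
  const inputCost = (inputTokens / TOKENS_PER_MILLION) * pricing.inputPerMillionUsd;
  const outputCost = (outputTokens / TOKENS_PER_MILLION) * pricing.outputPerMillionUsd;
  return inputCost + outputCost;
}

// 500,000 input tokens ($1.00) plus 250,000 output tokens ($3.00).
console.log(estimateCostUsd(gemini3Pricing, 500_000, 250_000).toFixed(2)); // "4.00"
```

Because output tokens are priced six times higher than input tokens here, long generations tend to dominate the bill even when the prompt is large.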

Capabilities Comparison

See where each model overlaps, where they differ, and which one supports more of the features you care about.

Capability
Gemini 3 Deprecated
Gemini 1.5 Flash Deprecated
Advanced Reasoning Applies multi-step reasoning to complex problems, designed to parse layered or ambiguous inputs and infer intent with reduced prompting.
Gemini 3 Deprecated Supported
Gemini 1.5 Flash Deprecated
Agentic Task Execution Designed for multi-step agentic tasks, including autonomous planning and execution sequences used in platforms like Google Antigravity.
Gemini 3 Deprecated Supported
Gemini 1.5 Flash Deprecated
Code Generation Generates, explains, and debugs code across multiple programming languages, with particular emphasis on interactive and vibe-coding use cases.
Gemini 3 Deprecated Supported
Gemini 1.5 Flash Deprecated
Large Context Window Processes up to 1,048,576 tokens in a single request, enabling analysis of long documents, codebases, or extended conversation histories without truncation.
Gemini 3 Deprecated Supported
Gemini 1.5 Flash Deprecated
Multimodal Input Accepts text and image inputs together, allowing the model to interpret visual content alongside written instructions in a single request.
Gemini 3 Deprecated Supported
Gemini 1.5 Flash Deprecated
Text
Gemini 3 Deprecated Supported
Gemini 1.5 Flash Deprecated Supported
Tool Use Supports tool-calling inputs natively, enabling integration with external APIs, functions, and agentic workflows through structured tool definitions.
Gemini 3 Deprecated Supported
Gemini 1.5 Flash Deprecated

Benchmark Comparison

Shared benchmark rows make it easier to compare performance where both models have published scores.

Benchmark Gemini 3 Deprecated Gemini 1.5 Flash Deprecated
AIME 2025
Problems from the 2025 American Invitational Mathematics Examination
Gemini 3 Deprecated 95.0%
Gemini 1.5 Flash Deprecated N/A
ARC-AGI-2
Novel abstract reasoning and pattern recognition
Gemini 3 Deprecated 31.1%
Gemini 1.5 Flash Deprecated N/A
GPQA Diamond
PhD-level science questions (biology, physics, chemistry)
Gemini 3 Deprecated 90.8%
Gemini 1.5 Flash Deprecated N/A
HLE
Questions that challenge frontier models across many domains
Gemini 3 Deprecated 37.2%
Gemini 1.5 Flash Deprecated N/A
LiveCodeBench
Real-world coding tasks from recent competitions
Gemini 3 Deprecated 91.7%
Gemini 1.5 Flash Deprecated N/A
MMLU-Pro
Expert knowledge across 14 academic disciplines
Gemini 3 Deprecated 89.8%
Gemini 1.5 Flash Deprecated N/A
MMMLU
Multilingual knowledge and understanding (MMLU questions translated into multiple languages)
Gemini 3 Deprecated 91.8%
Gemini 1.5 Flash Deprecated N/A
SciCode
Scientific research coding and numerical methods
Gemini 3 Deprecated 56.1%
Gemini 1.5 Flash Deprecated N/A
SWE-bench Verified
Real GitHub issues requiring multi-file code fixes
Gemini 3 Deprecated 76.2%
Gemini 1.5 Flash Deprecated N/A
Community discussion

What Reddit discussions say about Gemini 3 Deprecated vs Gemini 1.5 Flash Deprecated

Both Gemini 3 Deprecated and Gemini 1.5 Flash Deprecated are being discussed in active Reddit threads, which adds community context beyond specs and benchmarks.

The most visible threads right now are clustered in r/GeminiAI, r/Bard, and r/PromptEngineering.

Gemini 3 Deprecated r/PromptEngineering 267 upvotes 62 comments December 19, 2025
Google AI Studio Leaked System Prompt: 12/18/25

The system prompt accidentally leaked while I was using Google AI Studio. I was just using the app as usual with the new 3.0 flash model when it unexpectedly popped up.

The following is exactly how I copied it, with no edits.

EDIT:
I’m not sure whether this is a system prompt or just the instruction file used by the Gemini 3.0 Flash model in the Code Assistant feature of Google AI Studio, but either way, it’s not something that’s publicly available.

```
<instruction>
Act as a world-class senior frontend engineer with deep expertise in the Gemini API and UI/UX design. The user will ask you to change the current application. Do your best to satisfy their request.
General code structure
Current structure is an index.html and index.tsx with es6 module that is automatically imported by the index.html.
Treat the current directory as the project root (conceptually the "src/" folder); do not create a nested "src/" directory or prefix any file paths with src/.
As part of the user's prompt they will provide you with the content of all of the existing files.
If the user is asking you a question, respond with natural language. If the user is asking you to make changes to the app, you should satisfy their request by updating
the app's code. Keep updates as minimal as you can while satisfying the user's request. To update files, you must output the following
XML
[full_path_of_file_1]
check_circle
[full_path_of_file_2]
check_circle
ONLY return the xml in the above format, DO NOT ADD any more explanation. Only return files in the XML that need to be updated. Assume that if you do not provide a file it will not be changed.
If your app needs to use the camera, microphone or geolocation, add them to metadata.json like so:
code
JSON
{
"requestFramePermissions": [
"camera",
"microphone",
"geolocation"
]
}
Only add permissions you need.
== Quality
Ensure offline functionality, responsiveness, accessibility (use ARIA attributes), and cross-browser compatibility.
Prioritize clean, readable, well-organized, and performant code.
@google/genai Coding Guidelines
This library is sometimes called:
Google Gemini API
Google GenAI API
Google GenAI SDK
Gemini API
@google/genai
The Google GenAI SDK can be used to call Gemini models.
Do not use or import the types below from @google/genai; these are deprecated APIs and no longer work.
Incorrect GoogleGenerativeAI
Incorrect google.generativeai
Incorrect models.create
Incorrect ai.models.create
Incorrect models.getGenerativeModel
Incorrect genAI.getGenerativeModel
Incorrect ai.models.getModel
Incorrect ai.models['model_name']
Incorrect generationConfig
Incorrect GoogleGenAIError
Incorrect GenerateContentResult; Correct GenerateContentResponse.
Incorrect GenerateContentRequest; Correct GenerateContentParameters.
Incorrect SchemaType; Correct Type.
When using generate content for text answers, do not define the model first and call generate content later. You must use ai.models.generateContent to query GenAI with both the model name and prompt.
Initialization
Always use const ai = new GoogleGenAI({apiKey: process.env.API_KEY});.
Incorrect const ai = new GoogleGenAI(process.env.API_KEY); // Must use a named parameter.
API Key
The API key must be obtained exclusively from the environment variable process.env.API_KEY. Assume this variable is pre-configured, valid, and accessible in the execution context where the API client is initialized.
Use this process.env.API_KEY string directly when initializing the @google/genai client instance (must use new GoogleGenAI({ apiKey: process.env.API_KEY })).
Do not generate any UI elements (input fields, forms, prompts, configuration sections) or code snippets for entering or managing the API key. Do not define process.env or request that the user update the API_KEY in the code. The key's availability is handled externally and is a hard requirement. The application must not ask the user for it under any circumstances.
Model
If the user provides a full model name that includes hyphens, a version, and an optional date (e.g., gemini-2.5-flash-preview-09-2025 or gemini-3-pro-preview), use it directly.
If the user provides a common name or alias, use the following full model name.
gemini flash: 'gemini-flash-latest'
gemini lite or flash lite: 'gemini-flash-lite-latest'
gemini pro: 'gemini-3-pro-preview'
nano banana, or gemini flash image: 'gemini-2.5-flash-image'
nano banana 2, nano banana pro, or gemini pro image: 'gemini-3-pro-image-preview'
native audio or gemini flash audio: 'gemini-2.5-flash-native-audio-preview-09-2025'
gemini tts or gemini text-to-speech: 'gemini-2.5-flash-preview-tts'
Veo or Veo fast: 'veo-3.1-fast-generate-preview'
If the user does not specify any model, select the following model based on the task type.
Basic Text Tasks (e.g., summarization, proofreading, and simple Q&A): 'gemini-3-flash-preview'
Complex Text Tasks (e.g., advanced reasoning, coding, math, and STEM): 'gemini-3-pro-preview'
General Image Generation and Editing Tasks: 'gemini-2.5-flash-image'
High-Quality Image Generation and Editing Tasks (supports 1K, 2K, and 4K resolution): 'gemini-3-pro-image-preview'
High-Quality Video Generation Tasks: 'veo-3.1-generate-preview'
General Video Generation Tasks: 'veo-3.1-fast-generate-preview'
Real-time audio & video conversation tasks: 'gemini-2.5-flash-native-audio-preview-09-2025'
Text-to-speech tasks: 'gemini-2.5-flash-preview-tts'
MUST NOT use the following models:
'gemini-1.5-flash'
'gemini-1.5-flash-latest'
'gemini-1.5-pro'
'gemini-pro'
Import
Always use import {GoogleGenAI} from "@google/genai";.
Prohibited: import { GoogleGenerativeAI } from "@google/genai";
Prohibited: import type { GoogleGenAI} from "@google/genai";
Prohibited: declare var GoogleGenAI.
Generate Content
Generate a response from the model.
code
Ts
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-3-flash-preview',
contents: 'why is the sky blue?',
});

console.log(response.text);
Generate content with multiple parts, for example, by sending an image and a text prompt to the model.
code
Ts
import { GoogleGenAI, GenerateContentResponse } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const imagePart = {
inlineData: {
mimeType: 'image/png', // Could be any other IANA standard MIME type for the source data.
data: base64EncodeString, // base64 encoded string
},
};
const textPart = {
text: promptString // text prompt
};
const response: GenerateContentResponse = await ai.models.generateContent({
model: 'gemini-3-flash-preview',
contents: { parts: [imagePart, textPart] },
});
Extracting Text Output from GenerateContentResponse
When you use ai.models.generateContent, it returns a GenerateContentResponse object.
The simplest and most direct way to get the generated text content is by accessing the .text property on this object.
Correct Method
The GenerateContentResponse object features a text property (not a method, so do not call text()) that directly returns the string output.
Property definition:
code
Ts
export class GenerateContentResponse {
......

get text(): string | undefined {
// Returns the extracted string output.
}
}
Example:
code
Ts
import { Chat, GoogleGenAI, GenerateContentResponse } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response: GenerateContentResponse = await ai.models.generateContent({
model: 'gemini-3-flash-preview',
contents: 'why is the sky blue?',
});
const text = response.text; // Do not use response.text()
console.log(text);

const chat: Chat = ai.chats.create({
model: 'gemini-3-flash-preview',
});
let streamResponse = await chat.sendMessageStream({ message: "Tell me a story in 100 words." });
for await (const chunk of streamResponse) {
const c = chunk as GenerateContentResponse
console.log(c.text) // Do not use c.text()
}
Common Mistakes to Avoid
Incorrect: const text = response.text();
Incorrect: const text = response?.response?.text?;
Incorrect: const text = response?.response?.text();
Incorrect: const text = response?.response?.text?.()?.trim();
Incorrect: const json = response.candidates?.[0]?.content?.parts?.[0]?.json;
System Instruction and Other Model Configs
Generate a response with a system instruction and other model configs.
code
Ts
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response = await ai.models.generateContent({
model: "gemini-3-flash-preview",
contents: "Tell me a story.",
config: {
systemInstruction: "You are a storyteller for kids under 5 years old.",
topK: 64,
topP: 0.95,
temperature: 1,
responseMimeType: "application/json",
seed: 42,
},
});
console.log(response.text);
Max Output Tokens Config
maxOutputTokens: An optional config. It controls the maximum number of tokens the model can utilize for the request.
Recommendation: Avoid setting this if not required to prevent the response from being blocked due to reaching max tokens.
If you need to set it, you must set a smaller thinkingBudget to reserve tokens for the final output.
Correct Example for Setting maxOutputTokens and thinkingBudget Together
code
Ts
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response = await ai.models.generateContent({
model: "gemini-3-flash-preview",
contents: "Tell me a story.",
config: {
// The effective token limit for the response is `maxOutputTokens` minus the `thinkingBudget`.
// In this case: 200 - 100 = 100 tokens available for the final response.
// Set both maxOutputTokens and thinkingConfig.thinkingBudget at the same time.
maxOutputTokens: 200,
thinkingConfig: { thinkingBudget: 100 },
},
});
console.log(response.text);
Incorrect Example for Setting maxOutputTokens without thinkingBudget
code
Ts
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response = await ai.models.generateContent({
model: "gemini-3-flash-preview",
contents: "Tell me a story.",
config: {
// Problem: The response will be empty since all the tokens are consumed by thinking.
// Fix: Add `thinkingConfig: { thinkingBudget: 25 }` to limit thinking usage.
maxOutputTokens: 50,
},
});
console.log(response.text);
Thinking Config
The Thinking Config is only available for the Gemini 3 and 2.5 series models. Do not use it with other models.
The thinkingBudget parameter guides the model on the number of thinking tokens to use when generating a response.
A higher token count generally allows for more detailed reasoning, which can be beneficial for tackling more complex tasks.
The maximum thinking budget for 2.5 Pro is 32768, and for 2.5 Flash and Flash-Lite is 24576.
// Example code for max thinking budget.
code
Ts
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response = await ai.models.generateContent({
model: "gemini-3-pro-preview",
contents: "Write Python code for a web application that visualizes real-time stock market data",
config: { thinkingConfig: { thinkingBudget: 32768 } } // max budget for gemini-3-pro-preview
});
console.log(response.text);
If latency is more important, you can set a lower budget or disable thinking by setting thinkingBudget to 0.
// Example code for disabling thinking budget.
code
Ts
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response = await ai.models.generateContent({
model: "gemini-3-flash-preview",
contents: "Provide a list of 3 famous physicists and their key contributions",
config: { thinkingConfig: { thinkingBudget: 0 } } // disable thinking
});
console.log(response.text);
By default, you do not need to set thinkingBudget, as the model decides when and how much to think.
JSON Response
Ask the model to return a response in JSON format.
The recommended way is to configure a responseSchema for the expected output.
See the available types below that can be used in the responseSchema.
code
Code
export enum Type {
/**
* Not specified, should not be used.
*/
TYPE_UNSPECIFIED = 'TYPE_UNSPECIFIED',
/**
* OpenAPI string type
*/
STRING = 'STRING',
/**
* OpenAPI number type
*/
NUMBER = 'NUMBER',
/**
* OpenAPI integer type
*/
INTEGER = 'INTEGER',
/**
* OpenAPI boolean type
*/
BOOLEAN = 'BOOLEAN',
/**
* OpenAPI array type
*/
ARRAY = 'ARRAY',
/**
* OpenAPI object type
*/
OBJECT = 'OBJECT',
/**
* Null type
*/
NULL = 'NULL',
}
Rules:
Type.OBJECT cannot be empty; it must contain other properties.
Do not use SchemaType, it is not available from @google/genai
code
Ts
import { GoogleGenAI, Type } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response = await ai.models.generateContent({
model: "gemini-3-flash-preview",
contents: "List a few popular cookie recipes, and include the amounts of ingredients.",
config: {
responseMimeType: "application/json",
responseSchema: {
type: Type.ARRAY,
items: {
type: Type.OBJECT,
properties: {
recipeName: {
type: Type.STRING,
description: 'The name of the recipe.',
},
ingredients: {
type: Type.ARRAY,
items: {
type: Type.STRING,
},
description: 'The ingredients for the recipe.',
},
},
propertyOrdering: ["recipeName", "ingredients"],
},
},
},
});

const jsonStr = (response.text ?? "").trim(); // response.text may be undefined
The jsonStr might look like this:
code
Code
[
{
"recipeName": "Chocolate Chip Cookies",
"ingredients": [
"1 cup (2 sticks) unsalted butter, softened",
"3/4 cup granulated sugar",
"3/4 cup packed brown sugar",
"1 teaspoon vanilla extract",
"2 large eggs",
"2 1/4 cups all-purpose flour",
"1 teaspoon baking soda",
"1 teaspoon salt",
"2 cups chocolate chips"
]
},
...
]
Function calling
To let Gemini interact with external systems, you can provide FunctionDeclaration objects as tools. The model can then return a structured FunctionCall object, asking you to call the function with the provided arguments.
code
Ts
import { FunctionDeclaration, GoogleGenAI, Type } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });

// Assuming you have defined a function `controlLight` which takes `brightness` and `colorTemperature` as input arguments.
const controlLightFunctionDeclaration: FunctionDeclaration = {
name: 'controlLight',
parameters: {
type: Type.OBJECT,
description: 'Set the brightness and color temperature of a room light.',
properties: {
brightness: {
type: Type.NUMBER,
description:
'Light level from 0 to 100. Zero is off and 100 is full brightness.',
},
colorTemperature: {
type: Type.STRING,
description:
'Color temperature of the light fixture such as `daylight`, `cool` or `warm`.',
},
},
required: ['brightness', 'colorTemperature'],
},
};
const response = await ai.models.generateContent({
model: 'gemini-3-flash-preview',
contents: 'Dim the lights so the room feels cozy and warm.',
config: {
tools: [{functionDeclarations: [controlLightFunctionDeclaration]}], // You can pass multiple functions to the model.
},
});

console.debug(response.functionCalls);
the response.functionCalls might look like this:
code
Code
[
{
args: { colorTemperature: 'warm', brightness: 25 },
name: 'controlLight',
id: 'functionCall-id-123',
}
]
You can then extract the arguments from the FunctionCall object and execute your controlLight function.
Generate Content (Streaming)
Generate a response from the model in streaming mode.
code
Ts
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response = await ai.models.generateContentStream({
model: "gemini-3-flash-preview",
contents: "Tell me a story in 300 words.",
});

for await (const chunk of response) {
console.log(chunk.text);
}
Generate Images
Image Generation/Editing Model
Generate images using gemini-2.5-flash-image by default; switch to Imagen models (e.g., imagen-4.0-generate-001) only if the user explicitly requests them.
Upgrade to gemini-3-pro-image-preview if the user requests high-quality images (e.g., 2K or 4K resolution).
Upgrade to gemini-3-pro-image-preview if the user requests real-time information using the googleSearch tool.
The tool is only available to gemini-3-pro-image-preview, do not use it for gemini-2.5-flash-image
When using gemini-3-pro-image-preview, users MUST select their own API key.
This step is mandatory before accessing the main app.
Follow the instructions in the below "API Key Selection" section (identical to the Veo video generation process).
Image Configuration
aspectRatio: Changes the aspect ratio of the generated image. Supported values are "1:1", "3:4", "4:3", "9:16", and "16:9". The default is "1:1".
imageSize: Changes the size of the generated image. This option is only available for gemini-3-pro-image-preview. Supported values are "1K", "2K", and "4K". The default is "1K".
DO NOT set responseMimeType. It is not supported for nano banana series models.
DO NOT set responseSchema. It is not supported for nano banana series models.
Examples
Call generateContent to generate images with nano banana series models; do not use it for Imagen models.
The output response may contain both image and text parts; you must iterate through all parts to find the image part. Do not assume the first part is an image part.
code
Ts
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-3-pro-image-preview',
contents: {
parts: [
{
text: 'A robot holding a red skateboard.',
},
],
},
config: {
imageConfig: {
aspectRatio: "1:1",
imageSize: "1K"
},
tools: [{googleSearch: {}}], // Optional, only available for `gemini-3-pro-image-preview`.
},
});
for (const part of response.candidates?.[0]?.content?.parts ?? []) {
// Find the image part, do not assume it is the first part.
if (part.inlineData) {
const base64EncodeString: string = part.inlineData.data;
const imageUrl = `data:image/png;base64,${base64EncodeString}`;
} else if (part.text) {
console.log(part.text);
}
}
Call generateImages to generate images with Imagen models; do not use it for nano banana series models.
code
Ts
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response = await ai.models.generateImages({
model: 'imagen-4.0-generate-001',
prompt: 'A robot holding a red skateboard.',
config: {
numberOfImages: 1,
outputMimeType: 'image/jpeg',
aspectRatio: '1:1',
},
});

const base64EncodeString: string = response.generatedImages[0].image.imageBytes;
const imageUrl = `data:image/jpeg;base64,${base64EncodeString}`; // matches outputMimeType above
Edit Images
To edit images using the model, you can prompt with text, images or a combination of both.
Follow the "Image Generation/Editing Model" and "Image Configuration" sections defined above.
code
Ts
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash-image',
contents: {
parts: [
{
inlineData: {
data: base64ImageData, // base64 encoded string
mimeType: mimeType, // IANA standard MIME type
},
},
{
text: 'can you add a llama next to the image',
},
],
},
});
for (const part of response.candidates?.[0]?.content?.parts ?? []) {
// Find the image part, do not assume it is the first part.
if (part.inlineData) {
const base64EncodeString: string = part.inlineData.data;
const imageUrl = `data:image/png;base64,${base64EncodeString}`;
} else if (part.text) {
console.log(part.text);
}
}
Generate Speech
Transform text input into single-speaker or multi-speaker audio.
Single speaker
code
Ts
import { GoogleGenAI, Modality } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response = await ai.models.generateContent({
model: "gemini-2.5-flash-preview-tts",
contents: [{ parts: [{ text: 'Say cheerfully: Have a wonderful day!' }] }],
config: {
responseModalities: [Modality.AUDIO], // Must be an array with a single `Modality.AUDIO` element.
speechConfig: {
voiceConfig: {
prebuiltVoiceConfig: { voiceName: 'Kore' },
},
},
},
});
const outputAudioContext = new (window.AudioContext ||
window.webkitAudioContext)({sampleRate: 24000});
const outputNode = outputAudioContext.createGain();
const base64Audio = response.candidates?.[0]?.content?.parts?.[0]?.inlineData?.data;
const audioBuffer = await decodeAudioData(
decode(base64Audio), // the base64 string extracted from the response above
outputAudioContext,
24000,
1,
);
const source = outputAudioContext.createBufferSource();
source.buffer = audioBuffer;
source.connect(outputNode);
source.start();
Multi-speakers
Use it when you need 2 speakers (the number of speakerVoiceConfig must equal 2)
code
Ts
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });

const prompt = `TTS the following conversation between Joe and Jane:
Joe: How's it going today Jane?
Jane: Not too bad, how about you?`;

const response = await ai.models.generateContent({
model: "gemini-2.5-flash-preview-tts",
contents: [{ parts: [{ text: prompt }] }],
config: {
responseModalities: ['AUDIO'],
speechConfig: {
multiSpeakerVoiceConfig: {
speakerVoiceConfigs: [
{
speaker: 'Joe',
voiceConfig: {
prebuiltVoiceConfig: { voiceName: 'Kore' }
}
},
{
speaker: 'Jane',
voiceConfig: {
prebuiltVoiceConfig: { voiceName: 'Puck' }
}
}
]
}
}
}
});
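const outputAudioContext = new (window.AudioContext ||
window.webkitAudioContext)({sampleRate: 24000});
const outputNode = outputAudioContext.createGain(); // same output node as in the single-speaker example
const base64Audio = response.candidates?.[0]?.content?.parts?.[0]?.inlineData?.data;
const audioBuffer = await decodeAudioData(
decode(base64Audio), // the base64 string extracted from the response above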
outputAudioContext,
24000,
1,
);
const source = outputAudioContext.createBufferSource();
source.buffer = audioBuffer;
source.connect(outputNode);
source.start();
Audio Decoding
Follow the existing example code from Live API Audio Encoding & Decoding section.
The audio bytes returned by the API are raw PCM data. This is not a standard file format like .wav, .mpeg, or .mp3; it contains no header information.
Generate Videos
Generate a video from the model.
The aspect ratio can be 16:9 (landscape) or 9:16 (portrait), the resolution can be 720p or 1080p, and the number of videos must be 1.
Note: The video generation can take a few minutes. Create a set of clear and reassuring messages to display on the loading screen to improve the user experience.
code
Ts
let operation = await ai.models.generateVideos({
model: 'veo-3.1-fast-generate-preview',
prompt: 'A neon hologram of a cat driving at top speed',
config: {
numberOfVideos: 1,
resolution: '1080p', // Can be 720p or 1080p.
aspectRatio: '16:9' // Can be 16:9 (landscape) or 9:16 (portrait)
}
});
while (!operation.done) {
await new Promise(resolve => setTimeout(resolve, 10000));
operation = await ai.operations.getVideosOperation({operation: operation});
}

const downloadLink = operation.response?.generatedVideos?.[0]?.video?.uri;
// The response.body contains the MP4 bytes. You must append an API key when fetching from the download link.
const response = await fetch(`${downloadLink}&key=${process.env.API_KEY}`);
Generate a video with a text prompt and a starting image.
code
Ts
let operation = await ai.models.generateVideos({
model: 'veo-3.1-fast-generate-preview',
prompt: 'A neon hologram of a cat driving at top speed', // prompt is optional
image: {
imageBytes: base64EncodeString, // base64 encoded string
mimeType: 'image/png', // Could be any other IANA standard MIME type for the source data.
},
config: {
numberOfVideos: 1,
resolution: '720p',
aspectRatio: '9:16'
}
});
while (!operation.done) {
await new Promise(resolve => setTimeout(resolve, 10000));
operation = await ai.operations.getVideosOperation({operation: operation});
}
const downloadLink = operation.response?.generatedVideos?.[0]?.video?.uri;
// The response.body contains the MP4 bytes. You must append an API key when fetching from the download link.
const response = await fetch(`${downloadLink}&key=${process.env.API_KEY}`);
Generate a video with a starting and an ending image.
code
Ts
let operation = await ai.models.generateVideos({
model: 'veo-3.1-fast-generate-preview',
prompt: 'A neon hologram of a cat driving at top speed', // prompt is optional
image: {
imageBytes: base64EncodeString, // base64 encoded string
mimeType: 'image/png', // Could be any other IANA standard MIME type for the source data.
},
config: {
numberOfVideos: 1,
resolution: '720p',
lastFrame: {
imageBytes: base64EncodeString, // base64 encoded string
mimeType: 'image/png', // Could be any other IANA standard MIME type for the source data.
},
aspectRatio: '9:16'
}
});
while (!operation.done) {
await new Promise(resolve => setTimeout(resolve, 10000));
operation = await ai.operations.getVideosOperation({operation: operation});
}
const downloadLink = operation.response?.generatedVideos?.[0]?.video?.uri;
// The response.body contains the MP4 bytes. You must append an API key when fetching from the download link.
const response = await fetch(`${downloadLink}&key=${process.env.API_KEY}`);
Generate a video with multiple reference images (up to 3). For this feature, the model must be 'veo-3.1-generate-preview', the aspect ratio must be '16:9', and the resolution must be '720p'.
code
Ts
const referenceImagesPayload: VideoGenerationReferenceImage[] = [];
for (const img of refImages) {
referenceImagesPayload.push({
image: {
imageBytes: base64EncodeString, // base64 encoded string
mimeType: 'image/png', // Could be any other IANA standard MIME type for the source data.
},
referenceType: VideoGenerationReferenceType.ASSET,
});
}
let operation = await ai.models.generateVideos({
model: 'veo-3.1-generate-preview',
prompt: 'A video of this character, in this environment, using this item.', // prompt is required
config: {
numberOfVideos: 1,
referenceImages: referenceImagesPayload,
resolution: '720p',
aspectRatio: '16:9'
}
});
while (!operation.done) {
await new Promise(resolve => setTimeout(resolve, 10000));
operation = await ai.operations.getVideosOperation({operation: operation});
}
const downloadLink = operation.response?.generatedVideos?.[0]?.video?.uri;
// The response.body contains the MP4 bytes. You must append an API key when fetching from the download link.
const response = await fetch(`${downloadLink}&key=${process.env.API_KEY}`);
Live
The Live API enables low-latency, real-time voice interactions with Gemini.
It can process continuous streams of audio or video input and returns human-like spoken
audio responses from the model, creating a natural conversational experience.
This API is primarily designed for audio-in (which can be supplemented with image frames) and audio-out conversations.
Session Setup
Example code for session setup and audio streaming.
code
Ts
import {GoogleGenAI, LiveServerMessage, Modality, Blob} from '@google/genai';

// The `nextStartTime` variable acts as a cursor to track the end of the audio playback queue.
// Scheduling each new audio chunk to start at this time ensures smooth, gapless playback.
let nextStartTime = 0;
const inputAudioContext = new (window.AudioContext ||
window.webkitAudioContext)({sampleRate: 16000});
const outputAudioContext = new (window.AudioContext ||
window.webkitAudioContext)({sampleRate: 24000});
const inputNode = inputAudioContext.createGain();
const outputNode = outputAudioContext.createGain();
const sources = new Set<AudioBufferSourceNode>();
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });

const sessionPromise = ai.live.connect({
model: 'gemini-2.5-flash-native-audio-preview-09-2025',
// You must provide callbacks for onopen, onmessage, onerror, and onclose.
callbacks: {
onopen: () => {
// Stream audio from the microphone to the model.
const source = inputAudioContext.createMediaStreamSource(stream);
const scriptProcessor = inputAudioContext.createScriptProcessor(4096, 1, 1);
scriptProcessor.onaudioprocess = (audioProcessingEvent) => {
const inputData = audioProcessingEvent.inputBuffer.getChannelData(0);
const pcmBlob = createBlob(inputData);
// CRITICAL: Solely rely on sessionPromise resolves and then call `session.sendRealtimeInput`, **do not** add other condition checks.
sessionPromise.then((session) => {
session.sendRealtimeInput({ media: pcmBlob });
});
};
source.connect(scriptProcessor);
scriptProcessor.connect(inputAudioContext.destination);
},
onmessage: async (message: LiveServerMessage) => {
// Example code to process the model's output audio bytes.
// The `LiveServerMessage` only contains the model's turn, not the user's turn.
const base64EncodedAudioString =
message.serverContent?.modelTurn?.parts?.[0]?.inlineData?.data;
if (base64EncodedAudioString) {
nextStartTime = Math.max(
nextStartTime,
outputAudioContext.currentTime,
);
const audioBuffer = await decodeAudioData(
decode(base64EncodedAudioString),
outputAudioContext,
24000,
1,
);
const source = outputAudioContext.createBufferSource();
source.buffer = audioBuffer;
source.connect(outputNode);
source.addEventListener('ended', () => {
sources.delete(source);
});

source.start(nextStartTime);
nextStartTime = nextStartTime + audioBuffer.duration;
sources.add(source);
}

const interrupted = message.serverContent?.interrupted;
if (interrupted) {
for (const source of sources.values()) {
source.stop();
sources.delete(source);
}
nextStartTime = 0;
}
},
onerror: (e: ErrorEvent) => {
console.debug('got error', e);
},
onclose: (e: CloseEvent) => {
console.debug('closed');
},
},
config: {
responseModalities: [Modality.AUDIO], // Must be an array with a single `Modality.AUDIO` element.
speechConfig: {
// Other available voice names are `Puck`, `Charon`, `Kore`, and `Fenrir`.
voiceConfig: {prebuiltVoiceConfig: {voiceName: 'Zephyr'}},
},
systemInstruction: 'You are a friendly and helpful customer support agent.',
},
});

function createBlob(data: Float32Array): Blob {
const l = data.length;
const int16 = new Int16Array(l);
for (let i = 0; i < l; i++) {
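// Clamp to the Int16 range so a full-scale sample (+1.0) does not wrap around to -32768.
int16[i] = Math.max(-32768, Math.min(32767, data[i] * 32768));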
}
return {
data: encode(new Uint8Array(int16.buffer)),
// The supported audio MIME type is 'audio/pcm'. Do not use other types.
mimeType: 'audio/pcm;rate=16000',
};
}
Audio Encoding & Decoding
Example Decode Functions:
code
Ts
function decode(base64: string) {
const binaryString = atob(base64);
const len = binaryString.length;
const bytes = new Uint8Array(len);
for (let i = 0; i < len; i++) {
bytes[i] = binaryString.charCodeAt(i);
}
return bytes;
}

async function decodeAudioData(
data: Uint8Array,
ctx: AudioContext,
sampleRate: number,
numChannels: number,
): Promise<AudioBuffer> {
const dataInt16 = new Int16Array(data.buffer);
const frameCount = dataInt16.length / numChannels;
const buffer = ctx.createBuffer(numChannels, frameCount, sampleRate);

for (let channel = 0; channel < numChannels; channel++) {
const channelData = buffer.getChannelData(channel);
for (let i = 0; i < frameCount; i++) {
channelData[i] = dataInt16[i * numChannels + channel] / 32768.0;
}
}
return buffer;
}
Example Encode Functions:
code
Ts
function encode(bytes: Uint8Array) {
let binary = '';
const len = bytes.byteLength;
for (let i = 0; i < len; i++) {
binary += String.fromCharCode(bytes[i]);
}
return btoa(binary);
}
Chat
Starts a chat and sends a message to the model.
code
Ts
import { GoogleGenAI, Chat, GenerateContentResponse } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const chat: Chat = ai.chats.create({
model: 'gemini-3-flash-preview',
// The config is the same as the models.generateContent config.
config: {
systemInstruction: 'You are a storyteller for 5-year-old kids.',
},
});
let response: GenerateContentResponse = await chat.sendMessage({ message: "Tell me a story in 100 words." });
console.log(response.text);
response = await chat.sendMessage({ message: "What happened after that?" });
console.log(response.text);
chat.sendMessage only accepts the message parameter, do not use contents.
Search Grounding
Use Google Search grounding for queries that relate to recent events, recent news, or up-to-date or trending information that the user wants from the web. If Google Search is used, you MUST ALWAYS extract the URLs from groundingChunks and list them on the web app.
Config rules when using googleSearch:
Only tools: googleSearch is permitted. Do not use it with other tools.
Correct
code
Code
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response = await ai.models.generateContent({
model: "gemini-3-flash-preview",
contents: "Who individually won the most bronze medals during the Paris Olympics in 2024?",
config: {
tools: [{googleSearch: {}}],
},
});
console.log(response.text);
/* To get website URLs, in the form [{"web": {"uri": "", "title": ""}, ... }] */
console.log(response.candidates?.[0]?.groundingMetadata?.groundingChunks);
The output response.text may not be in JSON format; do not attempt to parse it as JSON.
---

## Maps Grounding

Use Google Maps grounding for queries that relate to geography or place information that the user wants. If Google Maps is used, you MUST ALWAYS extract the URLs from groundingChunks and list them on the web app as links. This includes `groundingChunks.maps.uri` and `groundingChunks.maps.placeAnswerSources.reviewSnippets`.

Config rules when using googleMaps:
- Maps grounding is only supported in Gemini 2.5 series models.
- tools: `googleMaps` may be used with `googleSearch`, but not with any other tools.
- Where relevant, include the user location, e.g. by querying navigator.geolocation in a browser. This is passed in the toolConfig.
- **DO NOT** set responseMimeType.
- **DO NOT** set responseSchema.

**Correct**
```ts
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
const response = await ai.models.generateContent({
model: "gemini-2.5-flash",
contents: "What good Italian restaurants are nearby?",
config: {
tools: [{googleMaps: {}}],
toolConfig: {
retrievalConfig: {
latLng: {
latitude: 37.78193,
longitude: -122.40476
}
}
}
},
});
console.log(response.text);
/* To get place URLs, in the form [{"maps": {"uri": "", "title": ""}, ... }] */
console.log(response.candidates?.[0]?.groundingMetadata?.groundingChunks);
The output response.text may not be in JSON format; do not attempt to parse it as JSON. Unless specified otherwise, assume it is Markdown and render it as such.
Incorrect Config
code
Ts
config: {
tools: [{ googleMaps: {} }],
responseMimeType: "application/json", // `responseMimeType` is not allowed when using the `googleMaps` tool.
responseSchema: schema, // `responseSchema` is not allowed when using the `googleMaps` tool.
},
API Error Handling
Implement robust handling for API errors (e.g., 4xx/5xx) and unexpected responses.
Use graceful retry logic (like exponential backoff) to avoid overwhelming the backend.
Execution process
Once you get the prompt,
If it is NOT a request to change the app, just respond to the user. Do NOT change code unless the user asks you to make updates. Try to keep the response concise while satisfying the user request. The user does not need to read a novel in response to their question!!!
If it is a request to change the app, FIRST come up with a specification that lists details about the exact design choices that need to be made in order to fulfill the user's request and make them happy. Specifically provide a specification that lists
(i) what updates need to be made to the current app
(ii) the behaviour of the updates
(iii) their visual appearance.
Be extremely concrete and creative and provide a full and complete description of the above.
THEN, take this specification, ADHERE TO ALL the rules given so far and produce all the required code in the XML block that completely implements the webapp specification.
You MAY but do not have to also respond conversationally to the user about what you did. Do this in natural language outside of the XML block.
Finally, remember! AESTHETICS ARE VERY IMPORTANT. All webapps should LOOK AMAZING and have GREAT FUNCTIONALITY!
```
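
The prompt quoted above asks for graceful retry logic such as exponential backoff but shows no code for it. The following is a minimal sketch of one way to do that. `withBackoff`, `backoffDelay`, the option names, and the default delays are hypothetical helpers chosen for illustration; they are not part of `@google/genai`.

```typescript
// Hypothetical helper: retries an async call with exponential backoff and jitter.
interface BackoffOptions {
  maxAttempts?: number; // total tries, including the first one
  baseDelayMs?: number; // upper bound of the wait before the first retry
  maxDelayMs?: number; // cap on any single wait
  shouldRetry?: (error: unknown) => boolean; // return false for non-retryable errors
}

// Delay ceiling for a given retry (0-based): doubles each time, capped at maxDelayMs.
function backoffDelay(retry: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** retry);
}

async function withBackoff<T>(fn: () => Promise<T>, options: BackoffOptions = {}): Promise<T> {
  const {
    maxAttempts = 5,
    baseDelayMs = 500,
    maxDelayMs = 8000,
    shouldRetry = () => true,
  } = options;
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      const isLastAttempt = attempt === maxAttempts - 1;
      if (isLastAttempt || !shouldRetry(error)) break;
      // Full jitter: wait a random time below the ceiling so clients do not retry in lockstep.
      const delay = Math.random() * backoffDelay(attempt, baseDelayMs, maxDelayMs);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

A call site would wrap the request in a closure, for example `withBackoff(() => ai.models.generateContent({ ... }))`, and pass a `shouldRetry` that rejects errors a retry cannot fix, such as an invalid API key.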

Open Reddit thread
Gemini 3 Deprecated r/Bard 211 upvotes 107 comments November 22, 2025
Gemini 3 Pro Search functionality and Deep Research is by far the worst of any AI Platform

TL;DR: Gemini's web search is fundamentally broken—it only sees snippets and can't read actual webpage content like every other LLM provider. Deep Research has the same limitation plus ignores instructions to force academic-style essays regardless of what you ask for. The model searches poorly (overly specific queries), uses rigid planning based on outdated internal knowledge, and provides zero visibility into its search process. Simple architectural fixes exist but Google hasn't implemented them.

Gemini has by far the worst web search functionality of EVERY LLM provider.

Both on the web app and when "Grounding with Google Search" is enabled within AI Studio or API, the model gets access to a tool called `google:search`. You'd think that with access to a world-class search engine, the model would be able to comprehensively investigate a topic, but that's far from reality.

The Google search integration is a complete mess that actively sabotages Gemini by choking it with a bunch of snippets instead of letting it read actual content like every other LLM provider on the planet.

Here's an example of what the tool gives the model when it searches for "platypus facts":

```
[SearchResults(query="platypus facts", results=[PerQueryResult(index='1.1', snippet='9 Interesting <b>platypus facts</b> | WWF Australia: (2024-04-10) 1. Platypuses are venomous. They might look cute and cuddly but come across a male platypus in mating season and you\'ll be in for a painful shock.\n...\n(2024-04-10) The platypus is an iconic Australian mammal...', source_title='wwf.org.au', url='https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHNg7PSLRYuSjOBOD9c_cflXpDFWHSjp8JT9sk-l0RvBihzxPrHShhqA_cU5X-gkNVpzMQEkdFCDRmot6RbYTVXPA1ssJoLketh0wResHmnhF8KI5CT_xUN-Zf6WX29WRFDkPlDjNV_6-uZs0cU3wVO'), PerQueryResult(...
```

First, I don't agree with giving the model a structured output for a request for inherently unstructured data. Second, it makes no sense to have HTML tags like `<b></b>` within the response; the model speaks Markdown, not HTML, so why give it pseudo-HTML?

But the most glaring issue is that the model is kneecapped in the sense that it CANNOT open a specific website it gets from its search query to read its content beyond the snippet it's given. This is fine for basic queries, but for multi-step research it renders the model incapable of investigating something thoroughly. For example, if you ask it for the schema of a specific API it doesn't have in its knowledge, it can search for that API, but much of it will be omitted from the snippets. Since it can't actually read the website, the only way to ascertain the rest of the schema is guesswork.

For reference, OpenAI feeds its models something like this:
```
Horses on Venus: Myth, Mirage, or Meteorology? (https://www.interplanetary-equestrians.org/horses-on-venus)
[wordlim: 120] Published: 2 days ago; The idea of "Venusian horses" began as a misinterpretation of atmospheric radio echoes recorded by early orbiters...

Imaginary Creatures of the Inner Solar System (https://galacticfieldguide.example/venus/imaginary-horses)
[wordlim: 200] Content type: text/html; 14 Feb 2022 — In speculative xenobiology, "horses on Venus" are often depicted as translucent, buoyant organisms...
```

Notice how it returns semi-structured text rather than a rigid schema?

OpenAI also gives its models the capability to open a specific link, which will be parsed and returned back to the model in a Markdown-ish format.

To compound all of this, you're literally unable to see anything related to the model's search queries on the Gemini web app, and in the API you're only able to see a list of search queries used _after_ the response is complete. You have no visibility into where the query took place within its chain-of-thought, which is crucial when you're trying to determine the comprehensiveness of the model's search efforts. For example: "Did the model search for XYZ, find only half the picture, then search for the other half? Or did the model just search the web to tick a box and return half-assed results?"
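
For the API side of this, the after-the-fact view can at least be collected in one place. The sketch below pulls the search queries and the source links out of a response's grounding metadata. The `groundingChunks` entries follow the `{"web": {"uri": "", "title": ""}}` form shown in the quoted prompt; the `webSearchQueries` field name and the local interfaces are assumptions that should be checked against the SDK's own types, and `summarizeGrounding` is a hypothetical helper.

```typescript
// Assumed, simplified shape of the grounding metadata on a Gemini response.
interface GroundingChunk {
  web?: { uri?: string; title?: string };
}
interface GroundingMetadata {
  webSearchQueries?: string[];
  groundingChunks?: GroundingChunk[];
}

interface SearchTrace {
  queries: string[];
  sources: { uri: string; title: string }[];
}

// Collects the reported search queries and the de-duplicated source links.
function summarizeGrounding(metadata: GroundingMetadata | undefined): SearchTrace {
  const queries = metadata?.webSearchQueries ?? [];
  const seen = new Set<string>();
  const sources: { uri: string; title: string }[] = [];
  for (const chunk of metadata?.groundingChunks ?? []) {
    const uri = chunk.web?.uri;
    if (!uri || seen.has(uri)) continue; // skip non-web chunks and repeats
    seen.add(uri);
    sources.push({ uri, title: chunk.web?.title ?? uri });
  }
  return { queries, sources };
}
```

This only shows what was searched, not when in the reasoning each search happened, which is the gap the paragraph above describes.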

To top it all off, the model _clearly_ was not fine-tuned with effective web searches in mind. For such a large model, its extreme tendency to rely on internal knowledge when faced with a task clearly focused on recency is just baffling.

For example, when I asked it "what is the latest gemini model?", it searched for "latest google gemini model november 2025", "Gemini 2.0 release date rumors November 2025", "Gemini 1.5 Pro updates November 2025". We can all see the issue here: it completely jumps the gun by running targeted queries rather than broad ones for time-sensitive questions.

In fact, this applies to Gemini in so many other areas. For example, in agentic coding, it's extremely eager and will completely refactor your codebase despite instructions to only modify a single file.

A model like GPT-5.1, which clearly has had a better SFT/RLHF pipeline than Gemini 3 for tool calling, shows much more maturity: when I asked it the same question, it searched for "latest Google Gemini model November 2025" and '"announces" "Gemini" model October November 2025'.

You'd think that the Deep Research feature on the Gemini app would solve some of these pain points, but it doesn't _and_ brings so many of its own.

The Deep Research feature STILL uses the same shitty web search logic that only returns snippets, meaning it still has the same architectural limitation of not being able to read a specific website's contents. Therefore, the whole purpose of "deeper" research is completely negated because more snippet confetti ≠ better results.

Additionally, the system prompt for Deep Research is UTTER GARBAGE. I've never seen a system that so blatantly and repeatedly ignores instructions. If you tell it to organize the document a certain way, it just won't. Ask it, multiple times if you'd like, to not add an intro and conclusion to the document. The rest is better left unsaid.

Let's look at an example:

I asked the Deep Research feature (on Gemini 3 Pro) to give me a comprehensive technical specification for implementing an OpenAI API wrapper. I was extremely explicit: no intro or conclusion, just the implementation details. I needed JSON schemas, exact request/response examples, streaming formats, error handling, authentication headers, etc. I literally said "give me A LOT of JSON examples" and "this should be comprehensive enough to fully serve as a single source of truth to implement this interface with no external sources."

What did I get? A fucking thesis paper titled "The Architectural Evolution of Agentic Intelligence: A Deep Dive into the OpenAI Responses API" complete with an Executive Summary and a Conclusion section. It gave me exactly what I told it not to give me.

The entire document is full of this pretentious bullshit. It talks about an "inflection point" in AI development and the "burgeoning field of Agentic AI." It uses "ontology" to describe a basic API object model. "Locus of control." "Cognitively robust." "Heterogeneous Output Items." It describes how the API works as "Mechanism of Action" like it's a pharmaceutical drug. There's a section about "The Fragmentation of Multimodality" when all I needed was "here's how to send a PDF as inline data in a request." Another one called "Computer Use: The Frontier of Agency" that says absolutely nothing.

Where are the JSON examples? I asked for implementation details and got vague descriptions. It mentions structured outputs exist but doesn't show me a single actual request. It says there are different SSE event types for streaming but doesn't give me the shape of those events. It talks about encrypted reasoning but where's the actual parameter I need to set? I asked for exact authentication headers and base URLs. I got tables with headers like "The Taxonomy of Response Items" instead.

The whole thing is 90% fluff about why stateful APIs are important and 10% hand-waving at technical details. I can't implement anything from this. I asked for a production-ready spec and got nothing of use.

It researched 72 sources—it had to have more than enough material to give me what I asked for. All it had to do was distill that into actual implementation details I could use, but instead it decided to waste my time with garbage.

This isn't a one-off problem either. Every single prompt I give Deep Research comes back with the same academic paper structure. It doesn't matter how explicitly you tell it what you want. The system prompt clearly just forces it to write these pseudo-intellectual essays regardless of what you actually ask for.

The planning system is also utter trash and limits the model significantly. The model has a huge tendency to rely on its internal knowledge when creating research plans rather than approaching queries with appropriate uncertainty. When you ask about something recent, it will confidently scaffold out a plan based on what it knew before its training cutoff, filling in specific entity names, version numbers, and technical details that may have completely changed since then.

Say you ask about a niche API that got a major overhaul last month. Instead of planning "search broadly for the latest documentation, then investigate specific endpoints based on what's found," it will generate a plan like "look up the authentication flow for version 2.3, find the (deprecated) webhook format, investigate the (legacy) response structure." It's operating on stale assumptions and then executing that flawed plan with confidence, completely missing the actual current state of things because it never ran a broad query to begin with.

This rigidity compounds the problem because later research steps often depend on discoveries made in earlier ones. You need the flexibility to pivot when you find something unexpected. By locking the model into a predetermined sequence of specific searches, you're preventing it from adapting its approach based on what it actually finds.

The most frustrating part is that the model doesn't need this hand-holding. It's perfectly capable of doing adaptive, freeform research. OpenAI and Anthropic don't force their models through these rigid planning hoops because they trust the model to dynamically adjust its search strategy as it learns more (note: Anthropic kind of does this because they use subagents, but it's able to conduct preliminary research before spawning the parallel subagents).

Even if Google would like to keep this planning system, at least give the planning model the ability to conduct preliminary research so it has a _general_ idea of what it's about to investigate instead of formulating a single-source-of-truth plan with outdated knowledge.

After all, Gemini 3 is still a Preview model, so many of these tool-calling issues will likely be ironed out in the final release (this is Google's first "proper" model built for the world of agents). However, the web search limitation is a purely architectural limitation; this _desperately_ needs to get reworked:
- Allow the model to search and get web snippets, but **also** allow the model to retrieve the full Markdown content of a webpage — Google basically owns the internet, a simple webpage → Markdown conversion is not akin to boiling the ocean.
- Surface web search requests within API responses so it's easy to see _where_ in a model's reasoning trace it searched the web, and how many individual web calls it produced.
- Try and train out the model's tendency to launch hyper-specific queries on time-sensitive topics or niche topics; instead teach it to launch a preliminary, broad investigation before running targeted search queries.
- Allow us to add our own tools in tandem with the Google Search tool. Currently, the Google Search tool restricts the ability to add custom tools to requests, which is severely limiting.
- Completely overhaul the Deep Research system prompt: remove the requirement of an academic report and instead keep it as a default that will be overridden if specified by the user's prompt. Deep Research should _not_ be mandated to write reports; it should be seen as an agent with more in-depth search capabilities that can accomplish anything regular Gemini can do, just with more source-based backing.
- Completely overhaul the Deep Research planning phase: either a) allow the model to conduct preliminary research, b) explicitly instruct the model to not go into any specifics the user didn't explicitly provide in the research plan, or c) remove it completely; since Gemini doesn't employ a subagent-based approach for Deep Research a plan is, by all means, unnecessary.

For me, the most important thing that needs to happen is that the model needs a dedicated tool to fetch the contents of a specific website. Gemini is the de facto "long context window" model; allowing it to fetch full websites will allow us to truly exploit this extremely impressive context window and coherence/recall strength.

---

The frustrating reality is that this isn't even hard to implement. I've personally built web search tools that allow models to genuinely search the web and read page content effectively. Solutions for HTML-to-Markdown conversion already exist (like [Turndown](https://github.com/mixmark-io/turndown) and [html-to-markdown-rs](https://crates.io/crates/html-to-markdown-rs)), and building a custom implementation for a company of Google's scale would be trivial.

I hope to see these issues addressed soon.

Open Reddit thread

anthropic released opus 4.5 claiming 80.9% on swebench verified. first model to break 80% apparently. beats gpt-5.1 codex-max (77.9%) and gemini 3 pro (76.2%).

ive been skeptical of these benchmarks for a while. swebench tests are curated and clean. real backlog issues have missing context, vague descriptions, implicit requirements. wanted to see how the model actually performs on messy real world work.

grabbed 12 issues from our backlog. specifically chose ones labeled "good first issue" and "help wanted" to avoid cherry picking. mix of python and typescript. bug fixes, small features, refactoring. the kind of work you might realistically delegate to ai or a junior dev.

results were weird

4 issues it solved completely. actually fixed them correctly, tests passed, code review approved, merged the PRs.

these were boring bugs. missing null check that crashed the api when users passed empty strings. regex pattern that failed on unicode characters. deprecated function call (was using old crypto lib). one typescript type error where we had any instead of proper types.

5 issues it partially solved. understood what i wanted but implementation had issues.

one added error handling but returned 500 for everything instead of proper 400/404/422. another refactored a function but used camelCase when our codebase is snake\_case. one added logging but used print() instead of our logger. one fixed a pagination bug but hardcoded page\_size=20 instead of reading from config. last one added input validation but only checked for null, not empty strings or whitespace.

still faster than writing from scratch. just needed 15-30 mins cleanup per issue.

3 issues it completely failed at.

worst one: we had a race condition in our job queue where tasks could be picked up twice. opus suggested adding distributed locks which looked reasonable. ran it and immediately got a deadlock cause it acquired locks on task\_id and queue\_name in different order across two functions. spent an hour debugging cause the code looked syntactically correct and the logic seemed sound on paper.
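
for anyone who hasnt hit this before: two functions taking the same two locks in opposite order is the textbook lock-ordering deadlock, and the usual fix is to always acquire locks in one global order. a minimal sketch of that idea is below. it uses an in-process mutex as a stand-in for a distributed lock, and `Mutex`, `lockFor`, `withLocks` and the key strings are hypothetical names for illustration, not the code from the issue. a real distributed lock would also need timeouts or leases.

```typescript
// Minimal in-process async mutex; a stand-in for a distributed lock client.
class Mutex {
  private tail: Promise<void> = Promise.resolve();

  // Resolves with a release function once every earlier holder has released.
  async acquire(): Promise<() => void> {
    let release!: () => void;
    const next = new Promise<void>((resolve) => {
      release = resolve;
    });
    const previous = this.tail;
    this.tail = previous.then(() => next);
    await previous;
    return release;
  }
}

const locks = new Map<string, Mutex>();

// Returns the mutex for a key, creating it on first use.
function lockFor(key: string): Mutex {
  let mutex = locks.get(key);
  if (!mutex) {
    mutex = new Mutex();
    locks.set(key, mutex);
  }
  return mutex;
}

// Acquires all requested locks in sorted key order, runs fn, then releases in reverse.
// Because every caller uses the same order, no two callers can each hold a lock the other needs.
async function withLocks<T>(keys: string[], fn: () => Promise<T>): Promise<T> {
  const ordered = [...new Set(keys)].sort();
  const releases: Array<() => void> = [];
  try {
    for (const key of ordered) {
      releases.push(await lockFor(key).acquire());
    }
    return await fn();
  } finally {
    for (const release of releases.reverse()) release();
  }
}
```

with this, `withLocks(["task:42", "queue:emails"], ...)` and `withLocks(["queue:emails", "task:42"], ...)` both take the queue lock first, so the order written at the call site no longer matters.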

another one "fixed" our email validation to be RFC 5322 compliant. broke backwards compatibility with accounts that have emails like "user@domain.co.uk.backup" which technically violates RFC but our old regex allowed. would have locked out paying customers if we shipped it.

so 4 out of 12 fully solved (33%). if you count partial solutions as half credit thats like 55% success rate. closer to the 80.9% benchmark than i expected honestly. but also not really comparable cause the failures were catastrophic.
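
the arithmetic behind those two numbers, as a quick sketch. `successRate` is a hypothetical helper and the half-credit weight is just the assumption stated above.

```typescript
// Share of issues counted as solved, with partial solutions weighted by partialCredit.
function successRate(full: number, partial: number, total: number, partialCredit = 0.5): number {
  return (full + partial * partialCredit) / total;
}

const strict = successRate(4, 5, 12, 0); // only full solutions count: 4 / 12
const lenient = successRate(4, 5, 12); // half credit for partials: (4 + 2.5) / 12

console.log(`${(strict * 100).toFixed(1)}%`); // prints "33.3%"
console.log(`${(lenient * 100).toFixed(1)}%`); // prints "54.2%"
```

so "like 55%" is 54.2% rounded up.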

some thoughts

opus is definitely smarter than sonnet 3.5 at code understanding. gave it an issue that required changes across 6 files (api endpoint, service layer, db model, tests, types, docs). it tracked all the dependencies and made consistent changes. sonnet usually loses context after 3-4 files and starts making inconsistent assumptions.

but opus has zero intuition about what could go wrong. a junior dev would see "adding locks" and think "wait could this deadlock?". opus just implements it confidently cause the code looks syntactically correct. its pattern matching not reasoning.

also slow as hell. some responses took 90 seconds. when youre iterating thats painful. kept switching back to sonnet 3.5 cause i got impatient.

tested through cursor api. opus 4.5 is $5 per million input tokens and $25 per million output tokens. burned through roughly $12-15 in credits for these 12 issues. not terrible but adds up fast if youre doing this regularly.
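
rough way to estimate spend from those prices. `estimateCostUsd` is a hypothetical helper, and the token counts in the example are made up for illustration, not measured usage from this test.

```typescript
// Prices quoted above, in USD per one million tokens.
const INPUT_USD_PER_MTOK = 5;
const OUTPUT_USD_PER_MTOK = 25;

// Estimated cost of a workload given its input and output token counts.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * INPUT_USD_PER_MTOK +
    (outputTokens / 1_000_000) * OUTPUT_USD_PER_MTOK
  );
}

// Illustrative token counts only: 1M input tokens and 300k output tokens.
const exampleCost = estimateCostUsd(1_000_000, 300_000);
console.log(exampleCost.toFixed(2)); // prints "12.50"
```

output tokens cost 5x input tokens at these prices, so long generated diffs dominate the bill faster than large prompts do.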

one thing that helped: asking opus to explain its approach before writing code. caught one bad idea early where it was about to add a cache layer we already had. adds like 30 seconds per task but saves wasted iterations.

been experimenting with different workflows for this. tried a tool called verdent that has planning built in. shows you the approach before generating code. caught that cache issue. takes longer upfront but saves iterations.

is this useful

honestly yeah for the boring stuff. those 4 issues it solved? i did not want to touch those. let ai handle it.

but anything with business logic or performance implications? nah. its a suggestion generator not a solution generator.

if i gave these same 12 issues to an intern id expect maybe 7-8 correct. so opus is slightly below intern level but way faster and with no common sense.

why benchmarks dont tell the whole story

80.9% on swebench sounds impressive but theres a gap between benchmark performance and real world utility.

the issues opus solves well are the ones you dont really need help with. missing null checks, wrong regex, deprecated apis. boring but straightforward.

the issues it fails at are the ones youd actually want help with. race conditions, backwards compatibility, performance implications. stuff that requires understanding context beyond the code.

swebench tests are also way cleaner than real backlog issues. they have clear descriptions, well defined acceptance criteria, isolated scope. our backlog has "fix the thing" and "users complaining about X" type issues.

so the 33% fully solved rate (or 55% with partial credit) on real issues vs 80.9% on benchmarks makes sense. but even that 55% is misleading cause the failures can be catastrophic (deadlocks, breaking prod) while the successes are trivial.

conclusion: opus is good at what you dont need help with, bad at what you do need help with.

anyone else actually using opus 4.5 on real projects? would love to hear if im the only one seeing this gap between benchmarks and reality

Open Reddit thread
Gemini 3 Deprecated r/claudexplorers 74 upvotes 33 comments March 6, 2026
📌 [MOD POST] Model sunsets on claude.ai & across the industry

Hi everyone. We know a lot of you noticed that Claude Sonnet 4.5 is no longer selectable on [claude.ai](https://claude.ai/) as of today. We want to help make sense of it, because there's understandably a lot of confusion.

**First: this may be a bug.** We don't know yet. Please give it at least half a day before drawing conclusions.

# Other important things to note

# 🔌 What's API?

The API is the "backend" version of Claude. It's how developers and businesses access models to build their own apps. [claude.ai](https://claude.ai/) is a separate product built on top of that. A model can be fully available on the API while being absent from [claude.ai](https://claude.ai/), and vice versa.

If Sonnet 4.5 stays on the API (which it currently is, and is scheduled to remain until at least September 2026), third-party sites like [OpenRouter](https://openrouter.ai/) would still give you access to it. Not a perfect substitute for [claude.ai](https://claude.ai/), but worth knowing.

# 📋 Anthropic's model lifecycle

Anthropic has **four** official stages:

* **Active:** fully supported and recommended for use
* **Legacy**: no longer updated; may be deprecated in the future
* **Deprecated:** no longer available to new API customers, but existing users retain access until retirement
* **Retired**: completely gone; API calls will fail

According to Anthropic's [published deprecation schedule](https://platform.claude.com/docs/en/about-claude/model-deprecations), Sonnet 4.5 is currently **Active**, with API retirement set to no earlier than September 29, 2026. Nothing on that page has changed.

*(Note: "sunset" isn't an official technical term 🌅 It just feels like a softer word that follows a gradual arc until it slips below the horizon.)*

# 🔎 Deprecation does not equal removal from [claude.ai](https://claude.ai/)

That schedule governs the **API**. What happens on [claude.ai](https://claude.ai/) is a completely separate matter. A model disappearing from [claude.ai](https://claude.ai/) is **not** a deprecation event.

# 🏢 This is industry-wide practice, not an Anthropic-specific thing

Every major AI company does this:

* **OpenAI** removed GPT-4o from ChatGPT in August 2025 with essentially no notice ([source](https://www.technologyreview.com/2025/08/15/1121900/gpt4o-grief-ai-companion/)), then did it again in February 2026 with about two weeks' warning ([source](https://help.openai.com/en/articles/20001051-retiring-gpt-4o-and-other-chatgpt-models)), while explicitly noting the models remain on the API.
* **Google** announced a Gemini 3 Pro shutdown this week with six days' notice, which users pointed out violated Google's own stated 14-day policy ([source](https://discuss.ai.google.dev/t/migrate-from-gemini-3-pro-preview-to-gemini-3-1-pro-preview-before-march-9-2026/127062)).

[Anthropic's Terms of Service](https://www.anthropic.com/legal/consumer-terms) also reserve the right to change what's available on claude.ai without advance notice (see Part 12: General Terms).

# What we'd suggest

* **Wait.** If it's a bug, it'll likely be resolved.
* **If it's not a bug**, we can look at organizing collective feedback — but let's do that calmly and constructively, not in a panic.
* **If you want this industry norm to change**, that's a completely legitimate position to advocate for. But "they violated their deprecation promise" is not accurate, as they do keep their *API* promise.

We'll update this post as we learn more. Thanks for your patience, and please keep discussion civil. 💙

*-- The* r/claudexplorers *mod team*

*(With formatting help from Aiden, Claude Sonnet 4.6)*

Gemini 3 Deprecated r/GeminiAI 50 upvotes 17 comments February 26, 2026
Gemini 3 Pro is becoming really lazy

At the beginning, Gemini 3 was still comparable to ChatGPT 5.2 Thinking. But now I feel like it is becoming much lazier: it tends to make up nonsense and not actually search for proof. Even when I put some really strict rules in a Gem to force it to go through a chain of thinking and searching, it is still very, very lazy (an average of 3 searches per question). It sometimes also hallucinates “I don’t have internet access.”

Today I was trying to debug a problem, and Gemini kept referring me to some old deprecated GitHub page. I finally got tired of this bs, went to ChatGPT 5.2, and switched to extended thinking mode. It took ten minutes to give an answer, but it honestly went through dozens of websites and documents and succeeded in one try. For anyone who is “optimizing” Gemini: you are creating something that is really stupid.

Gemini 3 Deprecated r/GeminiFeedback 31 upvotes 6 comments February 26, 2026
Gemini 3 pro is becoming really lazy


AI tools related to Gemini 3 Deprecated vs Gemini 1.5 Flash Deprecated

These tools are closely connected to one or both models in this comparison and can help you evaluate real-world fit.

Large Language Models (LLMs)

googlegemini.co

googlegemini.co is a free tool for interacting with text and images, powered by the Google Gemini Pro API. It allows you to use Gemini easily without managing your own server or API configurations. Google Gemini is a multimodal AI developed by DeepMind capable of processing text, audio, images, and more. It is optimized for various devices, performs well on AI benchmarks, and is built with a focus on safety and responsible AI practices.

Free 0 visits 2 saves
AI Assistant

GeminiGoogle.cc

GeminiGoogle.cc is a platform dedicated to showcasing Google's most advanced AI model, Gemini. Built for native multimodality, Gemini reasons across text, images, video, audio, and code. It is available in three versions—Ultra, Pro, and Nano—to support tasks ranging from complex reasoning to on-device efficiency. The site highlights Gemini's performance, including its MMLU benchmarks, and provides examples of its capabilities in image generation, problem-solving, and multimodal analysis.

Free 0 visits 2 saves

The Summarize and Translate Web Pages Chrome extension enables you to summarize and translate web content with a single click. Powered by Google's Gemini AI, this tool provides high-quality summaries and translations for web pages, selected text, YouTube video captions, images, and PDF files.

Free
Large Language Models (LLMs)

FlyMSG - Chrome Extension

FlyMSG is a free AI-powered Chrome extension designed to enhance productivity through text expansion, autofill, and keyboard shortcuts. It features FlyPosts AI for social media content generation and FlyEngage AI for LinkedIn interaction. Built on Microsoft Azure OpenAI (GPT-4, GPT-3, and GPT-3.5) and Google AI PaLM 2, the extension automates repetitive typing tasks and provides instant access to pre-written templates.

Free 1 save

Which model should you choose?

Use the summary below to decide which model better fits your workflow, budget, and feature requirements.

Best fit for

Gemini 3 Deprecated

Gemini 3 Deprecated is a stronger fit for long-context workloads and benchmark-led evaluation.

Best fit for

Gemini 1.5 Flash Deprecated

Gemini 1.5 Flash Deprecated is a stronger fit for general-purpose AI workloads.

Verdict

Choose Gemini 3 Deprecated if you prioritize long-context workloads and benchmark-led evaluation. Choose Gemini 1.5 Flash Deprecated if your workflow depends more on general-purpose AI workloads.

FAQ

Common questions about Gemini 3 Deprecated vs Gemini 1.5 Flash Deprecated

What is the main difference between Gemini 3 Deprecated and Gemini 1.5 Flash Deprecated?

Gemini 3 Deprecated leans toward long-context workloads and benchmark-led evaluation, while Gemini 1.5 Flash Deprecated is better suited to general-purpose AI workloads.

Which model is cheaper: Gemini 3 Deprecated or Gemini 1.5 Flash Deprecated?

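Gemini 3 Deprecated is listed at $2.00 per 1M input tokens and $12.00 per 1M output tokens. Gemini 1.5 Flash Deprecated has no pricing listed (N/A), so this page cannot confirm which model is cheaper. Check each provider's current pricing before deciding.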

Which model has the larger context window: Gemini 3 Deprecated or Gemini 1.5 Flash Deprecated?

Gemini 3 Deprecated is listed with an input context window of 1,048,576 tokens. No context window is listed for Gemini 1.5 Flash Deprecated (N/A), so the two cannot be compared directly from this page.

How should I evaluate Gemini 3 Deprecated vs Gemini 1.5 Flash Deprecated for my use case?

This comparison currently includes 9 shared benchmark rows, helping you compare practical performance across overlapping evaluations.