Google

Gemini 2.0 Flash

Gemini 2.0 Flash is a text generation model developed by Google, released as part of the Gemini 2.0 model family. It features a context window of 1,048,576 tokens and is designed to handle a broad range of everyday tasks with real-time response latency. The model's training data has a cutoff of June 2024. Gemini 2.0 Flash is positioned as an upgrade for users of the 1.5 Flash model who want meaningfully improved output quality, and for users of the 1.5 Pro model who want comparable or slightly improved quality at lower latency and cost. It is well-suited for applications that require processing long documents, maintaining extended conversations, or running high-throughput workloads where response speed matters.

Feb 05, 2025 1,048,576 context 8,192 tokens output

Large Context Window Real-Time Latency Text Generation Structured Output Function Calling Multimodal Input

Overview ↓ About ↓ Capabilities ↓ Pricing ↓ Price Comparison ↓ Benchmarks ↓ Compare ↓ Tools ↓ Daily ↓ Resources ↓ Community ↓ FAQ ↓

Model Overview

High-signal model metadata in a structured two-column overview table.

Provider

The entity that provides this model.

Google

Model ID

The routed model identifier exposed by upstream providers.

google/gemini-2.0-flash-001

Input Context Window

The number of tokens supported by the input context window.

1,048,576 tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

8,192 tokens tokens

Open Source

Whether the model's code is available for public use.

Release Date

When the model was first released.

Feb 05, 2025 1 year ago

Knowledge Cut-off Date

When the model's knowledge was last updated.

June 2024

API Providers

The providers that offer this model. This is not an exhaustive list.

Google, Vertex AI

Modalities

Types of data this model can process.

Text Image Audio Video

What is Gemini 2.0 Flash

A fuller summary of positioning, capabilities, and source-specific details for Gemini 2.0 Flash.

Gemini 2.0 Flash is a text generation model developed by Google, released as part of the Gemini 2.0 model family. It features a context window of 1,048,576 tokens and is designed to handle a broad range of everyday tasks with real-time response latency. The model's training data has a cutoff of June 2024.

Gemini 2.0 Flash is positioned as an upgrade for users of the 1.5 Flash model who want meaningfully improved output quality, and for users of the 1.5 Pro model who want comparable or slightly improved quality at lower latency and cost. It is well-suited for applications that require processing long documents, maintaining extended conversations, or running high-throughput workloads where response speed matters.

Capabilities

What Gemini 2.0 Flash supports

CTX

Large Context Window

Supports up to 1,048,576 tokens in a single context, enabling processing of long documents, codebases, or extended conversation histories in one request.

Real-Time Latency

Designed to return responses at real-time speeds, making it suitable for interactive applications and live user-facing workflows.

Text Generation

Generates coherent, contextually relevant text across tasks such as summarization, drafting, question answering, and instruction following.

JSON

Structured Output

Supports structured response formats, allowing developers to request JSON or other schema-conforming outputs for downstream processing.

Function Calling

Supports function calling, enabling the model to invoke developer-defined tools and integrate with external APIs or services within a workflow.

Multimodal Input

Accepts text, images, audio, and video as inputs, allowing mixed-media prompts to be processed within the same large context window.

Pricing for Gemini 2.0 Flash

Primary API pricing shown in the same “quick compare” spirit as the reference page.

Input tokens $0.15 Per million tokens

Output tokens $0.40 Per million tokens

Price Comparison

Additional usage-cost dimensions synced into the project for this model.

maxTemperature 2

maxResponseSize 8,192 tokens

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

Google Vertex AI

Model Performance

Benchmark scores synced from the current model source and normalized into the local catalog.

Benchmark	Score
AIME 2024 American math olympiad problems	33.0%
GPQA Diamond PhD-level science questions (biology, physics, chemistry)	62.3%
HLE Questions that challenge frontier models across many domains	5.3%
LiveCodeBench Real-world coding tasks from recent competitions	33.4%
MATH-500 Undergraduate and competition-level math problems	93.0%
MMLU-Pro Expert knowledge across 14 academic disciplines	77.9%
SciCode Scientific research coding and numerical methods	33.3%

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

Official Website Other

→

Documentation Documentation

→

Gemini API Reference Documentation

→

Google AI Studio Playground Playground

→

Gemini 2.0 Announcement Announcements

→

OpenRouter Model Page OpenRouter

→