Google

Gemini 2.0 Flash-Lite Vision

Gemini 2.0 Flash-Lite Vision is a multimodal model developed by Google, designed to process both visual and textual inputs. It belongs to the Gemini 2.0 Flash family and is positioned as the fastest and most cost-efficient option within that lineup. The model supports a context window of over one million tokens, making it suitable for tasks that require processing large amounts of information in a single request. It was trained on data up to June 2024. This model is intended as an upgrade path for users of Gemini 1.5 Flash who want improved output quality without changes to cost or latency. Its vision capabilities allow it to handle image understanding tasks alongside text-based workflows. The combination of speed, large context support, and multimodal input handling makes it well-suited for applications such as document analysis, image captioning, and high-throughput pipelines where cost efficiency is a priority.

Feb 25, 2025 1,048,576 context 8,192 tokens output
Vision Understanding Large Context Window Multimodal Input High-Speed Inference Text Generation Document Analysis

Model Overview

High-signal model metadata in a structured two-column overview table.

Provider

The entity that provides this model.

Google

Input Context Window

The number of tokens supported by the input context window.

1,048,576 tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

8,192 tokens tokens

Open Source

Whether the model's code is available for public use.

No

Release Date

When the model was first released.

Feb 25, 2025 1 year ago

Knowledge Cut-off Date

When the model's knowledge was last updated.

June 2024

API Providers

The providers that offer this model. This is not an exhaustive list.

Google, Vertex AI

Modalities

Types of data this model can process.

Text Image

What is Gemini 2.0 Flash-Lite Vision

A fuller summary of positioning, capabilities, and source-specific details for Gemini 2.0 Flash-Lite Vision.

Gemini 2.0 Flash-Lite Vision is a multimodal model developed by Google, designed to process both visual and textual inputs. It belongs to the Gemini 2.0 Flash family and is positioned as the fastest and most cost-efficient option within that lineup. The model supports a context window of over one million tokens, making it suitable for tasks that require processing large amounts of information in a single request. It was trained on data up to June 2024.

This model is intended as an upgrade path for users of Gemini 1.5 Flash who want improved output quality without changes to cost or latency. Its vision capabilities allow it to handle image understanding tasks alongside text-based workflows. The combination of speed, large context support, and multimodal input handling makes it well-suited for applications such as document analysis, image captioning, and high-throughput pipelines where cost efficiency is a priority.

Capabilities

What Gemini 2.0 Flash-Lite Vision supports

AI

Vision Understanding

Processes and interprets image inputs alongside text, enabling tasks like image captioning, visual question answering, and scene description.

CTX

Large Context Window

Supports up to 1,048,576 tokens in a single context, allowing long documents, multi-image inputs, or extended conversations to be processed together.

MM

Multimodal Input

Accepts combinations of text and image inputs in a single request, enabling workflows that mix visual and textual data.

AI

High-Speed Inference

Optimized for low-latency responses, making it suitable for real-time or high-throughput production applications.

AI

Text Generation

Generates coherent text responses based on visual and textual prompts, supporting summarization, Q&A, and content extraction tasks.

AI

Document Analysis

Can process long-form documents or multi-page inputs within its million-token context window, extracting structured information or answering questions about content.

Pricing for Gemini 2.0 Flash-Lite Vision

Primary API pricing shown in the same “quick compare” spirit as the reference page.

Price Comparison

Additional usage-cost dimensions synced into the project for this model.

maxTemperature 2
maxResponseSize 8,192 tokens

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

Google Vertex AI

Configuration & Parameters

The configurable options currently documented for this model.

Temperature

Number
Default: 1 Range: 0 - 2 (step 0.1)

Max Response Tokens

Number
Default: 4096 Range: 1 - 8192 (step 1)

Supported Request Parameters

Parameters currently listed by OpenRouter or the local catalog for this model.

Temperature Max Response Tokens

Model Performance

Benchmark scores synced from the current model source and normalized into the local catalog.

Benchmark Score
AIME 2024
American math olympiad problems
27.7%
GPQA Diamond
PhD-level science questions (biology, physics, chemistry)
53.5%
HLE
Questions that challenge frontier models across many domains
3.6%
LiveCodeBench
Real-world coding tasks from recent competitions
18.5%
MATH-500
Undergraduate and competition-level math problems
87.3%
MMLU-Pro
Expert knowledge across 14 academic disciplines
72.4%
SciCode
Scientific research coding and numerical methods
25.0%

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

Compare Gemini 2.0 Flash-Lite Vision with related models

Jump straight into the most relevant side-by-side comparison pages for this model.

Gemini 2.5 Pro vs Gemini 2.0 Flash-Lite Vision

Compare Gemini 2.5 Pro and Gemini 2.0 Flash-Lite Vision across pricing, context window, capabilities, benchmarks, and API access to choose the better fit for long-context workloads versus long-context workloads.

Gemini 2.0 Flash-Lite Vision vs Gemini 2.5 Flash Vision

Compare Gemini 2.0 Flash-Lite Vision and Gemini 2.5 Flash Vision across pricing, context window, capabilities, benchmarks, and API access to choose the better fit for long-context workloads versus long-context workloads.

Gemini 2.5 Flash Lite vs Gemini 2.0 Flash-Lite Vision

Compare Gemini 2.5 Flash Lite and Gemini 2.0 Flash-Lite Vision across pricing, context window, capabilities, benchmarks, and API access to choose the better fit for long-context workloads versus long-context workloads.

Gemini 2.0 Flash-Lite Vision vs Gemini 2.5 Flash Image

Compare Gemini 2.0 Flash-Lite Vision and Gemini 2.5 Flash Image across pricing, context window, capabilities, benchmarks, and API access to choose the better fit for long-context workloads versus long-context workloads.

Gemini 2.0 Flash-Lite Vision vs Gemini 2.5 Flash

Compare Gemini 2.0 Flash-Lite Vision and Gemini 2.5 Flash across pricing, context window, capabilities, benchmarks, and API access to choose the better fit for long-context workloads versus long-context workloads.

FAQ

Common questions about Gemini 2.0 Flash-Lite Vision

What is the context window size for Gemini 2.0 Flash-Lite Vision?

Gemini 2.0 Flash-Lite Vision supports a context window of 1,048,576 tokens, allowing very large inputs to be processed in a single request.

What is the knowledge cutoff date for this model?

The model's training data has a cutoff of June 2024, meaning it does not have knowledge of events or information published after that date.

What types of inputs does Gemini 2.0 Flash-Lite Vision accept?

The model accepts both image and text inputs, making it a multimodal model capable of handling visual understanding tasks alongside standard text-based prompts.

Who is this model intended for?

According to Google's description, it is designed as an upgrade path for Gemini 1.5 Flash users who want better output quality at the same price and speed.

Where can I access or deploy Gemini 2.0 Flash-Lite Vision?

The model is available through Google Cloud's Vertex AI platform. Documentation for deployment and usage can be found at the official Vertex AI documentation page.

More models from Google

Continue browsing adjacent models from the same provider.

← All AI Models