Google

Imagen 4 Ultra

Imagen 4 Ultra is Google's flagship image generation model and the top tier of the Imagen 4 family, trained through early 2025. It accepts text prompts of up to 10,000 tokens and is designed to handle complex, multi-element descriptions including specific art styles, multi-scene compositions, and nuanced visual storytelling. The model supports image URL arrays as input, allowing users to reference existing images alongside text prompts. It is licensed for commercial use, making it available to businesses and creative professionals working on production-grade projects. Imagen 4 Ultra is best suited for use cases where image fidelity and detail are priorities, such as professional design work, advertising, and high-resolution visual content creation. It covers a wide range of output styles, from photorealistic portraits and landscapes to stylized illustrations and pixel art. According to community benchmarking discussions, Imagen 4 Ultra has achieved competitive Elo ratings in image arenas, including a reported tie with GPT-Image-1 in the Image Arena as of mid-2025. The model is accessible via the Google Gemini API as well as third-party inference platforms such as fal.ai.

Text-to-Image Generation, Image URL Input, Style Selection, Commercial Licensing, High-Resolution Output, API Access

Model Overview

High-signal model metadata in a structured two-column overview table.

| Field | Value |
|---|---|
| Provider: the entity that provides this model. | Google |
| Input Context Window: the number of tokens supported by the input context window. | 10,000 tokens |
| Maximum Output Tokens: the number of tokens that can be generated by the model in a single request. | N/A |
| Open Source: whether the model's code is available for public use. | No |
| Release Date: when the model was first released. | Unknown |
| Knowledge Cut-off Date: when the model's knowledge was last updated. | Unknown |
| API Providers: the providers that offer this model. This is not an exhaustive list. | Google, Gemini API |
| Modalities: types of data this model can process. | Image, Text |

What is Imagen 4 Ultra

A fuller summary of positioning, capabilities, and source-specific details for Imagen 4 Ultra.

Imagen 4 Ultra is Google's flagship image generation model and the top tier of the Imagen 4 family, trained through early 2025. It accepts text prompts of up to 10,000 tokens and is designed to handle complex, multi-element descriptions including specific art styles, multi-scene compositions, and nuanced visual storytelling. The model supports image URL arrays as input, allowing users to reference existing images alongside text prompts. It is licensed for commercial use, making it available to businesses and creative professionals working on production-grade projects.

Imagen 4 Ultra is best suited for use cases where image fidelity and detail are priorities, such as professional design work, advertising, and high-resolution visual content creation. It covers a wide range of output styles, from photorealistic portraits and landscapes to stylized illustrations and pixel art. According to community benchmarking discussions, Imagen 4 Ultra has achieved competitive Elo ratings in image arenas, including a reported tie with GPT-Image-1 in the Image Arena as of mid-2025. The model is accessible via the Google Gemini API as well as third-party inference platforms such as fal.ai.
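To make the reported Elo tie concrete, the sketch below computes the expected head-to-head win rate from two ratings. It assumes the standard Elo logistic formula with a 400-point scale; the source does not say how the arena derives its ratings, so the function name and the scale are illustrative assumptions rather than the arena's method.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic curve.

    Assumption: a 400-point scale, where a 400-point lead means 10:1 odds.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Two models with identical ratings are each expected to win half of their
# blind comparisons, which is what a "tie" on the leaderboard implies.
tie_expectation = elo_expected_score(1200.0, 1200.0)
```

Under this formula the rating values themselves do not matter, only their difference, so a tie says nothing about absolute quality beyond parity between the two models being compared.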

Capabilities

What Imagen 4 Ultra supports

Text-to-Image Generation

Generates images from text prompts with up to 10,000 tokens, enabling detailed and complex scene descriptions.

Image URL Input

Accepts arrays of image URLs as input, allowing reference images to be passed alongside text prompts for guided generation.

Style Selection

Supports a select input type for specifying output styles, covering photorealistic, illustrated, and stylized visual modes.

Commercial Licensing

Licensed for commercial applications, making generated images usable in business and professional production contexts.

High-Resolution Output

Produces high-resolution images suited for professional and commercial use cases where detail and fidelity are required.

API Access

Available via the Google Gemini API and third-party platforms like fal.ai, with documented endpoints for programmatic integration.

Pricing for Imagen 4 Ultra

Primary API pricing shown in the same “quick compare” spirit as the reference page.

Price Comparison

Additional usage-cost dimensions synced into the project for this model.

maxTemperature 1

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

Google Gemini API

Configuration & Parameters

The configurable options currently documented for this model.

Source Images

Image URL Array

To edit an existing image, provide its URL(s) or variables.

Aspect Ratio

Type: Select
Default: 16:9
Options: 1:1, 16:9, 9:16, 3:4, 4:3, 2:3, 3:2
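
The two documented parameters can be checked client-side before a request is sent. The following is a minimal sketch, not the Gemini API's actual request format: the function `build_request` and the payload keys `prompt`, `source_images`, and `aspect_ratio` are hypothetical names chosen for illustration. Only the list of aspect ratios and the 16:9 default come from the listing above.

```python
from urllib.parse import urlparse

# Aspect ratios and default taken from the parameter listing above.
SUPPORTED_ASPECT_RATIOS = ("1:1", "16:9", "9:16", "3:4", "4:3", "2:3", "3:2")
DEFAULT_ASPECT_RATIO = "16:9"


def build_request(prompt, source_images=None, aspect_ratio=DEFAULT_ASPECT_RATIO):
    """Return a request dict after checking it against the documented options.

    The key names are hypothetical; map them onto whatever your provider expects.
    """
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if aspect_ratio not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")

    urls = list(source_images or [])
    for url in urls:
        parsed = urlparse(url)
        # Source images are passed as URLs, so require an http(s) scheme and a host.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not a valid image URL: {url!r}")

    payload = {"prompt": prompt.strip(), "aspect_ratio": aspect_ratio}
    if urls:
        # Only include the array when reference images were actually supplied.
        payload["source_images"] = urls
    return payload
```

Calling `build_request("a lighthouse at dusk")` falls back to the 16:9 default, while an unlisted ratio such as `"5:4"` raises a `ValueError` before any network call is made.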

Supported Request Parameters

Parameters currently listed by OpenRouter or the local catalog for this model.

Source Images, Aspect Ratio

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

Community discussion

What people think about Imagen 4 Ultra

Imagen 4 Ultra discussions are most active in r/Bard, r/GeminiAI, and r/singularity. Top Reddit threads cluster around benchmarks and model comparisons, along with safety and censorship questions.

The strongest match in this snapshot has 347 upvotes and 62 comments.

Alibaba has officially ended 2025 by releasing **Qwen-Image-2512**, currently the world’s strongest open-source text-to-image model. Benchmarks from the AI Arena confirm it is now performing within the same tier as Google’s flagship proprietary models.

**The Performance Data:** In over 10,000 blind evaluation rounds, **Qwen-Image-2512** is effectively matching Imagen 4 Ultra and challenging **Gemini 3 Pro**.

This is the **first time** an open-source weights model has consistently rivaled the top three closed-source giants in visual fidelity.

**Key Upgrades:**

**Skin & Hair Realism:** The model features a specific architectural update to reduce the **"AI plastic look"** focusing on natural skin pores and realistic hair textures.

**Complex Material Rendering:** Significant improvements in difficult-to-render textures like water ripples, landscapes and animal fur.

**Layout & Text Quality:** Building on the Qwen-VL foundation, it handles multi-line text and professional-grade layout composition with high precision.

**Open Weights Availability:** True to their roadmap, Alibaba has open-sourced the model **weights** under the Apache 2.0 license, making them available on Hugging Face and ModelScope for immediate local deployment.

[Source: Qwen Blog](https://qwen.ai/blog?id=qwen-image-2512)
[Source: Hugging Face Repository](https://huggingface.co/unsloth/Qwen-Image-2512-GGUF)

Google's Imagen 4 and Imagen 4 Ultra are being sunset on June 30 but are essentially the only models out there that can reliably output a convincing 1990s "Disney renaissance" look, with the blurry-edge shading that defines the [CAPS](https://en.wikipedia.org/wiki/Computer_Animation_Production_System)-style of that era. So I'm trying to distill it into something that can be used until I come across another model that can do this.

I've made my first Illustrious 2.0 LoRA (through TensorArt because my graphics card is busted and I already had an account with them since before they started censoring everything) with a purely Imagen 4-generated 100 image dataset of 16:9, 1408x768 graphics. I did Repeat 3 / Epoch 10 = 2910 steps. Auto-labelled with "wd-v1-4-vit-tagger-v2". And the resulting images absolutely do capture the style, but... the result is a little wonky, it's got random artifacts, often shitty lines, weird eyes, IDK, the way AI gen looked like 2 years ago? Back when "AI slop" didn't mean it looked too polished, but that it actually looked sloppy?

It'd be easy to just jump back in and add more images, do more steps, but I've already wasted nearly $10 so I'd be so thankful if somebody with more experience could hint what I might be doing wrong. Should I use Imagen 4 ultra images for training instead? They tend to be a little sharper and I can get at 2x the resolution, though they cost $0.06 per image. Or should I try and automate some de-noising or upscaling or sharpening of the training set I already have? Or is like... my LoRA essentially fine and what is vexing me is just the limitations of using an older local model like Illustrious 2.0?

Edit: also tried doing a Qwen Image Edit 2511 LoRA (through FAL's trainer) that would just change the character, but the results were not great there either.

EDIT2: After a lot of back and forth I realized what's bothering me is probably just that Illustrious is a very out of date model that's pretty far behind the curve. I re-evaluated my Qwen Image Edit 2511 LoRA and while it does also edit the background (despite me not touching the backgrounds at all in the pairs!) it's actually really good for getting the character design right, so I guess I'll just fix the backgrounds manually instead.

FAQ

Common questions about Imagen 4 Ultra

What is the context window for Imagen 4 Ultra?

Imagen 4 Ultra supports a context window of 10,000 tokens, which applies to the text prompt input describing the desired image.
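
A long prompt can be checked against this limit before submission. The sketch below is a rough pre-flight check only: the model's tokenizer is not described in this listing, so the 4-characters-per-token ratio is an assumption (a common rule of thumb for English text), and `estimate_tokens` and `fits_prompt_budget` are hypothetical helper names.

```python
MAX_PROMPT_TOKENS = 10_000  # limit stated in this listing
CHARS_PER_TOKEN = 4  # assumption: rough average for English, not the model's tokenizer


def estimate_tokens(prompt: str) -> int:
    """Rough token estimate, rounded up to the next whole token."""
    return -(-len(prompt) // CHARS_PER_TOKEN)  # ceiling division


def fits_prompt_budget(prompt: str, limit: int = MAX_PROMPT_TOKENS) -> bool:
    """Return True when the estimated token count is within the limit."""
    return estimate_tokens(prompt) <= limit
```

Because the estimate is approximate, it is safer to leave headroom below the limit than to rely on a prompt that sits right at the boundary.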

Is Imagen 4 Ultra licensed for commercial use?

Yes, Imagen 4 Ultra is licensed for commercial applications, making it suitable for businesses and creative professionals producing commercial content.

When was Imagen 4 Ultra trained?

The model description states that Imagen 4 Ultra was trained through early 2025, although the overview metadata lists the exact knowledge cut-off date as unknown.

Where can I find pricing information for Imagen 4 Ultra?

Pricing for Imagen 4 Ultra via the Google Gemini API is listed on the Google Gemini API pricing page at ai.google.dev/gemini-api/docs/pricing#imagen.

What input types does Imagen 4 Ultra accept?

Imagen 4 Ultra accepts image URL arrays and select-type inputs, in addition to text prompts, allowing users to provide reference images and specify style options alongside their descriptions.
