Stability

Stable Image Ultra

Stable Image Ultra is Stability AI's flagship text-to-image model, designed to produce high-quality, photorealistic images from natural-language prompts. It sits at the top of Stability AI's image generation lineup and accepts concise text descriptions, up to a 77-token context window, to generate detailed visuals with strong coherence and fidelity. Configurable inputs include the text prompt, a negative prompt, select-type parameters (aspect ratio and output format), and a seed value for reproducible outputs. Stable Image Ultra is well suited to marketing visuals, concept art, product visualization, and editorial illustration. It is available through Stability AI's own API and through Amazon Bedrock, so developers and enterprises can integrate it into existing workflows and deploy at production scale without managing infrastructure.

Training data cutoff: not publicly disclosed in the available metadata for this model.
Context: 77 tokens. Output: N/A.
Text-to-Image Generation, Seed-Based Reproducibility, Configurable Output Options, API & Cloud Integration, Photorealistic Detail

Model Overview

High-signal model metadata in a structured two-column overview table.

Provider

The entity that provides this model.

Stability

Input Context Window

The number of tokens supported by the input context window.

77 tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

N/A (the model outputs images, not text tokens)

Open Source

Whether the model's code is available for public use.

No

Release Date

When the model was first released.

A release date is not listed in the available metadata for this model.

Knowledge Cut-off Date

When the model's knowledge was last updated.

A specific training data cutoff date is not publicly disclosed in the available metadata for this model.

API Providers

The providers that offer this model. This is not an exhaustive list.

Amazon Bedrock

Modalities

Types of data this model can process.

Image, Text

Capabilities

What Stable Image Ultra supports

Text-to-Image Generation

Converts natural language text prompts into photorealistic images. Accepts up to 77 tokens of prompt input per request.

Seed-Based Reproducibility

Accepts a numeric seed value as input to produce consistent, reproducible image outputs across multiple runs with the same prompt.

Configurable Output Options

Supports multiple select-type parameters to control generation settings such as output format and aspect ratio.

API & Cloud Integration

Available via Stability AI's REST API and Amazon Bedrock, enabling production-scale deployment without managing the underlying infrastructure.

Photorealistic Detail

Generates images with high visual fidelity and coherence, including fine detail in textures, lighting, and composition from short prompts.

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

Amazon Bedrock
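
As a rough illustration of what access through Amazon Bedrock can look like, the Python sketch below builds a request body and decodes a response. The model ID, region, JSON field names, and the `images` response key are assumptions that are not stated on this page, and the helper names are hypothetical; check the Bedrock documentation before relying on any of them.

```python
import base64
import json

# Assumptions for illustration: confirm the model ID and region in the Bedrock console.
MODEL_ID = "stability.stable-image-ultra-v1:0"
REGION = "us-west-2"


def build_body(prompt, negative_prompt=None, aspect_ratio="1:1", seed=0, output_format="png"):
    """Serialize the options documented on this page into a JSON request body."""
    body = {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "seed": seed,  # 0 asks for a random seed
        "output_format": output_format,
    }
    if negative_prompt:
        body["negative_prompt"] = negative_prompt
    return json.dumps(body)


def decode_first_image(response_json):
    """Return the bytes of the first image, assuming {"images": ["<base64>", ...]}."""
    payload = json.loads(response_json)
    return base64.b64decode(payload["images"][0])


def generate(prompt, **options):
    """Invoke the model and return image bytes. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("bedrock-runtime", region_name=REGION)
    response = client.invoke_model(modelId=MODEL_ID, body=build_body(prompt, **options))
    return decode_first_image(response["body"].read())
```

Only `generate` touches the network; the two helpers can be exercised offline to inspect the payload shape.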

Configuration & Parameters

The configurable options currently documented for this model.

Negative Prompt

Text

A blurb of text describing what you do not wish to see in the output image.

Aspect Ratio

Select
Default: 1:1
Options: 1:1, 2:3, 3:2, 4:5, 5:4, 9:16, 9:21, 16:9, 21:9

Seed

Number

A specific value that is used to guide the 'randomness' of the generation. Omit this parameter or pass 0 to use a random seed.

Output Format

Select
Default: png
Options: jpeg, png, webp
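
The options above could be checked client-side before a request is sent. The following is a minimal sketch, not an official client: `build_params` and the snake_case key names are hypothetical, while the allowed values and defaults are the ones listed above.

```python
ASPECT_RATIOS = ("1:1", "2:3", "3:2", "4:5", "5:4", "9:16", "9:21", "16:9", "21:9")
OUTPUT_FORMATS = ("jpeg", "png", "webp")


def build_params(negative_prompt=None, aspect_ratio="1:1", seed=None, output_format="png"):
    """Validate the documented options and return them as a dict.

    The key names are an assumption made for illustration.
    """
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output_format!r}")

    params = {"aspect_ratio": aspect_ratio, "output_format": output_format}
    if negative_prompt:
        params["negative_prompt"] = negative_prompt

    # Omitting the seed or passing 0 requests a random seed,
    # so only non-zero seeds are forwarded.
    if seed:
        if not isinstance(seed, int) or seed < 0:
            raise ValueError("seed must be a non-negative integer")
        params["seed"] = seed
    return params
```

Reusing the same non-zero seed with the same prompt and options is what makes a generation repeatable; leaving it out gives a fresh image each time.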

Supported Request Parameters

Parameters currently listed by OpenRouter or the local catalog for this model.

Negative Prompt, Aspect Ratio, Seed, Output Format
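
Because the aspect ratio is picked from a fixed list rather than given as pixel dimensions, a caller holding a target width and height has to map it to the nearest listed ratio. One possible approach, sketched here with a hypothetical helper `closest_aspect_ratio`, compares ratios on a logarithmic scale so that wide and tall shapes are treated symmetrically.

```python
import math

# Ratios copied from the Aspect Ratio options listed on this page.
SUPPORTED = ("1:1", "2:3", "3:2", "4:5", "5:4", "9:16", "9:21", "16:9", "21:9")


def closest_aspect_ratio(width, height):
    """Return the supported ratio label closest to width/height on a log scale."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)

    def distance(label):
        w, h = (int(part) for part in label.split(":"))
        return abs(math.log(w / h) - target)

    return min(SUPPORTED, key=distance)
```

For example, `closest_aspect_ratio(1920, 1080)` returns `"16:9"`, and a 1080 by 1350 portrait maps to `"4:5"`.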

Community discussion

What people think about Stable Image Ultra

Discussion of Stable Image Ultra is most active in r/StableDiffusion. The top Reddit threads are mostly benchmarks and model comparisons. The strongest match in this snapshot has 65 upvotes and 41 comments.

I'm not very good at posting news, but since nobody shared yet.

They are releasing 3 models on Amazon Bedrock; no news about weights yet.
The models are: **Stable Image Ultra, Stable Diffusion 3 Large and Stable Image Core**

1. Stable Image Ultra: Photorealistic, Large-Scale Output
* Ideal For: Ultra-realistic imagery for luxury brands and high-end campaigns.
* Use Case Example: A luxury brand uses Stable Image Ultra to create stunning visuals of its latest collection for magazine spreads, ensuring a premium feel that matches its high standards.

2. Stable Diffusion 3 Large: High-Quality, High-Quantity Creative Assets
* Ideal For: High-volume outputs like marketing campaigns and digital assets.
* Use Case Example: A game development team uses SD3 Large to create detailed environmental textures and character concepts, accelerating their creative pipeline.

3. Stable Image Core: Fast and Affordable
* Ideal For: Rapid content generation at scale. Optimized for speedy image generation.
* Use Case Example: An online retailer uses Stable Image Core to quickly generate product images for new arrivals, allowing it to list items faster and keep its catalog up-to-date.

More on: [https://stability.ai/news/stability-ais-top-3-text-to-image-models-now-available-in-amazon-bedrock](https://stability.ai/news/stability-ais-top-3-text-to-image-models-now-available-in-amazon-bedrock)

Hope to see these model weights any time now.
I'm enjoying Flux a lot these days, and I really don't know what to expect from this model. I'm a little frustrated with the latest news I've had about Stability, but I still recognize them for being one of those who started the movement.

r/StableDiffusion, 21 upvotes, 28 comments, August 2, 2024
Flux Schnell vs SD3 Large vs SD Image Ultra vs Midjourney 6.1

Didn't see many comparisons with SD3 Large for Flux so decided to do one myself.

Summary of models:

* **Flux Schnell**
  * Apache 2.0 license, full commercial use allowed, finetunes allowed, pretty much completely open and free
  * The only one of the Flux models to allow commercial use / creation of Finetunes & LoRAs without a special license
* **SD3 Large**
  * Unreleased for local gen, but if Stability holds true to their claims (they haven't lied yet) it will eventually be released under their Creator License (free for those with <$1mill revenue, paid license otherwise)
* **SD Image Ultra**
  * Most expensive offering from Stability, they claim this is their top-of-the-line
  * API Only
* **Midjourney**
  * v6.1 model is brand new, just released
  * API Only

I added in SD Image Ultra and Midjourney just for fun since I already had Midjourney credits & had left-over Stability credits after doing the SD3 large tests

# Prompts

I did 3 prompts. I created 4 images from each prompt (always annoyed by those who generate only 1 image in their comparisons). I used a negative prompt of "blurry, low quality, low resolution" in all prompts.

Prompt 1:

`A woman in hiking gear with cargo shorts, a backpack, and black leather boots, standing on a cliff overlooking a valley of lush green foliage, trees, and a river. It is evening and the lights of a small village along the bank of the river twinkle in the darkness.`

Prompt 2:

`A photo taken from behind a man and a woman standing at the helm of a boat. A series of other boats are docked in the bay, looking out as blue and red fireworks illuminate the night sky.`

Prompt 3:

`A photo taken from over a man's shoulder, the man is standing, a woman is running towards him from a long distance away. Car headlights illuminate the woman from behind. Dark, creepy trees, mud, and fog abound.`

# Prompt 1:

`A woman in hiking gear with cargo shorts, a backpack, and black leather boots, standing on a cliff overlooking a valley of lush green foliage, trees, and a river. It is evening and the lights of a small village along the bank of the river twinkle in the darkness.`

# Flux Schnell

https://preview.redd.it/0mbbdhlr2agd1.png?width=2048&format=png&auto=webp&s=481e60b54145f56a60f7d6e38f5622fb831a3713

# SD3 Large

https://preview.redd.it/1xwt60ts2agd1.png?width=2048&format=png&auto=webp&s=8115cb3f552e9f49990e497435cfb7999701adf3

# Stable Image Ultra

https://preview.redd.it/b6q80gpt2agd1.png?width=2048&format=png&auto=webp&s=63e583e2b957d765f4bf8663da7ff6928ca892da

# Midjourney

https://preview.redd.it/yqffrhnu2agd1.png?width=2048&format=png&auto=webp&s=dff15152d3dd733695fc690d83929b8aee0c8db3

# Prompt 2:

`A photo taken from behind a man and a woman standing at the helm of a boat. A series of other boats are docked in the bay, looking out as blue and red fireworks illuminate the night sky.`

# Flux Schnell

https://preview.redd.it/53gl7ilv2agd1.png?width=2048&format=png&auto=webp&s=7c62294536373097ee4333ccd8e4360b8f902c32

# SD3 Large

https://preview.redd.it/kfk3junw2agd1.png?width=2048&format=png&auto=webp&s=b0f7537cd592de9afd01529c6818c2f1875198e2

# Stable Image Ultra

https://preview.redd.it/h9vjl9jx2agd1.png?width=2048&format=png&auto=webp&s=963847f9456ec12e496e64690405614333a16d30

# Midjourney

https://preview.redd.it/d5cfopcy2agd1.png?width=2048&format=png&auto=webp&s=a85970ec74ec2092d9bef680ed577e858b6a86eb

# Prompt 3:

`A photo taken from over a man's shoulder, the man is standing, a woman is running towards him from a long distance away. Car headlights illuminate the woman from behind. Dark, creepy trees, mud, and fog abound.`

# Flux Schnell

https://preview.redd.it/9yvoxemz2agd1.png?width=2048&format=png&auto=webp&s=d4a93ce85568523219e9d6ad48f29b0049c2a288

# SD3 Large

https://preview.redd.it/10zo78k03agd1.png?width=2048&format=png&auto=webp&s=3725515cd582f7577e1af3fff9c1d7f83b16752e

# Stable Image Ultra

https://preview.redd.it/ghnjjji13agd1.png?width=2048&format=png&auto=webp&s=4ced2b42bfbe5af6b5b8a2e3d1d6b99fe478d12f

# Midjourney

https://preview.redd.it/a53xucf23agd1.png?width=2048&format=png&auto=webp&s=542077c8e948598a4fcd626711688425616921a2

FAQ

Common questions about Stable Image Ultra

What is the context window for Stable Image Ultra?

Stable Image Ultra has a context window of 77 tokens, meaning text prompts should be concise and within that token limit for best results.

How is Stable Image Ultra priced?

Pricing is listed on CloudPrice's Model Pricing & Specs page and may differ depending on whether you access the model through Stability AI's own API or through Amazon Bedrock. Check those sources for current per-image rates.

Does Stable Image Ultra have a training data cutoff date?

A specific training data cutoff date is not publicly disclosed in the available metadata for this model.

What platforms can I use to access Stable Image Ultra?

Stable Image Ultra is available through Stability AI's own API and through Amazon Bedrock, so developers and enterprises can integrate it into applications without managing infrastructure directly.
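
For the Stability AI route, a request might be assembled roughly as follows. This is a hedged sketch rather than an official client: the endpoint URL, header names, and form field names are assumptions to verify against Stability AI's API reference, and `build_request` and `generate` are hypothetical helper names.

```python
# Assumption: endpoint path as understood from Stability AI's public docs; verify before use.
API_URL = "https://api.stability.ai/v2beta/stable-image/generate/ultra"


def build_request(api_key, prompt, **options):
    """Return (headers, form_fields) for a multipart/form-data POST."""
    headers = {"authorization": f"Bearer {api_key}", "accept": "image/*"}
    fields = {"prompt": prompt}
    # Drop options left as None so the service applies its own defaults.
    fields.update({key: str(value) for key, value in options.items() if value is not None})
    return headers, fields


def generate(api_key, prompt, **options):
    """Send the request and return image bytes. Needs the third-party requests package."""
    import requests  # lazy import keeps build_request usable without it

    headers, fields = build_request(api_key, prompt, **options)
    # files={"none": ""} makes requests encode the body as multipart/form-data.
    response = requests.post(
        API_URL, headers=headers, files={"none": ""}, data=fields, timeout=120
    )
    response.raise_for_status()
    return response.content
```

Keeping request construction separate from the network call makes it easy to inspect exactly which fields would be sent.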

What input types does Stable Image Ultra accept?

The model accepts text prompts, selection-based parameters (such as output format or aspect ratio), and a seed value for reproducible generation.
