Text-to-Image Generation
Generates images from text prompts using the FLUX.1-schnell architecture with 8.9 billion parameters. Supports output resolutions up to 1536×1536 pixels.
Chroma is an 8.9 billion parameter text-to-image model developed by WaveSpeed AI, built on the FLUX.1-schnell architecture. It was trained using over 105,000 hours of NVIDIA H100 GPU time, with a dataset curated from 5 million selected images. The model is designed around a philosophy of unrestricted creative expression, removing the content filters found on many mainstream image generation platforms. It supports image output up to 1536×1536 pixels and is noted for clean renders, natural lighting, strong color harmony, and anatomical accuracy in human figures, hands, and faces.
Chroma is well-suited for commercial photography, digital illustration, character design, concept art, and medical or educational illustration where content restrictions would otherwise be a barrier. It handles complex, multi-element scenes involving people, props, and environments with strong prompt adherence. The model responds particularly well to structured prompts organized around subject, context, style, lighting, camera, and mood. It is available through WaveSpeed AI and is optimized for both single-shot and batch generation workflows.
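The structured-prompt advice above (subject, context, style, lighting, camera, mood) can be sketched as a small helper. This is a minimal illustration, not part of any Chroma or WaveSpeed AI tooling: the `StructuredPrompt` class, its field order, and the comma-joined output format are all assumptions made for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class StructuredPrompt:
    """Hypothetical container for the six prompt components named above."""
    subject: str
    context: str = ""
    style: str = ""
    lighting: str = ""
    camera: str = ""
    mood: str = ""

    def render(self) -> str:
        # Join the non-empty components in declaration order.
        parts = [getattr(self, f.name).strip() for f in fields(self)]
        return ", ".join(p for p in parts if p)


prompt = StructuredPrompt(
    subject="a ceramicist shaping a bowl",
    context="in a sunlit studio",
    style="editorial photography",
    lighting="soft window light",
    camera="50mm lens, shallow depth of field",
    mood="calm and focused",
)
print(prompt.render())
```

Keeping the components as separate fields makes it easy to vary one of them (for example, lighting) while holding the rest of the prompt fixed.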
Produces images at resolutions up to 1536×1536 pixels, configurable via numeric width and height inputs. Suitable for commercial and print-quality use cases.
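Because width and height are numeric inputs capped at 1536 pixels, a client may want to check them before submitting a request. The sketch below is an assumption-laden example: `validate_dimensions` is a hypothetical helper, and the lower bound of 1 pixel is chosen only for illustration, since this page documents the upper limit alone.

```python
from typing import Tuple

MAX_SIDE = 1536  # maximum width/height stated on this page


def validate_dimensions(width: int, height: int, max_side: int = MAX_SIDE) -> Tuple[int, int]:
    """Return (width, height) unchanged if both are valid, else raise."""
    for name, value in (("width", width), ("height", height)):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an integer, got {type(value).__name__}")
        # The lower bound of 1 is an assumption; only the upper bound is documented.
        if not 1 <= value <= max_side:
            raise ValueError(f"{name}={value} is outside 1..{max_side}")
    return width, height


print(validate_dimensions(1536, 1024))
```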
Accepts a seed input to enable deterministic image generation, allowing the same prompt and seed combination to reproduce consistent results across runs.
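The general idea behind seeded generation can be shown without the model itself. The sketch below is not Chroma's sampler: it uses Python's standard `random` module as a stand-in to show that a generator initialized with the same seed yields the same starting noise, which is what makes a seeded run repeatable. The function name `initial_noise` and the sample size are assumptions for the example.

```python
import random
from typing import List


def initial_noise(seed: int, size: int = 4) -> List[float]:
    """Draw `size` Gaussian samples from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator; does not touch global state
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


same_a = initial_noise(seed=42)
same_b = initial_noise(seed=42)
other = initial_noise(seed=43)

print(same_a == same_b)  # same seed: the two noise vectors match
print(same_a == other)   # different seed: the noise vectors differ
```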
Trained with a curated dataset of 5 million images to improve rendering of human figures, hands, and faces with reduced distortion artifacts.
Operates without the content restrictions present on many mainstream platforms, enabling mature artistic, medical, and experimental creative work.
Optimized for consistent generation across both single-shot and batch workflows, making it practical for high-volume creative production pipelines.
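For batch workflows, one common pattern is to prepare all request payloads up front and give each a distinct, predictable seed so any single image can be regenerated later. The sketch below only builds the payloads and makes no network call. The field names (`prompt`, `width`, `height`, `seed`) mirror the inputs described on this page, but their exact spelling in any real API is an assumption, as is the `build_batch` helper.

```python
from typing import Dict, List, Union

Payload = Dict[str, Union[str, int]]


def build_batch(prompts: List[str], width: int, height: int, base_seed: int) -> List[Payload]:
    """Build one payload per prompt, offsetting the seed by the prompt's index."""
    return [
        # Hypothetical field names; check the provider's documentation.
        {"prompt": p, "width": width, "height": height, "seed": base_seed + i}
        for i, p in enumerate(prompts)
    ]


batch = build_batch(
    ["a lighthouse at dusk", "a market street in the rain"],
    width=1024,
    height=1024,
    base_seed=1000,
)
for payload in batch:
    print(payload)
```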
The configurable options currently documented for this model:
Seed: a specific value that is used to guide the randomness of the generation.
Chroma discussions are most active in r/leagueoflegends, r/Warframe, r/marvelrivals. Top Reddit threads cluster around benchmark and model-comparison threads, safety and censorship questions.
The strongest match in this snapshot has 44516 upvotes and 217 comments.
I still use Chroma for its prompt adherence and because it's totally uncensored, and use Klein to refine. I'm just wondering if there is something newer that is as uncensored as Chroma, or more so?
I know it's asking a lot, but it'd be nice to see a model that can handle a prompt describing three or more characters
Big boi Chroma is with us since 2015, he's well known as employee of 'Double Cred.' Ind. and as Dad-Frame for babytennos.
After many years his Vex Armor & Effigy [double credit reasons] kept him useful till this day, but rest of his kit feels 'underwhelming' for him,
Therefore, for every Chromalover, we shall push the agenda forward, create art, create suggestion, spread awareness, do whatever your hearts desire and we shall rise on our wings!!
Feel free to share your love for Chroma :))
#JoinTheChromalution
#ReworkMyBoi
I know that because I'm using the flash lora my results are always going to be bad, but people constantly call Chroma a hidden gem or their favorite model, and it seems impossible to get anything that actually looks good. Using the same prompts you would use on Z-Image Turbo or Base gives results that look like a wax figure. Non-photorealistic outputs always look alright at best. At ~30it/s it's incredibly slow as well. Am I missing something? I know some people use it for porn, but I'm certain that even SDXL models would give better results if that's what you want.
I am trying to understand why people are excited about Chroma. For photorealistic images I get improper faces, it takes too long, and the quality is only OK.
I use ComfyUI.
What is the use case of Chroma? Am I using it wrong?
I've been active on this sub basically since SD 1.5, and whenever something new comes out that ranges from "doesn't totally suck" to "Amazing," it gets wall to wall threads blanketing the entire sub during what I've come to view as a new model "Honeymoon" phase.
All a model needs to get this kind of attention is to meet the following criteria:
1: new in a way that makes it unique
2: can be run on consumer gpus reasonably
3: at least a 6/10 in terms of how good it is.
So far, anything that meets these 3 gets plastered all over this sub.
The one exception is Chroma, a model I've sporadically seen mentioned on here but never gave much attention to until someone impressed upon me how great it is in discord.
And yeah. This is it. This is Pony Flux. It's what would happen if you could type NLP Flux prompts into Pony.
I am incredibly impressed. With popular community support, this could EASILY dethrone all the other image gen models even hidream.
I like hidream too. But you need a lora for basically EVERYTHING in that and I'm tired of having to train one for every naughty idea.
Hidream also generates the exact same shit every time no matter the seed with only tiny differences. And despite using 4 different text encoders, it can only reliably do 127 tokens of input before it loses coherence. Seriously though all that vram on text encoders so you can enter like 4 fucking sentences at the most before it starts forgetting. I have no idea what they were thinking there.
Hidream DOES have better quality than Chroma but with community support Chroma could EASILY be the best of the best
Chroma has a context window of 10,000 tokens, which governs the length of text prompts it can process when generating images.
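A client can guard against over-long prompts before sending them. The tokenizer Chroma uses is not documented on this page, so the sketch below takes the token counter as a caller-supplied function. The `word_count` stand-in is only a crude proxy, since real tokenizers typically produce more tokens than whitespace-separated words. Both helper names are assumptions.

```python
from typing import Callable

CONTEXT_LIMIT = 10_000  # token limit stated on this page


def fits_context(prompt: str, count_tokens: Callable[[str], int], limit: int = CONTEXT_LIMIT) -> bool:
    """Return True if the prompt's token count is within the limit."""
    return count_tokens(prompt) <= limit


def word_count(text: str) -> int:
    # Stand-in counter (assumption): swap in the model's real tokenizer if available.
    return len(text.split())


print(fits_context("a red fox in fresh snow, soft morning light", word_count))
```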
Chroma is built on the FLUX.1-schnell architecture and has 8.9 billion parameters. It was trained using over 105,000 hours of NVIDIA H100 GPU time.
Chroma supports image output up to 1536×1536 pixels. Width and height are configurable via numeric inputs.
According to the model metadata, Chroma's training date is listed as October 2025. Its dataset was curated from a pool of 5 million selected images.
Chroma is described as an uncensored model, meaning it does not apply the content filters common to many mainstream image generation platforms. It is intended for artists, designers, medical illustrators, and other professionals who require unrestricted creative output.
Chroma accepts a seed input. Using the same prompt, dimensions, and seed value will produce consistent, reproducible image outputs across generation runs.
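Since prompt, dimensions, and seed together determine the output, those four values can serve as a key for caching or for checking whether two runs should match. The sketch below is a hypothetical helper, not part of any provider SDK: it hashes a canonical JSON encoding of the inputs with SHA-256 from the standard library.

```python
import hashlib
import json


def request_fingerprint(prompt: str, width: int, height: int, seed: int) -> str:
    """Return a stable hex digest identifying one reproducible request."""
    # Sorted keys give the same byte string for the same inputs every time.
    canonical = json.dumps(
        {"prompt": prompt, "width": width, "height": height, "seed": seed},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = request_fingerprint("a lighthouse at dusk", 1024, 1024, seed=7)
b = request_fingerprint("a lighthouse at dusk", 1024, 1024, seed=7)
c = request_fingerprint("a lighthouse at dusk", 1024, 1024, seed=8)
print(a == b, a == c)
```

Two requests with equal fingerprints are expected to reproduce the same image, while any change to the prompt, size, or seed yields a different fingerprint.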