SDXL LoRA

Model Overview

Provider

The entity that provides this model.

Stability

Input Context Window

The number of tokens supported by the input context window.

10,000 tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

N/A

Open Source

Whether the model's code is available for public use.

Release Date

When the model was first released.

Jul 04, 2023

Knowledge Cut-off Date

When the model's knowledge was last updated.

Unknown

API Providers

The providers that offer this model. This is not an exhaustive list.

Hugging Face

Modalities

Types of data this model can process.

Image Text Code

What is SDXL LoRA

A fuller summary of positioning, capabilities, and source-specific details for SDXL LoRA.

SDXL LoRA is a text-to-image generative AI model developed by Stability AI as a successor to earlier Stable Diffusion releases. It runs on a 3.5 billion parameter architecture and generates images natively at 1024×1024 resolution. Two text encoders, OpenCLIP-ViT/G and CLIP-ViT/L, interpret complex prompts, with a reported 89% prompt adherence in benchmark testing. The model also supports an optional refiner stage that applies an ensemble-of-experts approach to add fine detail to generated outputs.

What distinguishes SDXL LoRA from the base SDXL model is its built-in support for Low-Rank Adaptation (LoRA), a technique that enables efficient style and subject customization without full model retraining. Users can apply up to five LoRA adapters simultaneously, making it practical for tasks like consistent character design, brand-specific imagery, and specialized artistic styles. It is well-suited for digital artists, marketing teams, game developers, and product designers who need repeatable, customizable visual output at scale.
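As a rough illustration of why Low-Rank Adaptation avoids full retraining, the sketch below applies the standard LoRA update W' = W + (alpha / r) · B·A to a toy matrix in plain Python. The function names, the `scale` argument, and the toy values are assumptions made for illustration; none of them come from Stability AI's code.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_down, lora_up, alpha, scale=1.0):
    """Return weight + scale * (alpha / rank) * (lora_up @ lora_down).

    weight:    d_out x d_in  frozen base matrix
    lora_down: rank  x d_in  (the "A" matrix in the LoRA formulation)
    lora_up:   d_out x rank  (the "B" matrix in the LoRA formulation)
    """
    rank = len(lora_down)
    delta = matmul(lora_up, lora_down)
    factor = scale * alpha / rank
    return [
        [w + factor * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


def lora_param_count(d_out, d_in, rank):
    """Return (adapter values, full-matrix values) for one weight matrix."""
    return rank * (d_in + d_out), d_out * d_in


# Toy example: a 2x3 base weight with a rank-1 adapter.
base = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
down = [[1.0, 2.0, 3.0]]   # rank 1 x d_in 3
up = [[0.5], [-0.5]]       # d_out 2 x rank 1
adapted = apply_lora(base, down, up, alpha=1.0)
```

For a hypothetical 1280×1280 projection at rank 64, `lora_param_count(1280, 1280, 64)` gives 163,840 adapter values against 1,638,400 in the full matrix. Stacking several adapters amounts to calling `apply_lora` once per adapter on the running result.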

Capabilities

What SDXL LoRA supports

Text-to-Image Generation

Generates images from text prompts at a native 1024×1024 resolution using a 3.5 billion parameter architecture with dual text encoders for prompt interpretation.

LoRA Style Customization

Applies Low-Rank Adaptation weights to customize the model's output style or subject without full retraining; supports stacking up to 5 LoRAs simultaneously.

Image-to-Image Transformation

Transforms an existing image guided by a text prompt, with adjustable prompt strength to control how much the output deviates from the source image.
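One common way to implement prompt strength, used for example by the Diffusers image-to-image pipelines, is to skip the early part of the denoising schedule: the source image is noised to an intermediate step and only the remaining steps are run. The sketch below shows that mapping. Whether the hosted endpoint uses exactly this rule is an assumption, and the function name is hypothetical.

```python
def img2img_steps(num_inference_steps, strength):
    """Map a strength in [0, 1] to (first_step_index, steps_actually_run).

    strength = 1.0 runs the full schedule and ignores most of the source image;
    strength = 0.0 runs no denoising steps and returns the source unchanged.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    steps_to_run = min(int(num_inference_steps * strength), num_inference_steps)
    first_step = num_inference_steps - steps_to_run
    return first_step, steps_to_run


# With the default of 28 inference steps, strength 0.5 runs the last 14 steps.
half = img2img_steps(28, 0.5)
```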

Inpainting

Fills or replaces specific masked regions of an image using text-guided generation, allowing targeted edits without regenerating the full image.
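The effect of a mask can be pictured as a per-pixel blend between the original image and the newly generated one. The sketch below shows only that final compositing idea on nested lists of grayscale values. Real inpainting pipelines typically work on latents and condition the denoiser on the mask, so treat this as a simplified illustration with hypothetical names.

```python
def composite(original, generated, mask):
    """Blend per pixel: mask 1.0 takes the generated value, 0.0 keeps the original."""
    return [
        [m * g + (1.0 - m) * o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


original = [[10.0, 20.0], [30.0, 40.0]]
generated = [[1.0, 2.0], [3.0, 4.0]]
mask = [[0.0, 1.0], [0.0, 0.0]]   # only the top-right pixel is masked for editing
result = composite(original, generated, mask)
```

Only the masked pixel changes; every unmasked pixel keeps its original value.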

Seed Control

Accepts a seed value as input to make image generation reproducible, enabling consistent outputs across repeated runs with the same prompt and settings.
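Seed control works because diffusion sampling starts from pseudo-random noise, and a seeded generator produces the same noise every time. The sketch below demonstrates the principle with Python's standard `random` module; the function name is hypothetical, and real pipelines draw a latent tensor from a seeded generator instead of a flat list.

```python
import random


def initial_noise(seed, count):
    """Draw `count` standard-normal samples from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


first = initial_noise(42, 8)
second = initial_noise(42, 8)    # same seed: identical starting noise
different = initial_noise(43, 8)  # different seed: different starting noise
```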

Optional Refiner Stage

Passes generated images through a secondary refiner model using an ensemble-of-experts approach to enhance fine detail and image sharpness.
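In the ensemble-of-experts setup, the base model handles the early, high-noise part of the denoising schedule and the refiner takes over for the final low-noise steps. The Diffusers library exposes this handoff through the `denoising_end` and `denoising_start` arguments. The sketch below only computes which steps each model would run; the function name and the 0.8 handoff fraction are assumptions for illustration.

```python
def split_steps(num_inference_steps, handoff=0.8):
    """Return (base_step_indices, refiner_step_indices) for a given handoff fraction."""
    if not 0.0 <= handoff <= 1.0:
        raise ValueError("handoff must be between 0 and 1")
    base_count = round(num_inference_steps * handoff)
    base = list(range(base_count))
    refiner = list(range(base_count, num_inference_steps))
    return base, refiner


# With 28 steps and a 0.8 handoff, the base runs 22 steps and the refiner 6.
base_steps, refiner_steps = split_steps(28, handoff=0.8)
```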

Pricing for SDXL LoRA

Input tokens: N/A (per million tokens)

Output tokens: N/A (per million tokens)

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

Hugging Face

Configuration & Parameters

The configurable options currently documented for this model.

Width

Number

Default: 1024 Range: 256 - 1536

Height

Number

Default: 1024 Range: 256 - 1536

LoRAs

LoRA

Up to 3 LoRAs.

Guidance Scale

Number

Default: 3.5 Range: 1 - 20 (step 0.1)

Inference Steps

Number

Default: 28 Range: 1 - 50 (step 1)

Negative Prompt

Text

Description of what to exclude from the image.

Seed

A value that initializes the random number generator. Reusing the same seed with the same prompt and settings reproduces the same output.
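The Guidance Scale and Negative Prompt parameters listed above interact through classifier-free guidance in Stable Diffusion style pipelines: the model makes one noise prediction for the prompt and one for the negative prompt (or an empty prompt), then extrapolates away from the negative one. The sketch below shows that combination on plain lists. The function name is hypothetical, and whether this endpoint applies guidance in exactly this form is an assumption.

```python
def guided_prediction(cond, negative, guidance_scale):
    """Combine two noise predictions: negative + scale * (cond - negative)."""
    return [n + guidance_scale * (c - n) for c, n in zip(cond, negative)]


cond = [1.0, 2.0]       # prediction conditioned on the prompt
negative = [0.0, 1.0]   # prediction conditioned on the negative prompt
guided = guided_prediction(cond, negative, guidance_scale=3.5)
```

A scale of 1 returns the prompt-conditioned prediction unchanged; larger values push the result further from whatever the negative prompt describes.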

Supported Request Parameters

Parameters currently listed by OpenRouter or the local catalog for this model.

Width Height LoRAs Guidance Scale Inference Steps Negative Prompt Seed
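The defaults and ranges documented above can be checked on the client before a request is sent. The sketch below is one way to do that. The field names are hypothetical, not the provider's actual request schema, and the limit of 3 LoRAs is taken from the parameter listing in this section.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GenerationRequest:
    """Hypothetical request object using the documented defaults."""
    prompt: str
    width: int = 1024
    height: int = 1024
    loras: List[str] = field(default_factory=list)
    guidance_scale: float = 3.5
    num_inference_steps: int = 28
    negative_prompt: str = ""
    seed: Optional[int] = None

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means valid."""
        errors = []
        for name in ("width", "height"):
            if not 256 <= getattr(self, name) <= 1536:
                errors.append(f"{name} must be between 256 and 1536")
        if len(self.loras) > 3:
            errors.append("at most 3 LoRAs are accepted")
        if not 1 <= self.guidance_scale <= 20:
            errors.append("guidance_scale must be between 1 and 20")
        if not 1 <= self.num_inference_steps <= 50:
            errors.append("num_inference_steps must be between 1 and 50")
        return errors


ok = GenerationRequest(prompt="a lighthouse at dusk").validate()
bad = GenerationRequest(prompt="x", width=2048, loras=["a", "b", "c", "d"]).validate()
```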

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

Other: Model Page on WaveSpeedAI

Announcements: Announcement Blog Post

Research: SDXL Technical Report (arXiv)

Announcements: Stability AI SDXL Blog Post

Documentation: SDXL on Hugging Face

Documentation: Diffusers SDXL Documentation

AI tools related to SDXL LoRA

These tools are strongly connected to SDXL LoRA through direct product references, provider mentions, or explicit model mappings.

AI Video Generator

Stability AI Video Generator

Stability AI Video Generator is a tool designed to create videos from static images. Users can upload files in SVG, PNG, JPG, or GIF formats to transform them into video content. This service is currently in a research preview phase and is intended for educational or creative use.

AI Image Generator

Stability AI

Stability AI is a company that develops open models for image, video, 3D, and audio generation. Its flagship product, Stable Diffusion, is a deep learning text-to-image model that generates detailed images conditioned on text descriptions. It can also be applied to tasks such as inpainting, outpainting, and image-to-image translation guided by a text prompt. Stability AI offers several ways to deploy and use its models, including self-hosted licenses, a platform API, and cloud platform integrations.

AI Image Generator

DreamStudio

DreamStudio is an online creative platform for generating AI-powered images. Developed by Stability AI, DreamStudio serves as the official interface and API for Stable Diffusion, an open-source image generation model.

AI Image Generator

SDXL Turbo

SDXL Turbo is a high-performance text-to-image model that uses Adversarial Diffusion Distillation (ADD) to enable real-time image synthesis. Available at sdxlturbo.ai, it allows users to generate detailed 512x512 images quickly without needing to log in. This model is well-suited for gaming, VR, and content creation, leveraging advanced distillation technology to provide efficient, high-quality results.

Community discussion

What people think about SDXL LoRA

SDXL LoRA discussions are most active in r/StableDiffusion, r/comfyui, and r/Lora. Top Reddit threads cluster around benchmarks and model comparisons, along with safety and censorship questions.

The strongest match in this snapshot has 1989 upvotes and 114 comments.

r/StableDiffusion 5 comments October 31, 2025

SDXL LoRA trained on RTX 5080 - 40 images → ~95 % style match

Ran a local SDXL 1.0 LoRA on 40 reference images (same art style).

• Training time ≈ 2 h
• bf16 + PEFT = half VRAM use of DreamBooth
• Outputs retain 90-95 % style consistency

ComfyUI + LoRA pipeline feels way more stable than cloud runs, and no data ever leaves the machine.

Happy to share configs or talk optimization for small-dataset LoRAs. DM if you want to see samples or logs.

*(No promo—just showing workflow.)*

r/StableDiffusion 664 upvotes 138 comments June 13, 2024

SD3 body anatomy for sdxl lora

r/comfyui 323 upvotes 74 comments May 3, 2025

A workflow to train SDXL LoRAs (only need training images, will do the rest)

A workflow to train SDXL LoRAs: https://civitai.com/models/1538062

This workflow is based on the incredible work by Kijai (https://github.com/kijai/ComfyUI-FluxTrainer), who created the training nodes for ComfyUI based on the Kohya_ss (https://github.com/kohya-ss/sd-scripts) work. All credits go to them. Thanks also to u/tom83_be on Reddit, who posted his installation and basic settings tips.

Detailed instructions on the Civitai page.

r/StableDiffusion 475 upvotes 98 comments January 3, 2024

LoRA Ease 🧞‍♂️: Train a high quality SDXL LoRA in a breeze ༄ with state-of-the-art techniques

r/StableDiffusion 11 upvotes 33 comments March 22, 2026

SDXL LoRA trained on real person - face not similar, tattoos not rendering properly

I trained a LoRA on a real person (my model) with 94 photos. Dataset breakdown: ~21 close-up portraits, rest is half-body and full-body shots with varied outfits, poses and environments.

**Training settings:**

* Base model: stabilityai/stable-diffusion-xl-base-1.0
* Optimizer: Prodigy, LR: 1
* Network Rank: 64, Alpha: 32
* Epochs: 10, Repeats: 2 per image = ~1880 total steps
* Scheduler: cosine_with_restarts, 5 cycles
* Flags: gradient_checkpointing, cache_latents, shuffle_caption, no_half_vae

**Captioning strategy:** Removed all constant facial features from captions (hair color, eye color, tattoos, scar) — kept only pose, outfit, background, lighting.

**Problem:** Generated face doesn't look like her at all. Wrong jaw shape, wrong mouth. She has distinct features: black hair with purple highlights, moon phases neck tattoo, snake+rose shoulder tattoo, small scar on chin. Tattoos appear blurry/barely visible. Face geometry is completely wrong.

**What I tried:**

* 6 epochs with 15 repeats (~8460 steps): face too generic
* 10 epochs with 2 repeats (~1880 steps): face still doesn't match, tattoos not rendering

**Question:** What am I doing wrong? Is it the captioning strategy, training parameters, or something else entirely?
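The step counts quoted in the post above follow from images × repeats × epochs. The sketch below reproduces both figures under the assumption of a batch size of 1, which the post does not state; the function name is hypothetical.

```python
import math


def total_training_steps(num_images, repeats, epochs, batch_size=1):
    """Optimizer steps for a run: ceil(images * repeats / batch_size) per epoch."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


short_run = total_training_steps(94, repeats=2, epochs=10)
long_run = total_training_steps(94, repeats=15, epochs=6)
```

With 94 images, the two configurations come to 1,880 and 8,460 steps, matching the post. A larger batch size would reduce both counts proportionally.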

FAQ

Common questions about SDXL LoRA

What is the context window for SDXL LoRA?

The model has a context window of 10,000 tokens as listed in the metadata, though for image generation models this typically refers to the maximum prompt length or token budget for text input rather than a conversational context.

How many LoRAs can I apply at once?

You can stack up to 5 LoRA adapters simultaneously, allowing you to combine multiple styles or subject customizations in a single generation.

What output resolution does SDXL LoRA produce?

The model generates images natively at 1024×1024 resolution, which is larger than the 512×512 native output of earlier Stable Diffusion versions like SD 1.5.

Does SDXL LoRA support image editing, or only generation from scratch?

In addition to text-to-image generation, the model supports image-to-image transformation and inpainting, allowing you to modify existing images or fill specific masked regions using text prompts.

Is there a training cutoff date for this model?

No training date is specified in the available metadata for SDXL LoRA. For the most accurate information on training data cutoff, refer to Stability AI's official documentation.
