Wan

Wan 2.2

Wan 2.2 is a multimodal video generation model developed by Alibaba's Tongyi Laboratory and released in July 2025 under the Apache 2.0 license. It is the first video diffusion model to apply a Mixture-of-Experts (MoE) architecture, which splits processing between high-noise expert networks that handle overall layout and composition and low-noise expert networks that refine fine details. The model supports both text-to-video and image-to-video generation, with native bilingual prompting in English and Chinese. It is available in a 5B-parameter variant suited to consumer hardware and a 14B-parameter variant for higher-quality output.

Wan 2.2 was trained on a dataset significantly larger than its predecessor's, with image data increasing by 65.6% and video data by 83.2%. Training includes a dedicated aesthetic fine-tuning stage informed by film industry standards, further refined through reinforcement learning to align with human visual preferences. Two specialized modules, Wan-Animate and Wan-Move, let users animate a character from a single image or transfer motion from one video to another subject. The model is natively supported by ComfyUI and accepts LoRA adapters and source images as inputs alongside text prompts.

Released July 2025 | 1,000-token context | N/A output
Text-to-Video Generation, Image-to-Video Generation, LoRA Support, Character Animation, Motion Transfer, Cinematic Aesthetic Control

Model Overview

Provider

The entity that provides this model.

Wan

Input Context Window

The number of tokens supported by the input context window.

1,000 tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

N/A

Open Source

Whether the model's code is available for public use.

Yes (Apache 2.0 license; model weights are publicly available on Hugging Face)

Release Date

When the model was first released.

July 2025

Knowledge Cut-off Date

When the model's knowledge was last updated.

Unknown

API Providers

The providers that offer this model. This is not an exhaustive list.

Hugging Face

Modalities

Types of data this model can process.

Video, Text, Image

Capabilities

What Wan 2.2 supports

Text-to-Video Generation

Generates video clips from written text prompts, supporting both English and Chinese input natively. The 14B parameter variant targets higher visual fidelity while the 5B variant is optimized for consumer hardware.

Image-to-Video Generation

Animates a static reference image into a dynamic video clip using the I2V pipeline. Accepts an image URL as input alongside a text prompt to guide motion and style.

LoRA Support

Accepts LoRA adapter weights to customize the model's visual style or subject matter without full retraining. LoRA inputs are specified directly in the generation request.

Character Animation

The Wan-Animate module animates a character from a single source image, producing a video with natural motion from a still photo.

Motion Transfer

The Wan-Move module transfers motion patterns from one video onto a different subject, enabling pose and movement replication across subjects.

Cinematic Aesthetic Control

Provides control over lighting, color grading, lens composition, and camera movement through text prompts. Aesthetic fine-tuning was informed by film industry standards and refined with reinforcement learning.

Seed-Based Reproducibility

Accepts a seed value as an input parameter, allowing users to reproduce identical outputs or systematically explore variations from a fixed starting point.

MoE Architecture

Uses a Mixture-of-Experts architecture that routes work between high-noise experts for layout and low-noise experts for detail refinement within a single diffusion model.
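
The hand-off between the two expert groups can be pictured as a switch on the current noise level. What follows is a minimal sketch of that idea, not Wan 2.2's actual implementation: the function names, the arithmetic stand-ins for the expert networks, and the boundary value of 0.9 are all assumptions chosen for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical switch point on a normalized noise level in [0, 1].
# The real model's boundary is not stated on this page.
BOUNDARY = 0.9


def select_expert(noise_level: float, boundary: float = BOUNDARY) -> str:
    """Return which expert group would handle a denoising step."""
    return "high_noise" if noise_level >= boundary else "low_noise"


def denoise(latent: float, noise_levels: List[float],
            experts: Dict[str, Callable[[float, float], float]]) -> float:
    """Run each step through the expert chosen for its noise level."""
    for level in noise_levels:
        latent = experts[select_expert(level)](latent, level)
    return latent


# Stand-in experts: simple arithmetic in place of neural networks.
experts = {
    "high_noise": lambda x, t: x * 0.5,   # coarse layout pass
    "low_noise": lambda x, t: x - 0.01,   # fine detail pass
}

schedule = [1.0, 0.95, 0.9, 0.6, 0.3, 0.1]  # noisiest steps come first
routing = [select_expert(t) for t in schedule]
result = denoise(8.0, schedule, experts)
```

In this toy schedule the first three steps go to the high-noise expert and the remaining three to the low-noise expert, so only one expert group is active at any given step.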

API Access & Providers

Places where this model is available.

Hugging Face

Configuration & Parameters

The configurable options currently documented for this model.

Resolution

Type: Select
Default: 720p
Options: 720p, 480p

Duration

Type: Select
Default: 5 seconds
Options: 5 seconds, 8 seconds

LoRAs

Type: LoRA
Up to 3 LoRAs.

Negative Prompt

Type: Text
Description of what to exclude from the video.

Seed

Type: Seed
A specific value that is used to guide the 'randomness' of the generation.
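
The options above can be checked on the client before a request is sent. The sketch below is one way to do that under stated assumptions: the class name `GenerationRequest` and its field names are hypothetical, not a documented API schema, while the allowed values and the three-LoRA limit come from the list above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

ALLOWED_RESOLUTIONS = ("720p", "480p")
ALLOWED_DURATIONS = (5, 8)  # seconds
MAX_LORAS = 3


@dataclass
class GenerationRequest:
    """Hypothetical request shape; field names are illustrative only."""
    prompt: str
    resolution: str = "720p"
    duration: int = 5
    loras: List[str] = field(default_factory=list)
    negative_prompt: str = ""
    seed: Optional[int] = None

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the request is valid."""
        problems: List[str] = []
        if not self.prompt.strip():
            problems.append("prompt must not be empty")
        if self.resolution not in ALLOWED_RESOLUTIONS:
            problems.append(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
        if self.duration not in ALLOWED_DURATIONS:
            problems.append(f"duration must be one of {ALLOWED_DURATIONS} seconds")
        if len(self.loras) > MAX_LORAS:
            problems.append(f"at most {MAX_LORAS} LoRAs are allowed")
        return problems


ok = GenerationRequest(prompt="a red kite over a beach", seed=1234)
bad = GenerationRequest(prompt="x", resolution="1080p", duration=10,
                        loras=["a", "b", "c", "d"])
ok_problems = ok.validate()
bad_problems = bad.validate()
```

Returning a list of problems instead of raising on the first one lets a caller report every invalid field at once.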

Supported Request Parameters

Parameters currently listed by OpenRouter or the local catalog for this model.

Resolution, Duration, LoRAs, Negative Prompt, Seed

Community discussion

What people think about Wan 2.2

Wan 2.2 discussions are most active in r/StableDiffusion, r/comfyui, and r/grok. Top Reddit threads cluster around benchmarks and model comparisons, and around safety and censorship questions.

The strongest match in this snapshot has 6073 upvotes and 177 comments.

r/comfyui | 8 upvotes | 56 comments | May 3, 2026
Did Wan 2.2 14B stop NSFW generations?

I have been using Wan2.2 (I2V) 14B on huggingface for a while now to do NSFW image to video generations and it always worked great. But for the last couple of days I keep getting 'Generation blocked by guardrails: The resulting video may contain explicit content.'.

Does Wan 2.2 14B no longer support NSFW? It still shows up in huggingface if you type 'NSFW image to video' in the search bar, but is not allowing NSFW image to video.

Any help or insight into this would be really appreciated. Thanks!

Edit 1: I understand that the space on huggingface no longer allows nsfw generation and the model itself has not changed, so the question now becomes: what other alternatives are out there? I am mostly looking for spaces on huggingface or platforms similar to huggingface which require no prior setup. Running it locally for me takes too long for the workflows that I have.

Edit 2 (Fixed): Turns out they added a checkbox for 'Enable Safety Filter' in the advanced settings, with it being always turned on by default. Just had to flip the switch and voila! Huge thanks to @VisibleExchange7528 for pointing this out!!!

r/StableDiffusion | 1,885 upvotes | 176 comments | October 2, 2025
WAN 2.2 Animate - Character Replacement Test

Seems pretty effective.

Her outfit is inconsistent, but I used a reference image that only included the upper half of her body and head, so that is to be expected.

I should say, these clips are from the film "The Ninth Gate", which is excellent. :)

FAQ

Common questions about Wan 2.2

What is the context window for Wan 2.2?

Wan 2.2 has a context window of 1,000 tokens, which governs the length and complexity of text prompts it can process in a single generation request.

What model sizes are available for Wan 2.2?

Wan 2.2 is available in two sizes: a 5B parameter version designed for efficient use on consumer hardware and a 14B parameter version intended for higher-quality output. Both are available on Hugging Face under the Apache 2.0 license.

Is Wan 2.2 free to use commercially?

Yes. Wan 2.2 is released under the Apache 2.0 license, which permits free commercial use. The model weights are publicly available on Hugging Face.

What input types does Wan 2.2 accept?

Wan 2.2 accepts text prompts, image URLs (for image-to-video generation), LoRA adapter weights, a negative prompt, resolution and duration settings, and a seed value for reproducibility.
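
To illustrate why a seed makes results repeatable, the sketch below shows the general idea: a fixed seed produces the same starting noise, and a different seed produces different noise. It uses Python's standard `random` module as a stand-in for a model's noise sampler, and `initial_noise` is a hypothetical helper, not part of Wan 2.2.

```python
import random
from typing import List


def initial_noise(seed: int, size: int = 4) -> List[float]:
    """Draw `size` Gaussian samples from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator; global state is untouched
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


same_a = initial_noise(42)
same_b = initial_noise(42)   # same seed: identical starting noise
other = initial_noise(43)    # different seed: a different starting point
```

Holding the seed fixed while changing one other input, such as the prompt or a LoRA, is a common way to compare variations from the same starting point.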

When was Wan 2.2 trained and released?

Wan 2.2 was released in July 2025 by Alibaba's Tongyi Laboratory. Its training data includes an image dataset 65.6% larger and a video dataset 83.2% larger than those used for its predecessor, Wan 2.1.
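
The percentages above translate into size multipliers relative to Wan 2.1. The short calculation below is a worked example only; the helper name `growth_multiplier` is hypothetical, and no absolute dataset sizes are implied because the page does not state them.

```python
def growth_multiplier(percent_increase: float) -> float:
    """Convert a percentage increase into a multiplier on the original size."""
    return 1.0 + percent_increase / 100.0


image_multiplier = growth_multiplier(65.6)  # about 1.656x the Wan 2.1 image data
video_multiplier = growth_multiplier(83.2)  # about 1.832x the Wan 2.1 video data
```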

Does Wan 2.2 work with ComfyUI?

Yes. Wan 2.2 has native support in ComfyUI. Official tutorials and workflow documentation are available at docs.comfy.org.
