[SANA-Video] Adding 5s pre-trained 480p SANA-Video inference #12584

lawrence-cj · 2025-11-04T03:09:51Z

What does this PR do?

This PR adds SANA-Video, a new text/image-to-video model from NVIDIA.
Paper
Project
HF weight

import torch
# The two schedulers are only needed if you uncomment one of the optional scheduler swaps below.
from diffusers import DPMSolverMultistepScheduler, SanaVideoPipeline, UniPCMultistepScheduler
from diffusers.utils import export_to_video


model_id = "Efficient-Large-Model/SANA-Video_2B_480p_diffusers"
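pipe = SanaVideoPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
# Optional: swap in a different scheduler by uncommenting one of the two lines below.
# pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config, flow_shift=8.0)
# pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config, flow_shift=8.0)
# Run the VAE in float32 and the text encoder in bfloat16, then move everything to the GPU.
pipe.vae.to(torch.float32)
pipe.text_encoder.to(torch.bfloat16)
pipe.to("cuda")
# Motion score that is appended to the prompt below.
model_score = 30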

prompt = "Evening, backlight, side lighting, soft light, high contrast, mid-shot, centered composition, clean solo shot, warm color. A young Caucasian man stands in a forest, golden light glimmers on his hair as sunlight filters through the leaves. He wears a light shirt, wind gently blowing his hair and collar, light dances across his face with his movements. The background is blurred, with dappled light and soft tree shadows in the distance. The camera focuses on his lifted gaze, clear and emotional."
negative_prompt = "A chaotic sequence with misshapen, deformed limbs in heavy motion blur, sudden disappearance, jump cuts, jerky movements, rapid shot changes, frames out of sync, inconsistent character shapes, temporal artifacts, jitter, and ghosting effects, creating a disorienting visual experience."
motion_prompt = f" motion score: {model_score}."
prompt = prompt + motion_prompt

video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=480,
    width=832,
    frames=81,
    guidance_scale=6,
    num_inference_steps=50,
    generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]

export_to_video(video, "sana_video.mp4", fps=16)
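As a quick check on the "5s" in the PR title, the clip length follows from the frame count and the export frame rate used in the snippet above. The sketch below also reproduces the motion-score suffix format. The helper names `with_motion_score` and `clip_seconds` are hypothetical and are not part of diffusers.

```python
def with_motion_score(prompt: str, motion_score: int) -> str:
    """Append the motion-score suffix in the same format as the snippet above."""
    return f"{prompt} motion score: {motion_score}."


def clip_seconds(num_frames: int, fps: int) -> float:
    """Duration in seconds of a clip with `num_frames` frames played at `fps`."""
    return num_frames / fps


print(with_motion_score("A man stands in a forest.", 30))
print(clip_seconds(81, 16))  # 81 frames at 16 fps is roughly five seconds
```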

Results:

sana_v2.mp4


dg845 · 2025-11-04T23:16:11Z

src/diffusers/video_processor.py

+        return int(default_hw[0]), int(default_hw[1])
+
+    @staticmethod
+    def resize_and_crop_tensor(samples: torch.Tensor, new_width: int, new_height: int) -> torch.Tensor:


I think exposing an interface like VaeImageProcessor.resize:

diffusers/src/diffusers/image_processor.py, lines 468 to 474 in dcfb18a:

    def resize(
        self,
        image: Union[PIL.Image.Image, np.ndarray, torch.Tensor],
        height: int,
        width: int,
        resize_mode: str = "default",  # "default", "fill", "crop"
    ) -> Union[PIL.Image.Image, np.ndarray, torch.Tensor]:

would be more robust, since different video preprocessing pipelines will probably make different choices here. This is not blocking; on the diffusers side we can follow up to support more video pipelines here.
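To illustrate why a `resize_mode` argument matters, here is a minimal sketch of the geometry that separates a plain stretch from a resize-then-center-crop. It is not the diffusers implementation: `ResizePlan` and `plan_resize` are hypothetical names, and only the "default" and "crop" modes are covered. The resulting sizes could then be applied to a video tensor with something like `torch.nn.functional.interpolate` followed by slicing.

```python
from typing import NamedTuple


class ResizePlan(NamedTuple):
    resized_height: int  # size to interpolate to before any cropping
    resized_width: int
    top: int  # crop offsets into the resized frame
    left: int


def plan_resize(src_h: int, src_w: int, height: int, width: int, resize_mode: str = "default") -> ResizePlan:
    """Compute the intermediate size and crop offsets for one frame (hypothetical helper)."""
    if resize_mode == "default":
        # Stretch straight to the target; aspect ratio may change, nothing is cropped.
        return ResizePlan(height, width, 0, 0)
    if resize_mode == "crop":
        # Scale so the frame covers the target while keeping aspect ratio, then center-crop.
        scale = max(height / src_h, width / src_w)
        resized_h = max(height, round(src_h * scale))
        resized_w = max(width, round(src_w * scale))
        return ResizePlan(resized_h, resized_w, (resized_h - height) // 2, (resized_w - width) // 2)
    raise ValueError(f"Unsupported resize_mode: {resize_mode!r}")


print(plan_resize(720, 1280, 480, 832, "default"))
print(plan_resize(720, 1280, 480, 832, "crop"))
```

For a 720x1280 source and a 480x832 target, the "crop" plan keeps the aspect ratio and trims a few columns from each side, while "default" distorts the frame slightly instead.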

OK, I'll let you guys help finish this part. Thanks!


dg845

Thanks for the PR! Would you be able to add tests and docs? We can help with both, especially the tests, but for the docs it may be harder for us as we are not as familiar with the intricacies of the model.

Documentation example (Wan): https://github.com/huggingface/diffusers/blob/main/docs/source/en/api/pipelines/wan.md
Model tests example (WanTransformer3DModel): https://github.com/huggingface/diffusers/blob/main/tests/models/transformers/test_models_transformer_wan.py
Pipeline tests example (WanPipeline): https://github.com/huggingface/diffusers/blob/main/tests/pipelines/wan/test_wan.py


dg845

Thanks for the follow up changes! I have made some suggestions that should help the Sana Video pipeline tests pass.

Sorry for all the small change requests, but could you also do the following?

Can you run the following to make sure that the CI code quality check is green?

make style
make quality
make fix-copies

Can you add the new Sana Video markdown docs to docs/source/en/_toctree.yml? For reference, here is how the Sana pipeline docs were added:

diffusers/docs/source/en/_toctree.yml, lines 562 to 563 in dcfb18a:

    - local: api/pipelines/sana
      title: Sana

This change will help the docs build correctly.


lawrence-cj · 2025-11-05T09:14:38Z

Done! Let's test it.

dg845 · 2025-11-05T20:19:00Z

docs/source/en/_toctree.yml

+      - local: api/models/sana_video_transformer3d
+        title: SanaVideoTransformer3DModel


This will cause an error when building the docs since the api/models/sana_video_transformer3d file doesn't currently exist. Could you add a markdown doc for the transformer as well? For reference, here is the documentation for SanaTransformer2DModel: https://github.com/huggingface/diffusers/blob/main/docs/source/en/api/models/sana_transformer2d.md
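A check like the following could catch this kind of problem before the docs build runs. It is only a sketch: `missing_toctree_docs` is a hypothetical helper, the toctree is parsed with a simple regular expression rather than a YAML parser, the `api/pipelines/sana_video` entry is an assumed path used for illustration, and each `local:` entry is assumed to map to a `.md` file under the docs root.

```python
import re
import tempfile
from pathlib import Path


def missing_toctree_docs(toctree_text: str, docs_root: Path) -> list[str]:
    """Return `local:` entries that have no matching .md file under docs_root."""
    entries = re.findall(r"^\s*-?\s*local:\s*(\S+)\s*$", toctree_text, flags=re.MULTILINE)
    return [entry for entry in entries if not (docs_root / f"{entry}.md").is_file()]


toctree = """
      - local: api/pipelines/sana_video
        title: Sana Video
      - local: api/models/sana_video_transformer3d
        title: SanaVideoTransformer3DModel
"""

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Only the pipeline doc exists in this toy docs tree; the model doc is missing.
    (root / "api/pipelines").mkdir(parents=True)
    (root / "api/pipelines/sana_video.md").write_text("# SanaVideoPipeline\n")
    missing = missing_toctree_docs(toctree, root)

print(missing)
```

In this toy setup the model entry is reported as missing, which mirrors the situation described in the comment above.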

HuggingFaceDocBuilderDev · 2025-11-06T03:44:59Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

dg845 · 2025-11-06T04:34:18Z

@bot /style

github-actions · 2025-11-06T04:34:43Z

Style bot fixed some files and pushed the changes.

dg845 · 2025-11-06T05:08:06Z

@lawrence-cj, thanks again for the PR! The CI errors are unrelated to the PR, so I'm merging.

lawrence-cj · 2025-11-06T05:09:49Z

Thank you so much for your support! ❤️

Cc @dg845 @sayakpaul @yiyixuxu

sayakpaul · 2025-11-06T05:14:19Z

Congratulations on the release!

lawrence-cj added 6 commits November 3, 2025 17:53

1. add SanaVideoTransformer3DModel in transformer_sana_video.py

13e516c

2. add `SanaVideoPipeline` in pipeline_sana_video.py 3. add all code we need for import `SanaVideoPipeline`

add a sample about how to use sana-video;

5eb5354

code update;

c6d7876

update hf model path;

d67ab2a

update code;

a5f19e0

sana-video can run now;

c15ae23

sayakpaul requested a review from dg845 November 4, 2025 03:16

lawrence-cj added 4 commits November 3, 2025 21:12

1. add aspect ratio in sana-video-pipeline;

ee79af3

2. add reshape function in sana-video-processor; 3. fix convert pth to safetensor bugs;

Merge branch 'main' into feat/sana-video

f06a93d

default to use use_resolution_binning;

49557c1

make style;

857ca30

lawrence-cj mentioned this pull request Nov 4, 2025

SANA-Video PR is under construction NVlabs/Sana#321

Merged

remove unused code;

3ed7000

dg845 reviewed Nov 4, 2025: src/diffusers/models/transformers/transformer_sana_video.py (outdated)

dg845 reviewed Nov 5, 2025: src/diffusers/models/transformers/transformer_sana_video.py (outdated)

dg845 reviewed Nov 5, 2025: src/diffusers/models/transformers/transformer_sana_video.py (outdated)

dg845 reviewed Nov 5, 2025: src/diffusers/models/transformers/transformer_sana_video.py (outdated)

yiyixuxu reviewed Nov 5, 2025: src/diffusers/pipelines/sana/pipeline_sana_video.py (outdated)

dg845 reviewed Nov 5, 2025: src/diffusers/models/transformers/transformer_sana_video.py (outdated)

dg845 reviewed Nov 5, 2025: src/diffusers/models/transformers/transformer_sana_video.py (outdated)

dg845 reviewed Nov 5, 2025: src/diffusers/models/transformers/transformer_sana_video.py

dg845 reviewed Nov 5, 2025: src/diffusers/pipelines/sana/pipeline_sana_video.py (outdated)

dg845 reviewed Nov 5, 2025: src/diffusers/pipelines/sana/pipeline_sana_video.py (outdated)

dg845 reviewed Nov 5, 2025: src/diffusers/models/transformers/transformer_sana_video.py (outdated)

dg845 reviewed Nov 5, 2025

lawrence-cj and others added 4 commits November 5, 2025 11:59

Update src/diffusers/models/transformers/transformer_sana_video.py

439bf58

Co-authored-by: dg845

Update src/diffusers/models/transformers/transformer_sana_video.py

de4cf31

Co-authored-by: dg845

Update src/diffusers/models/transformers/transformer_sana_video.py

fe73287

Co-authored-by: dg845

Update src/diffusers/pipelines/sana/pipeline_sana_video.py

118677a

Co-authored-by: YiYi Xu

dg845 reviewed Nov 5, 2025: tests/pipelines/sana/test_sana_video.py (outdated)

lawrence-cj and others added 2 commits November 5, 2025 16:04

Update tests/pipelines/sana/test_sana_video.py

1379391

Update tests/pipelines/sana/test_sana_video.py

b359240

Co-authored-by: dg845

dg845 reviewed Nov 5, 2025: tests/pipelines/sana/test_sana_video.py (outdated)

Update tests/pipelines/sana/test_sana_video.py

7256023

Co-authored-by: dg845

dg845 reviewed Nov 5, 2025: tests/pipelines/sana/test_sana_video.py (outdated)

dg845 reviewed Nov 5, 2025: tests/pipelines/sana/test_sana_video.py (outdated)

dg845 reviewed Nov 5, 2025: tests/pipelines/sana/test_sana_video.py

dg845 approved these changes Nov 5, 2025

dg845 reviewed Nov 5, 2025: src/diffusers/pipelines/sana/pipeline_sana_video.py (outdated)

dg845 reviewed Nov 5, 2025: src/diffusers/video_processor.py (outdated)

lawrence-cj and others added 7 commits November 5, 2025 17:07

Update tests/pipelines/sana/test_sana_video.py

25d1a4c

Co-authored-by: dg845

Update tests/pipelines/sana/test_sana_video.py

a9c16eb

Co-authored-by: dg845

Update tests/pipelines/sana/test_sana_video.py

8a27d58

Co-authored-by: dg845

Update src/diffusers/pipelines/sana/pipeline_sana_video.py

4c25427

Co-authored-by: dg845

Update src/diffusers/video_processor.py

31c9fa5

Co-authored-by: dg845

make style

0ed7eee

make quality make fix-copies

toctree yaml update;

e31f91b

dg845 reviewed Nov 5, 2025

lawrence-cj added 2 commits November 5, 2025 18:39

add sana-video-transformer3d markdown;

cb31fc2

Merge branch 'main' into feat/sana-video

2b8c3e3

Apply style fixes

f3c87f4

dg845 merged commit b3e9dfc into huggingface:main Nov 6, 2025
9 of 11 checks passed
