DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer

Xu Guo^*, Fulong Ye^*, Xinghui Li^*, Pengqi Tu, Pengze Zhang, Qichao Sun, Songtao Zhao^†, Xiangwang Hou^†, Qian He
^*Equal contribution, ^†Corresponding author
Tsinghua University | Intelligent Creation Team, ByteDance

🔥 News

[01/06/2026] 🔥 Our paper is released!
[01/05/2026] 🔥 Our code is released!
[12/17/2025] 🔥 Our project is released!
[08/11/2025] 🎉 Our image version DreamID is accepted by SIGGRAPH Asia 2025!

⚡️ Quickstart

Model Preparation

Models	Download Link	Notes
DreamID-V	🤗 Huggingface	Supports 480P & 720P
Wan-2.1	🤗 Huggingface	VAE & Text encoder

Installation

Install dependencies:

# Ensure torch >= 2.4.0
pip install -r requirements.txt
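
The `torch >= 2.4.0` requirement can be checked before running inference. The following is a minimal sketch, not part of the repository: the helper names `parse_release` and `meets_minimum` are hypothetical, and the comparison only looks at the numeric release part of the version string (so a build tag such as `+cu121` is ignored, and a pre-release of 2.4.0 would be treated as 2.4.0).

```python
import re
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Return the leading numeric components, e.g. '2.4.0+cu121' -> (2, 4, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str = "2.4.0") -> bool:
    """True if `installed` is at least `minimum`, comparing numeric parts only."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so '2.4' compares equal to '2.4.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed")
    else:
        status = "OK" if meets_minimum(torch.__version__) else "too old, need >= 2.4.0"
        print(f"torch {torch.__version__}: {status}")
```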

DreamID-V-Wan-1.3B

Single-GPU inference

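# Replace the /path/to/... placeholders with your local checkpoint paths.
# The size is quoted so the shell does not treat * as a glob pattern.
python generate_dreamidv.py \
    --size "832*480" \
    --ckpt_dir /path/to/wan2.1-1.3B \
    --dreamidv_ckpt /path/to/dreamidv.pth \
    --sample_steps 50 \
    --base_seed 42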
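
When launching the single-GPU command from a script, passing the arguments as a list avoids shell quoting issues with the `*` in `--size`. The sketch below is an illustration under assumptions: `build_single_gpu_argv` is a hypothetical helper, the checkpoint paths are placeholders, and only the flags shown in the command above are used.

```python
import sys
from typing import List


def build_single_gpu_argv(
    ckpt_dir: str,
    dreamidv_ckpt: str,
    size: str = "832*480",
    sample_steps: int = 50,
    base_seed: int = 42,
) -> List[str]:
    """Assemble the argument list for the single-GPU command shown above."""
    return [
        sys.executable, "generate_dreamidv.py",
        "--size", size,  # passed verbatim; no shell, so '*' is never expanded
        "--ckpt_dir", ckpt_dir,
        "--dreamidv_ckpt", dreamidv_ckpt,
        "--sample_steps", str(sample_steps),
        "--base_seed", str(base_seed),
    ]


# Example usage (paths are placeholders), run from the repository root:
#   import subprocess
#   argv = build_single_gpu_argv("/path/to/wan2.1-1.3B", "/path/to/dreamidv.pth")
#   subprocess.run(argv, check=True)
```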

Multi-GPU inference using FSDP + xDiT USP

pip install "xfuser>=0.4.1"
# Replace the /path/to/... placeholders with your local checkpoint paths.
torchrun --nproc_per_node=2 generate_dreamidv.py \
    --size "832*480" \
    --ckpt_dir /path/to/wan2.1-1.3B \
    --dreamidv_ckpt /path/to/dreamidv.pth \
    --sample_steps 50 \
    --dit_fsdp \
    --t5_fsdp \
    --ulysses_size 2 \
    --ring_size 1 \
    --base_seed 42
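
In the multi-GPU command above, `--ulysses_size` times `--ring_size` equals `--nproc_per_node` (2 × 1 = 2). Upstream Wan2.1 requires this product to match the number of processes; that the same constraint applies here is an assumption, not something this README states. A minimal sketch that checks it before building the `torchrun` arguments, with `build_torchrun_argv` as a hypothetical helper and placeholder paths:

```python
from typing import List


def build_torchrun_argv(
    nproc_per_node: int,
    ulysses_size: int,
    ring_size: int,
    ckpt_dir: str,
    dreamidv_ckpt: str,
    size: str = "832*480",
    sample_steps: int = 50,
    base_seed: int = 42,
) -> List[str]:
    """Assemble the torchrun argument list for the multi-GPU command above."""
    # Assumed constraint (carried over from Wan2.1): the sequence-parallel
    # degrees must multiply to the number of launched processes.
    if ulysses_size * ring_size != nproc_per_node:
        raise ValueError(
            f"ulysses_size * ring_size = {ulysses_size * ring_size}, "
            f"expected {nproc_per_node}"
        )
    return [
        "torchrun", f"--nproc_per_node={nproc_per_node}", "generate_dreamidv.py",
        "--size", size,
        "--ckpt_dir", ckpt_dir,
        "--dreamidv_ckpt", dreamidv_ckpt,
        "--sample_steps", str(sample_steps),
        "--dit_fsdp",
        "--t5_fsdp",
        "--ulysses_size", str(ulysses_size),
        "--ring_size", str(ring_size),
        "--base_seed", str(base_seed),
    ]
```

The returned list can be passed to `subprocess.run(argv, check=True)` from the repository root once `xfuser` is installed.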

👍 Acknowledgements

Our work builds upon and is greatly inspired by several outstanding open-source projects, including Wan2.1, Phantom, OpenHumanVid, and Follow-Your-Emoji. We sincerely thank the authors and contributors of these projects for generously sharing their excellent code and ideas.

📧 Contact

If you have any comments or questions regarding this open-source project, please open a new issue or contact Xu Guo and Fulong Ye.

⭐ Citation

If you find our work helpful, please consider citing our paper and giving the repository a star.

@misc{guo2026dreamidvbridgingimagetovideogaphighfidelity,
      title={DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer},
      author={Xu Guo and Fulong Ye and Xinghui Li and Pengqi Tu and Pengze Zhang and Qichao Sun and Songtao Zhao and Xiangwang Hou and Qian He},
      year={2026},
      eprint={2601.01425},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2601.01425}, 
}
