
feat: Add support for Tensor Parallelism to the Step-Video-T2V model #454

Merged · 3 commits into xdit-project:main · Feb 25, 2025

Conversation

LiaoYuanF (Contributor) commented on Feb 25, 2025

Add Tensor Parallelism support for the Custom Model (Step-Video-T2V)

Implementation

TP support is implemented in the SelfAttention, CrossAttention, and FFN modules through the following improvements (a sketch of the sharding pattern follows the list):

  1. Parameter sharding: an optimized memory-alignment strategy for initializing weights across devices
  2. Gradient synchronization: a staged synchronization strategy that reduces communication overhead
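
For orientation, here is a minimal Megatron-style sketch of the sharding pattern described above: the first projection of each block is column-sharded and the second row-sharded, so a single all-reduce per block restores the full activation. This is an illustration, not the PR's actual code; the class names `ColumnParallelLinear`, `RowParallelLinear`, and `ParallelFeedForward` are hypothetical.

```python
import torch
import torch.nn as nn
import torch.distributed as dist


class ColumnParallelLinear(nn.Module):
    """Keeps 1/tp_size of the output features on each rank; outputs stay sharded."""

    def __init__(self, in_features: int, out_features: int, tp_size: int):
        super().__init__()
        assert out_features % tp_size == 0
        self.shard = nn.Linear(in_features, out_features // tp_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.shard(x)  # shape [..., out_features // tp_size]


class RowParallelLinear(nn.Module):
    """Keeps 1/tp_size of the input features; an all-reduce sums partial outputs."""

    def __init__(self, in_features: int, out_features: int, tp_size: int):
        super().__init__()
        assert in_features % tp_size == 0
        self.shard = nn.Linear(in_features // tp_size, out_features, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.shard(x)  # partial result from this rank's shard
        if dist.is_initialized():
            dist.all_reduce(y)  # one communication point per block
        return y


class ParallelFeedForward(nn.Module):
    """FFN sharded the same way attention q/k/v (column) and the output
    projection (row) would be: only the final projection communicates."""

    def __init__(self, dim: int, hidden_dim: int, tp_size: int):
        super().__init__()
        self.up = ColumnParallelLinear(dim, hidden_dim, tp_size)
        self.act = nn.GELU()
        self.down = RowParallelLinear(hidden_dim, dim, tp_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(self.act(self.up(x)))
```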

Benchmarks (NVIDIA H20)

Per-module memory analysis (bf16)

| Component | Class | Parameters | Memory usage |
| --- | --- | --- | --- |
| attn1 | SelfAttention | 150,995K | 13.44 GB |
| attn2 | CrossAttention | 150,995K | 13.44 GB |
| ff | FeedForward | 301,990K | 26.88 GB |
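
As a sanity check on how TP shrinks these footprints, the snippet below divides each module's bf16 weight footprint (parameter count × 2 bytes) by the TP degree. Note this covers weights only, which is far smaller than the table's totals, so those totals presumably include activations and other buffers; the numbers here are illustrative arithmetic, not measurements.

```python
# Illustrative arithmetic: per-rank bf16 weight footprint of each module
# from the table above, after sharding across a TP group.
BYTES_PER_PARAM = 2  # bf16

modules = {
    "attn1 (SelfAttention)": 150_995_000,
    "attn2 (CrossAttention)": 150_995_000,
    "ff (FeedForward)": 301_990_000,
}

for tp_size in (1, 2, 4, 8):
    for name, params in modules.items():
        shard_mib = params * BYTES_PER_PARAM / tp_size / 2**20
        print(f"TP{tp_size} {name}: {shard_mib:,.1f} MiB of weights per rank")
```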

Parallel efficiency comparison

| GPUs | Strategy | Parallel dims | Time per iteration | Scaling efficiency | Memory usage |
| --- | --- | --- | --- | --- | --- |
| 1 | Baseline | TP1 SP1 | 213.60 s | 1.00x | 92,170M |
| 2 | TP | TP2 (Self+Cross+FFN) | 108.97 s | 0.98x | 57,458M (-37.7%) |
| 2 | SP | SP2 | 108.13 s | 0.99x | 86,258M (-6.4%) |
| 4 | TP | TP4 (Self+Cross+FFN) | 57.61 s | 0.93x | 36,566M (-60.3%) |
| 4 | SP | SP4 | 57.01 s | 0.94x | 78,226M (-15.1%) |
| 8 | TP | TP8 (Self+Cross+FFN) | 30.40 s | 0.88x | 30,028M (-67.4%) |
| 8 | SP | SP8 | 30.10 s | 0.89x | 79,684M (-13.5%) |

Notes

  1. TP dimension: multi-dimensional sharding spanning the SelfAttention/CrossAttention/FFN layers
  2. SP dimension: the sequence-parallel sharding dimension
  3. The percentages in parentheses in the memory column are reductions relative to the baseline (the efficiency column can be recomputed from the timings; see the check below)
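
The efficiency column can be reproduced from the timings alone: speedup is the baseline time divided by the parallel time, and the tabulated x-value is that speedup divided by the GPU count. A small check script, with values copied from the table:

```python
# Recompute speedup and per-GPU scaling efficiency from the benchmark table.
baseline_s = 213.60

runs = [
    ("TP2", 2, 108.97),
    ("SP2", 2, 108.13),
    ("TP4", 4, 57.61),
    ("SP4", 4, 57.01),
    ("TP8", 8, 30.40),
    ("SP8", 8, 30.10),
]

for name, gpus, seconds in runs:
    speedup = baseline_s / seconds
    efficiency = speedup / gpus  # matches the table's x-values
    print(f"{name}: speedup {speedup:.2f}x, efficiency {efficiency:.2f}x")
```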

Key benefits

  • Memory savings: under TP8, memory usage drops by 49,656M relative to SP8, a saving of roughly 54% of the baseline footprint
  • Hardware fit
    • Consumer GPUs: full inference support on an 8×32GB setup
    • Inference cards: full inference support on a 4×48GB setup

@feifeibear merged commit 6875fca into xdit-project:main on Feb 25, 2025
2 of 3 checks passed