Comment in `parallel_state` is inaccurate. #472

c8ef · 2025-03-06T16:14:52Z

xDiT/xfuser/core/distributed/parallel_state.py

Lines 335 to 345 in 47f2071

    
    Let's say we have a total of 16 GPUs denoted by g0 ... g15 and we
    use 2 groups to parallelize the batch dim(dp), 2 groups to parallelize
    splited batch caused by CFG, and 2 GPUs to parallelize sequence.

    dp_degree (2) * cfg_degree (2) * sp_degree (2) * pp_degree (2) = 16.

    The present function will create 2 data parallel-groups,
    8 CFG group, 8 pipeline-parallel group, and
    8 sequence-parallel groups:
        2 data-parallel groups:
            [g0, g1, g2, g3, g4, g5, g6, g7],

The comment is a bit confusing. It says the function creates 2 data-parallel groups, but as the following script shows, it appears that we actually get 8 DP groups of 2 ranks each.

from xfuser.core.distributed.utils import RankGenerator

rank = RankGenerator(tp=1, dp=2, cfg=2, sp=2, pp=2, order="tp-sp-pp-cfg-dp")
print("dp :", rank.get_ranks("dp"))
print("cfg:", rank.get_ranks("cfg"))
print("sp :", rank.get_ranks("sp"))
print("pp :", rank.get_ranks("pp"))

# dp : [[0, 8], [1, 9], [2, 10], [3, 11], [4, 12], [5, 13], [6, 14], [7, 15]]
# cfg: [[0, 4], [1, 5], [2, 6], [3, 7], [8, 12], [9, 13], [10, 14], [11, 15]]
# sp : [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, 11], [12, 13], [14, 15]]
# pp : [[0, 2], [1, 3], [4, 6], [5, 7], [8, 10], [9, 11], [12, 14], [13, 15]]
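
For readers without xfuser installed, the same grouping can be reproduced with a small standalone sketch. This is not xfuser's implementation: `group_ranks` is a hypothetical helper, and it assumes that the first token in the order string varies fastest across ranks (so with `tp-sp-pp-cfg-dp`, `dp` has the largest stride). Under that assumption, each dimension of degree `d` yields `world_size / d` groups of `d` ranks, which is why there are 8 DP groups here rather than 2.

```python
from itertools import product


def group_ranks(degrees, order, token):
    """Return the rank groups for one parallel dimension.

    Hypothetical helper, not part of xfuser. `order` lists the
    dimensions from fastest-varying to slowest-varying.
    """
    # Stride of each dimension in the flat rank index.
    strides = {}
    stride = 1
    for name in order:
        strides[name] = stride
        stride *= degrees[name]

    # Enumerate every combination of the other dimensions; the
    # fastest-varying dimension is placed last so product() cycles it first.
    others = [name for name in reversed(order) if name != token]
    groups = []
    for idx in product(*(range(degrees[name]) for name in others)):
        base = sum(i * strides[name] for i, name in zip(idx, others))
        groups.append([base + k * strides[token] for k in range(degrees[token])])
    return groups


degrees = {"tp": 1, "sp": 2, "pp": 2, "cfg": 2, "dp": 2}
order = ["tp", "sp", "pp", "cfg", "dp"]

for token in ("dp", "cfg", "sp", "pp"):
    groups = group_ranks(degrees, order, token)
    print(f"{token:3s}: {len(groups)} groups -> {groups}")
```

With these degrees the sketch produces the same four lists as the `RankGenerator` output above: 8 groups of 2 ranks for each of `dp`, `cfg`, `sp`, and `pp`.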

c8ef mentioned this issue Mar 6, 2025

Fix inaccurate comment in parallel_state. #473

Merged

feifeibear closed this as completed in #473 Mar 9, 2025
