DDP is not accelerating my training #12771
nian-liu asked this question in DDP / multi-GPU / multi-node
Answered by rohitgr7 on Apr 20, 2022
It looks like DDP is not being triggered in your case for some reason: if you are not changing the batch_size, the total number of batches shown in the progress bar should be reduced when running DDP on 4 GPUs. Did you see any logs like this when you call trainer.fit?

Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/4
Initializing distributed: GLOBAL_RANK: 2, MEMBER: 3/4
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/4
Initializing distributed: GLOBAL_RANK: 3, MEMBER: 4/4
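For context, a minimal self-contained sketch of a Trainer configured for DDP on 4 GPUs; the ToyModel module and random dataset are placeholders, not the original poster's code. If DDP initializes correctly, trainer.fit prints the four "Initializing distributed" lines shown above, and each process's progress bar shows a quarter of the batches.

```python
# Minimal sketch (placeholder model/data, not the original poster's code).
import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset

class ToyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.cross_entropy(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

if __name__ == "__main__":  # guard matters: DDP launches one process per GPU
    ds = TensorDataset(torch.randn(1024, 32), torch.randint(0, 2, (1024,)))
    loader = DataLoader(ds, batch_size=32)  # 1024/32 = 32 batches total

    trainer = pl.Trainer(
        accelerator="gpu",
        devices=4,
        strategy="ddp",  # one process per GPU, gradients synced across them
        max_epochs=1,
    )
    # With DDP active, the "Initializing distributed: GLOBAL_RANK: r,
    # MEMBER: r+1/4" lines appear here, and each process's progress bar
    # shows 32/4 = 8 batches instead of 32.
    trainer.fit(ToyModel(), loader)
```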
Answer selected by akihironitta