save/load deepspeed checkpoint #12132
Unanswered
Jiaxin-Wen
asked this question in
DDP / multi-GPU / multi-node
Replies: 1 comment 3 replies
-
Maybe try this: I think it applies to stage 2 as well.
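For illustration only, and not necessarily what the suggestion above pointed to: with the DeepSpeed strategy, Lightning typically writes each checkpoint as a directory of sharded states rather than a single file, and it ships a utility, `convert_zero_checkpoint_to_fp32_state_dict`, that collates such a directory into one file. A minimal sketch, assuming that is the situation here. The helper names `is_sharded_checkpoint` and `consolidate_if_sharded` are made up for this example, and the import path depends on the installed version (`lightning.pytorch` in newer releases, `pytorch_lightning` in older ones).

```python
from pathlib import Path


def is_sharded_checkpoint(path):
    """Return True if `path` is a directory, as DeepSpeed-style checkpoints usually are."""
    return Path(path).is_dir()


def consolidate_if_sharded(ckpt_path, output_file):
    """Return the path of a single-file checkpoint, converting a sharded one if needed.

    Hypothetical helper: a plain file is returned unchanged; a directory is
    collated into `output_file` with Lightning's conversion utility.
    """
    if not is_sharded_checkpoint(ckpt_path):
        # Already a single file, nothing to convert.
        return str(ckpt_path)

    # Imported lazily so the path check works even without Lightning installed.
    try:
        from lightning.pytorch.utilities.deepspeed import (
            convert_zero_checkpoint_to_fp32_state_dict,
        )
    except ImportError:
        # Assumption: older installs expose the same utility under this package.
        from pytorch_lightning.utilities.deepspeed import (
            convert_zero_checkpoint_to_fp32_state_dict,
        )

    convert_zero_checkpoint_to_fp32_state_dict(str(ckpt_path), str(output_file))
    return str(output_file)
```

The returned path could then be passed to `load_from_checkpoint` on your own LightningModule class; whether that resolves the error in this thread is untested.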
-
I've trained a T5 model with DeepSpeed stage 2, and PyTorch Lightning automatically saved the checkpoints as usual.
However, when I try to load the checkpoints, I get the following error: