Describe the bug
When I print model.parameters() inside the transformers Trainer, every layer shows up as Parameter containing: tensor([], device='cuda:0', dtype=torch.bfloat16, requires_grad=True).
What I actually want is to read the full model.parameters() under DeepSpeed ZeRO-3 so that I can maintain an EMA model. Could you suggest a way to solve the issue above, or another way to use an EMA model under ZeRO-3?
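For reference, here is a minimal sketch of the kind of thing I am after. It assumes that wrapping a parameter in deepspeed.zero.GatheredParameters (with modifier_rank=None, i.e. read-only) temporarily materializes the full tensor on every rank. The `gathered` helper and the `SimpleEMA` class are hypothetical names of my own, not part of transformers or DeepSpeed. Note that this keeps a full fp32 copy of the model on CPU on each rank, which may be too expensive for large models, and that every rank has to enter the context together because the gather is a collective operation.

```python
import contextlib

import torch

try:
    import deepspeed  # only needed when the model is ZeRO-3 partitioned
except ImportError:  # lets the sketch run on a machine without DeepSpeed
    deepspeed = None


def gathered(params):
    """Context manager that materializes ZeRO-3 partitioned parameters.

    Hypothetical helper: falls back to a no-op when DeepSpeed is not installed.
    """
    if deepspeed is None:
        return contextlib.nullcontext()
    # modifier_rank=None means read-only access: nothing is broadcast on exit.
    return deepspeed.zero.GatheredParameters(list(params), modifier_rank=None)


class SimpleEMA:
    """Hypothetical helper: keeps a full fp32 CPU copy of every parameter."""

    def __init__(self, model, decay=0.999):
        self.decay = decay
        self.shadow = {}
        for name, p in model.named_parameters():
            with gathered([p]):
                # clone() so the shadow never aliases the live parameter
                self.shadow[name] = p.detach().float().cpu().clone()

    @torch.no_grad()
    def update(self, model):
        for name, p in model.named_parameters():
            # Gather one parameter at a time to limit peak GPU memory.
            with gathered([p]):
                full = p.detach().float().cpu()
                # shadow = decay * shadow + (1 - decay) * current
                self.shadow[name].mul_(self.decay).add_(full, alpha=1.0 - self.decay)
```

The idea would be to call `ema.update(model)` after each optimizer step, for example from a Trainer callback. I am not sure this is the recommended pattern, so corrections are welcome.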
System Info
transformers 4.44.2
accelerate 1.2.1
deepspeed 0.12.2
torch 2.2.2
torchaudio 2.2.2
torchvision 0.17.2
Expected behavior
I expect to see the gathered (full) parameters rather than empty tensors.