Your question
I'm using tools/checkpoint/convert.py to convert a Llama model to the mcore model format for training. tools/checkpoint/loader_mcore.py supports loading a virtual pipeline model, but tools/checkpoint/saver_mcore.py doesn't support saving one.
Is there another way to do this conversion, or do I need to modify saver_mcore.py to support it? Perhaps by adding arguments such as target_num_layers_per_virtual_pipeline_stage and target_virtual_pipeline_model_parallel_size?
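To make the request concrete, here is a minimal sketch of the layer-to-chunk mapping a saver would have to produce for a virtual pipeline layout. It assumes the interleaved assignment where each pipeline rank holds several non-contiguous model chunks; the function name and its parameters are hypothetical and are not taken from saver_mcore.py.

```python
def virtual_pipeline_layer_map(num_layers, pp_size, vp_size):
    """Map (pp_rank, vp_rank) -> global layer indices for an interleaved layout.

    Hypothetical helper for illustration only; not part of saver_mcore.py.
    Assumes num_layers divides evenly across pp_size * vp_size model chunks.
    """
    if num_layers % (pp_size * vp_size) != 0:
        raise ValueError("num_layers must be divisible by pp_size * vp_size")

    layers_per_chunk = num_layers // (pp_size * vp_size)
    # Layers covered by one full sweep over all pipeline ranks.
    layers_per_sweep = num_layers // vp_size

    mapping = {}
    for vp_rank in range(vp_size):
        for pp_rank in range(pp_size):
            # Each sweep starts after all layers of the previous sweeps;
            # within a sweep, pipeline ranks take consecutive chunks.
            offset = vp_rank * layers_per_sweep + pp_rank * layers_per_chunk
            mapping[(pp_rank, vp_rank)] = list(range(offset, offset + layers_per_chunk))
    return mapping


if __name__ == "__main__":
    # 8 layers, 2 pipeline stages, 2 virtual stages per pipeline stage.
    for key, layers in sorted(virtual_pipeline_layer_map(8, 2, 2).items()):
        print(key, layers)
```

Under this assumption, a saver that only writes one contiguous block of layers per pipeline rank would place layers on the wrong rank for any virtual pipeline size greater than 1, which is why the extra target arguments would be needed.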
This discussion was converted from issue #1212 on October 23, 2024 21:04.