Loading fine-tuned model built from pretrained subnetworks #10152
-
https://pytorch-lightning.readthedocs.io/en/latest/common/weights_loading.html
-
https://forums.pytorchlightning.ai/t/save-load-model-for-inference/542
-
https://forums.pytorchlightning.ai/t/how-to-load-and-use-model-checkpoint-ckpt/677
-
https://forums.pytorchlightning.ai/t/saving-loading-lightningmodule-with-injected-network/394
-
https://forums.pytorchlightning.ai/t/saving-loading-the-model-for-inference-later/589
-
Hi @Programmer-RD-AI @awaelchli Thanks for the answers. I had already read the Lightning documentation on the basic usage of the loading functions for pretrained model checkpoints, but it doesn't answer my current problem. Among the links listed, the closest discussion to mine is https://forums.pytorchlightning.ai/t/saving-loading-lightningmodule-with-injected-network/394 However, that discussion was never answered, and I am still looking for best practices on this matter, i.e. handling model injection in the LightningModule class and restoring such models. I would highly appreciate it if you could comment on the situation I posted, or ask me for any additional details if I didn't explain the problem clearly enough. Thanks!
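For context, by "model injection" I mean passing an instantiated nn.Module into the LightningModule constructor. A minimal sketch of the pattern (LitWrapper and MyNet are hypothetical names, just for illustration, not from the linked discussion):

```python
import torch.nn as nn
import pytorch_lightning as pl


class MyNet(nn.Module):
    """Hypothetical backbone that gets injected into the LightningModule."""

    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(10, 2)

    def forward(self, x):
        return self.layer(x)


class LitWrapper(pl.LightningModule):
    def __init__(self, net: nn.Module):
        super().__init__()
        # the network is a constructor argument, so it cannot be rebuilt
        # from the saved hyper-parameters alone
        self.net = net

    def forward(self, x):
        return self.net(x)


# restoring requires re-injecting the network: load_from_checkpoint
# forwards extra kwargs to __init__, then loads the saved weights on top
model = LitWrapper.load_from_checkpoint("lit_wrapper.ckpt", net=MyNet())
```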
-
@adrienchaton If you use Model1.load_from_checkpoint(ckpt_1), the hyper-parameters are restored from the checkpoint itself, so you don't need the yaml file. One less path to worry about :)
I don't have a good answer for how to get the path of the right yaml file, but you can take the checkpoint path from the trainer: trainer.checkpoint_callback.best_model_path or trainer.checkpoint_callback.last_model_path. So together:

```python
# train model1
...
ckpt1 = trainer.checkpoint_callback.best_model_path

# train model2
...
ckpt2 = trainer.checkpoint_callback.best_model_path

# train the combined model
model = CombinedModel(ckpt1, ckpt2)
...
ckpt3 = trainer.checkpoint_callback.best_model_path
```

Let me know if that's useful for you.
-
Hello everyone,
I would like to ask for confirmation that I am getting the expected behaviour, and whether there are best practices to handle the following situation.
I have two LightningModules, call them model_1 and model_2, which I pretrain separately. After saving them, I get ckpt_1, yaml_1 and ckpt_2, yaml_2, which hold their trained parameters and hyper-parameters.
Now I put them together in a model, e.g. combined_model, and I fine-tune them on the task of the combined_model.
At the beginning of the fine-tuning I build the model as:
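A simplified sketch of what I do (Model1 and Model2 are the LightningModule classes of the pretrained subnetworks; the save_hyperparameters() call stands in for however the constructor arguments get recorded into yaml_3):

```python
import pytorch_lightning as pl


class CombinedModel(pl.LightningModule):
    def __init__(self, ckpt_1: str, ckpt_2: str):
        super().__init__()
        # the checkpoint paths become hyper-parameters, so they end up
        # stored inside ckpt_3 / yaml_3
        self.save_hyperparameters()
        # the subnetworks start from their pretrained weights
        # and stay trainable during the fine-tuning
        self.model_1 = Model1.load_from_checkpoint(ckpt_1)
        self.model_2 = Model2.load_from_checkpoint(ckpt_2)


combined_model = CombinedModel(ckpt_1, ckpt_2)
```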
→ combined_model optimizes the trainable parameters of model_1 and model_2, starting from the pretrained checkpoints, right?
After the fine-tuning is done, I have ckpt_3 and yaml_3, which give the fine-tuned parameters and the paths of the pretrained checkpoints used to build combined_model.
Usually I could just restore the fine-tuned model as:
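That is, a single call along these lines (sketch):

```python
# hyper-parameters (including ckpt_1 / ckpt_2) are read from ckpt_3 itself,
# then the fine-tuned state_dict is loaded on top of the rebuilt model
combined_model = CombinedModel.load_from_checkpoint(ckpt_3)
```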
The problem I have is that I work with remote servers: the paths change between the fine-tuning run and a later test run, so in the end yaml_3 points to the wrong paths for ckpt_1, yaml_1 and ckpt_2, yaml_2 when I want to restore the fine-tuned combined_model.
What I do then is manually specify these new paths ckpt_1bis, yaml_1bis and ckpt_2bis, yaml_2bis in the loading call:
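Again a sketch: I pass the relocated paths as extra kwargs, which load_from_checkpoint forwards to __init__, overriding the stale paths stored in the checkpoint's hyper-parameters:

```python
# __init__ first loads the pretrained weights from the *bis checkpoints,
# then load_from_checkpoint overwrites all parameters with the
# fine-tuned state_dict stored in ckpt_3
combined_model = CombinedModel.load_from_checkpoint(
    ckpt_3,
    ckpt_1=ckpt_1bis,
    ckpt_2=ckpt_2bis,
)
```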
→ in this case, am I for sure loading the fine-tuned weights of ckpt_3, and not the pretrained weights of ckpt_1bis and ckpt_2bis?
I think so, but I would like to be sure. Also, are there any recommended ways to better handle this situation?
Thanks!