Hello, author. First of all, thank you for your work on MoE-PEFT. I ran into a problem when fine-tuning with MoE-PEFT. The `moe_peft.json` configuration I used is as follows:

The model I chose is `Meta-Llama-3-8B-Instruct`. I printed the model's trainable parameters, and the output is as follows:

Your code prints like this:
But the code in your `trainer.py` seems to print the total trainable parameters for each expert:

With the parameters above, my training still runs out of GPU memory on a single 24 GB 4090, which seems a bit abnormal. Is there a good solution for this?
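To be clear about which number I mean: by "trainable parameters" I mean the overall count obtained from the usual PyTorch sum over parameters with `requires_grad=True`. A minimal, generic sketch of that check (hand-written for illustration, not code taken from this repository's `trainer.py`):

```python
# Generic sanity check of trainable vs. total parameters for any torch model.
# Hand-written sketch for illustration; not taken from MoE-PEFT itself.
import torch


def count_parameters(model: torch.nn.Module) -> tuple[int, int]:
    """Return (trainable, total) parameter counts."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable, total


# Usage (model construction omitted):
# trainable, total = count_parameters(model)
# print(f"trainable: {trainable:,} / total: {total:,} "
#       f"({100 * trainable / total:.4f}%)")
```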
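For what it's worth, my rough memory arithmetic (assuming bf16 base weights and AdamW states only on the adapter parameters; all numbers are estimates, not measurements) already leaves little headroom on a 24 GB card before activations are counted:

```python
# Back-of-the-envelope GPU memory estimate for adapter fine-tuning of an ~8B model.
# All figures below are rough assumptions for illustration, not measured values.
BASE_PARAMS = 8.0e9        # ~8B frozen base-model parameters
ADAPTER_PARAMS = 50.0e6    # assumed number of trainable adapter parameters

BYTES_BF16 = 2
BYTES_FP32 = 4

weights = BASE_PARAMS * BYTES_BF16           # frozen weights kept in bf16
grads = ADAPTER_PARAMS * BYTES_BF16          # gradients only for the adapters
optimizer = ADAPTER_PARAMS * BYTES_FP32 * 3  # AdamW: two fp32 moments + fp32 master copy

total_gib = (weights + grads + optimizer) / 1024**3
print(f"static memory before activations: ~{total_gib:.1f} GiB")
# The frozen bf16 weights alone are ~15 GiB, so activations, attention buffers,
# and fragmentation have to fit into the remaining few GiB of a 24 GB card.
```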
Looking forward to your answer and thank you!