Open
Labels
bug (Something isn't working)
Description
Describe the bug
TE RMS Norm is numerically unstable in our experiments. We need to find the right version/setup for better accuracy.
Steps/Code to reproduce bug
Examples:
- Qwen2.5-7B: https://wandb.ai/nvidia/automodel-dev-zhiyul?nw=ouoluvgy6a
- llama3.2-3B: https://wandb.ai/nvidia/automodel-dev-zhiyul/workspace?nw=7nq9jmggxxm
The two experiments above are valid and converge only with TE RMS Norm disabled.
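To narrow this down outside a full training run, one option is to compare a norm implementation's output against a high-precision reference on the same input. The following is a minimal sketch in plain Python, not TE's implementation and not a confirmed root cause. The names `rmsnorm_reference`, `rmsnorm_fp16_everywhere`, `to_fp16`, and `max_abs_error` are hypothetical helpers introduced here, and the fp16 variant is only a stand-in candidate to show that the comparison detects precision loss. The `eps` value and hidden size are assumptions as well.

```python
import math
import random
import struct


def rmsnorm_reference(x, weight, eps=1e-6):
    """Reference RMSNorm in float64: y_i = x_i / sqrt(mean(x^2) + eps) * w_i."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def to_fp16(v):
    """Round a Python float to IEEE half precision and back (stdlib only)."""
    return struct.unpack("e", struct.pack("e", v))[0]


def rmsnorm_fp16_everywhere(x, weight, eps=1e-6):
    """Stand-in candidate (assumption): every intermediate is rounded to fp16,
    including the running sum of squares. Not how any real kernel is claimed to work."""
    acc = 0.0
    for v in x:
        acc = to_fp16(acc + to_fp16(v * v))
    mean_sq = to_fp16(acc / len(x))
    inv_rms = to_fp16(1.0 / math.sqrt(mean_sq + eps))
    return [to_fp16(to_fp16(v * inv_rms) * w) for v, w in zip(x, weight)]


def max_abs_error(a, b):
    """Largest element-wise absolute difference between two outputs."""
    return max(abs(p - q) for p, q in zip(a, b))


# Deterministic input with an assumed hidden size of 4096.
rng = random.Random(0)
hidden = 4096
x = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
weight = [1.0] * hidden

ref = rmsnorm_reference(x, weight)
cand = rmsnorm_fp16_everywhere(x, weight)
err = max_abs_error(ref, cand)
print(f"max abs error vs. float64 reference: {err:.6f}")
```

In a real check, the stand-in candidate would be replaced by the output of the norm layer under test, evaluated on the same input and weights, and the error would be compared between the enabled and disabled configurations.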
A helpful guide on how to craft a minimal bug report: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports.
Expected behavior
Training with TE RMS Norm enabled should match the convergence of the runs with it disabled.