TE RMS Norm numerical instability #1432

@ZhiyuLi-Nvidia

Description

Describe the bug

TE RMS Norm was found to be numerically unstable in experiments. The goal is to identify the version/setup that gives better accuracy.

Steps/Code to reproduce bug

Examples:

The two experiments above are valid and converge only with TE RMS Norm disabled.

A helpful guide on how to craft a minimal bug report: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports.
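Since the experiment examples are not reproduced in this report, a minimal standalone check may help narrow things down. The sketch below is only an illustration of how precision in the RMS reduction can affect the output: it does not call TE and does not reflect TE's actual kernel. The names `to_fp16`, `rms_norm_ref`, `rms_norm_fp16`, and `max_abs_diff` are hypothetical helpers, and the input data is synthetic. It compares a float64 reference RMSNorm against a variant that rounds every intermediate to half precision.

```python
import math
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def rms_norm_ref(x, weight, eps=1e-5):
    """Reference RMSNorm using Python floats (float64) throughout."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def rms_norm_fp16(x, weight, eps=1e-5):
    """Same formula, but every intermediate is rounded to fp16.

    Hypothetical low-precision variant for illustration only;
    this is an assumption, not how TE implements RMS Norm.
    """
    acc = 0.0
    for v in x:
        # Small squares can be swallowed once the accumulator grows.
        acc = to_fp16(acc + to_fp16(v * v))
    mean_sq = to_fp16(acc / len(x))
    inv_rms = to_fp16(1.0 / math.sqrt(mean_sq + eps))
    return [to_fp16(to_fp16(v * inv_rms) * w) for v, w in zip(x, weight)]


def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two outputs."""
    return max(abs(p - q) for p, q in zip(a, b))


if __name__ == "__main__":
    n = 1024
    x = [0.01 * ((i % 7) - 3) for i in range(n)]  # synthetic activations
    w = [1.0] * n                                 # unit scale weights
    diff = max_abs_diff(rms_norm_ref(x, w), rms_norm_fp16(x, w))
    print("max abs diff vs float64 reference:", diff)
```

The same comparison pattern (one trusted high-precision reference, one candidate, one error metric) could be applied to the outputs of the real layer with TE RMS Norm enabled and disabled, to see whether the divergence already shows up in a single forward pass.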

Expected behavior

Convergence with TE RMS Norm enabled should match convergence with it disabled.

Metadata

Labels

bug (Something isn't working)
