benchmark_litgpt - update low_precision_mode accepted values #1779

Status: Open. Wants to merge 1 commit into base: main.
Conversation

kshitij12345 (Collaborator):

With Transformer Engine (TE) 2.0, the default FP8 recipe is chosen based on the platform:

https://github.com/NVIDIA/TransformerEngine/blob/b39397c541292f336c5964dd1661d80c08dc4c78/transformer_engine/pytorch/fp8.py#L46-L51

- For B200 (Blackwell), it is MXFP8BlockScaling.
- For H100 and others, it is DelayedScaling.
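The platform-based selection described above can be sketched in terms of CUDA compute capability (a minimal illustration; the function name and the exact threshold check are assumptions for this sketch, not TE's actual code):

```python
def default_fp8_recipe_name(compute_capability):
    """Return the default FP8 recipe name for a (major, minor)
    CUDA compute capability tuple, mirroring the platform-based
    selection described above (hypothetical sketch, not TE code)."""
    major, _minor = compute_capability
    if major >= 10:  # Blackwell, e.g. B200 (sm_100)
        return "MXFP8BlockScaling"
    return "DelayedScaling"  # Hopper (H100, sm_90) and earlier
```

In practice the capability would come from `torch.cuda.get_device_capability()`; passing it in explicitly keeps the sketch testable without a GPU.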

This PR updates the benchmark_litgpt.py script so that the accepted values for --low-precision-mode are renamed from fp8-delayed-te and fp8-delayed-te-wo_layernorm to fp8-default-te and fp8-default-te-wo_layernorm, hinting that the recipe is now the platform-dependent TE default rather than always DelayedScaling.
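As a rough illustration, the renamed choices might be declared like this in an argparse-based CLI (the flag and value names are taken from the description above; the parser setup itself and the "none" default are hypothetical, and the actual script may define its options differently):

```python
import argparse

parser = argparse.ArgumentParser(description="LitGPT benchmark (sketch)")
# Hypothetical flag definition showing the renamed accepted values.
parser.add_argument(
    "--low-precision-mode",
    choices=[
        "none",
        "fp8-default-te",
        "fp8-default-te-wo_layernorm",
    ],
    default="none",
    help="FP8 mode; the TE recipe used is the platform default.",
)

# argparse maps --low-precision-mode to args.low_precision_mode.
args = parser.parse_args(["--low-precision-mode", "fp8-default-te"])
```

Passing the old value fp8-delayed-te would now fail argparse's choices validation, surfacing the rename to users immediately.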

@kshitij12345 kshitij12345 changed the title update: low_precision_mode accepted values benchmark_litgpt - update low_precision_mode accepted values Feb 18, 2025
kshitij12345 (Collaborator, Author):

cc: @AdamRajfer

@kshitij12345 kshitij12345 marked this pull request as ready for review March 3, 2025 19:16