fix regression issue and command in mix-precision example #2317
base: master
Conversation
Signed-off-by: He, Xin3 <[email protected]>
PR Reviewer Guide 🔍
Here are some key observations to aid the review process:

PR Code Suggestions ✨
Signed-off-by: He, Xin3 <[email protected]>
Better to change the AR dependency to the pip-released v0.8 version after AR v0.8 is released.
parser.add_argument("--device_map", type=str, default=None, help="device map for model") | ||
parser.add_argument("--use_recipe", action="store_true", help="whether to use recipe to quantize model") | ||
parser.add_argument("--recipe_file", type=str, default="recipes/Meta-Llama-3.1-8B-Instruct_6bits.json", help="path of recipe file") | ||
parser.add_argument("--mem_per_param_scale", default=13, type=int, help="memory per param scale factor") |
I don't see this arg used in the example. Is it added for further tuning, and is there any guideline on how users should set the value?
Yes, it's for the Llama 3.3 70B pipeline-parallel run. It's added in case a user wants to run 70B without TP.
That's not the suggested way, though; the suggested way is to use the main branch with my compile fix, so I intend not to introduce it in the example.
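For context, a scale factor like this usually feeds a rough memory budget when deciding how layers are placed across devices. A minimal sketch of that idea, with a hypothetical helper name (not the actual implementation):

```python
# Hypothetical illustration only: reserve roughly mem_per_param_scale bytes
# per parameter when estimating how much device memory a layer needs during
# quantization. The function name and formula are assumptions.
def estimate_layer_memory(num_params: int, mem_per_param_scale: int = 13) -> int:
    return num_params * mem_per_param_scale
```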
The merge can wait until the binary is published.
User description
Type of Change
bug fix
PR Type
Enhancement, Bug fix
Description
- Added mem_per_param_scale and enable_torch_compile arguments
- Updated dtype handling for uNVFP4 and NVFP4+
- Fixed regression issues in dtype mapping and layer configuration (see the sketch after this list)
- Updated README to include enable_torch_compile in the example command
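To illustrate the dtype-mapping part of the fix, the sketch below shows the general shape of an alias table for the new names; the dtype keys come from this PR, while the internal identifiers and function are assumptions:

```python
# Illustrative only: normalize user-facing dtype names to internal ones.
# The internal identifiers here are assumptions, not the library's table.
DTYPE_ALIASES = {
    "uNVFP4": "nvfp4_unsigned",
    "NVFP4+": "nvfp4_plus",
}

def resolve_dtype(name: str) -> str:
    # Fall back to the raw name when no alias is registered.
    return DTYPE_ALIASES.get(name, name)
```

Diagram Walkthrough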
File Walkthrough
quantize.py
Add mem_per_param_scale and enable_torch_compile
examples/pytorch/nlp/huggingface_models/language-modeling/quantization/mix-precision/quantize.py
- Added mem_per_param_scale and enable_torch_compile arguments
- Updated dtype handling for uNVFP4 and NVFP4+
README.md
Update README with enable_torch_compile
examples/pytorch/nlp/huggingface_models/language-modeling/quantization/mix-precision/README.md
- Added enable_torch_compile to the example command
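For reference, the flag added to the README command typically gates a torch.compile wrap like the one below; this is a sketch of the assumed effect, not the example's actual code:

```python
import torch

def maybe_compile(model: torch.nn.Module, enable_torch_compile: bool) -> torch.nn.Module:
    # Assumed effect of --enable_torch_compile: compile the model before
    # quantization; otherwise return it unchanged.
    return torch.compile(model) if enable_torch_compile else model
```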