
Conversation

Contributor

@Kaihui-intel Kaihui-intel commented Oct 15, 2025

User description

Type of Change

bug fix

Description

Fix `int_weight` being returned as its zero initialization, `torch.zeros(weight.shape).to(device)`, instead of the quantized values.

Expected Behavior & Potential Risk

`int_weight` now holds the quantized NF4/FP4 values; without the fix it was returned as all zeros.

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed


PR Type

Bug fix


Description

  • Fix incorrect assignment of quantized weights for NF4 & FP4

  • Ensure int_weight is correctly updated with quantized values


Diagram Walkthrough

```mermaid
flowchart LR
  A["Initialize int_weight"] -- "Loop through groups" --> B["Quantize group"]
  B -- "Update int_weight for NF4/FP4" --> C["Copy quantized values"]
  C -- "Handle tail group" --> D["Quantize tail group"]
  D -- "Update int_weight for NF4/FP4" --> E["Copy tail quantized values"]
```
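The walkthrough above can be sketched as a minimal group-wise quantization loop. This is a hedged illustration, not the actual `utility.py` code: `quantize_group` is a hypothetical stand-in for the real NF4/FP4 quantizer (here it simply rounds), and only the copy-back pattern mirrors the fix described in this PR.

```python
import torch

def quantize_group(block: torch.Tensor) -> torch.Tensor:
    # Stand-in quantizer: round to nearest value (placeholder for the
    # real NF4/FP4 lookup-table quantization).
    return torch.round(block)

def quant_weight_groupwise(weight: torch.Tensor, group_size: int) -> torch.Tensor:
    out_ch, in_ch = weight.shape
    int_weight = torch.zeros_like(weight)  # starts as all zeros
    leng = in_ch // group_size
    for i in range(leng):
        # Quantize one full group of columns...
        block = weight[:, i * group_size : (i + 1) * group_size]
        # ...and copy the result back into the matching slice, so the
        # zero initialization is actually overwritten (the fix).
        int_weight[:, i * group_size : (i + 1) * group_size].copy_(quantize_group(block))
    if in_ch % group_size:
        # Tail group: the columns beyond the last full group.
        tail = weight[:, leng * group_size :]
        int_weight[:, leng * group_size :].copy_(quantize_group(tail))
    return int_weight
```

Without the two `copy_` calls, `int_weight` would be returned unchanged from its `torch.zeros` initialization, which is the bug this PR addresses.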

File Walkthrough

Relevant files

- Bug fix: `neural_compressor/torch/algorithms/weight_only/utility.py` (+2/-0)
  - Correct weight assignment in `quant_weight_w_scale`
  - Added correct assignment of quantized weights to `int_weight` for NF4 & FP4
  - Ensured `int_weight` is updated with quantized values in both loop and tail handling

Signed-off-by: Kaihui-intel <[email protected]>
@PRAgent4INC
Collaborator

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Incorrect Indexing

The indexing in the new lines seems incorrect. The slice int_weight[:, leng * group_size :] is used twice, which will always point to the tail part of the tensor, not the current group being processed.

int_weight[:, leng * group_size :].copy_(int_weight_tmp)
Incorrect Indexing

Same as the first issue: the second new line reuses the slice int_weight[:, leng * group_size :], which always points to the tail of the tensor rather than the group being processed.

int_weight[:, leng * group_size :].copy_(int_weight_tmp)
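To make the reviewer's point concrete, here is a hedged sketch with simplified shapes and arbitrary values; `int_weight_tmp` stands in for one group's quantized output. It contrasts the flagged always-tail slice with a destination slice that tracks the loop index:

```python
import torch

# Simplified setup: 1 output channel, 6 input channels, groups of 2 (no tail).
out_ch, in_ch, group_size = 1, 6, 2
leng = in_ch // group_size
int_weight = torch.zeros(out_ch, in_ch)

for i in range(leng):
    # Stand-in for one group's quantized output (values are arbitrary).
    int_weight_tmp = torch.full((out_ch, group_size), float(i + 1))
    # Flagged pattern: int_weight[:, leng * group_size :] always addresses the
    # tail columns (an empty slice here), never the group being processed.
    # A per-group destination tracks the loop index instead:
    int_weight[:, i * group_size : (i + 1) * group_size].copy_(int_weight_tmp)

# Each group's values land in its own columns: [[1., 1., 2., 2., 3., 3.]]
```

The tail-only slice `int_weight[:, leng * group_size :]` is appropriate once, after the loop, for the columns beyond the last full group.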

@PRAgent4INC
Collaborator

PR Code Suggestions ✨

Signed-off-by: Kaihui-intel <[email protected]>


3 participants