
[CPU] Introduce Int4WoqCpuTensor to replace Int4CPULayout in AQT #2798


Status: Open (4 commits, base branch: main)

Xia-Weiwen
Collaborator

Summary
This PR adds Int4WoqCpuTensor to replace the AffineQuantizedTensor (AQT) that uses Int4CPULayout, since AQT will be deprecated.

Test plan

pytest -sv test/quantization/quantize_/workflows/int4/test_int4_woq_cpu_tensor.py


pytorch-bot bot commented Aug 19, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2798

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 9012a61 with merge base 72b35bf:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Aug 19, 2025
@Xia-Weiwen Xia-Weiwen requested a review from Copilot August 19, 2025 03:19
@Xia-Weiwen Xia-Weiwen added the topic: new feature label Aug 19, 2025

@Xia-Weiwen Xia-Weiwen requested a review from Copilot August 19, 2025 03:26
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This PR introduces Int4WoqCpuTensor as a replacement for the AQT tensor with Int4CPULayout. The new tensor supports groupwise int4 weight-only quantization on CPU.

Key changes:

  • Adds new Int4WoqCpuTensor class for CPU-specific int4 weight-only quantization
  • Integrates the new tensor type into the quantization API and workflow system
  • Adds comprehensive test coverage for the new tensor implementation
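For readers unfamiliar with the term, groupwise int4 weight-only quantization can be sketched in plain Python as below. This is an illustrative sketch only and is not the code in this PR: the function names, the use of the group minimum as the offset, and the low-nibble-first packing order are all assumptions chosen for clarity. The actual Int4WoqCpuTensor layout and zero-point convention may differ.

```python
from typing import List, Tuple

QMAX = 15  # largest unsigned 4-bit value


def quantize_groupwise(
    weights: List[float], group_size: int
) -> Tuple[List[int], List[float], List[float]]:
    """Quantize to 4-bit codes with one (scale, min) pair per group.

    Hypothetical helper for illustration; not part of torchao.
    """
    if len(weights) % group_size != 0:
        raise ValueError("len(weights) must be a multiple of group_size")
    qvals, scales, mins = [], [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # A constant group has no range; any non-zero scale works.
        scale = (hi - lo) / QMAX if hi > lo else 1.0
        scales.append(scale)
        mins.append(lo)
        for w in group:
            q = round((w - lo) / scale)
            qvals.append(max(0, min(QMAX, q)))  # clamp to [0, 15]
    return qvals, scales, mins


def pack_int4(qvals: List[int]) -> bytes:
    """Pack two 4-bit codes per byte (assumed order: low nibble first)."""
    if len(qvals) % 2 != 0:
        raise ValueError("need an even number of 4-bit values")
    return bytes(qvals[i] | (qvals[i + 1] << 4) for i in range(0, len(qvals), 2))


def unpack_int4(packed: bytes) -> List[int]:
    """Inverse of pack_int4."""
    out = []
    for b in packed:
        out.append(b & 0x0F)
        out.append(b >> 4)
    return out


def dequantize_groupwise(
    qvals: List[int], scales: List[float], mins: List[float], group_size: int
) -> List[float]:
    """Map 4-bit codes back to floats using each group's scale and min."""
    return [
        q * scales[i // group_size] + mins[i // group_size]
        for i, q in enumerate(qvals)
    ]


weights = [0.1, -0.4, 0.25, 0.9, 1.5, 1.0, 2.0, 1.25]
qvals, scales, mins = quantize_groupwise(weights, group_size=4)
packed = pack_int4(qvals)
restored = dequantize_groupwise(unpack_int4(packed), scales, mins, group_size=4)
```

Only the weights are quantized ("weight-only"); activations stay in floating point. Smaller groups give each scale a narrower range to cover, which lowers the rounding error at the cost of storing more scales.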

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| torchao/quantization/quantize_/workflows/int4/int4_woq_cpu_tensor.py | Implements the core Int4WoqCpuTensor class with CPU-optimized int4 quantization |
| torchao/quantization/quantize_/workflows/__init__.py | Adds export for the new tensor class |
| torchao/quantization/quantize_/common/packing_format.py | Adds INT4_WOQ_CPU packing format enum value |
| torchao/quantization/quant_api.py | Integrates new tensor into quantization workflow |
| torchao/quantization/__init__.py | Adds public API export for the tensor class |
| test/quantization/quantize_/workflows/int4/test_int4_woq_cpu_tensor.py | Comprehensive test suite for the new tensor implementation |


@Xia-Weiwen Xia-Weiwen marked this pull request as ready for review August 19, 2025 03:30
@Xia-Weiwen Xia-Weiwen requested a review from jerryzh168 August 19, 2025 03:30