Implement an AWQ algorithm with dynamic activation quantization for ExecuTorch #2388

@metascroy

Description

TorchAO has implemented AWQ, and we'd like to extend the implementation to cover dynamic activation quantization to support lowering to ExecuTorch.

  1. We should modify the existing AWQ algorithm to support QDQLayout here, which is required for lowering to ExecuTorch.

  2. The scales should have 8-bit dynamic activation quantization applied before computing the AWQ scaling.

With the above changes, we should be able to quantize a model with AWQ and lower it to ExecuTorch following instructions similar to here.
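
To make item 2 concrete, here is a minimal, framework-free sketch of an AWQ-style scale search in which the scaled activations are fake-quantized to 8 bits dynamically before the loss is measured. It is one reading of the request, not the TorchAO implementation: every function name (`dynamic_quant_int8`, `quant_weight_int4`, `awq_search`) is hypothetical, and the symmetric per-token int8 activations, group-wise symmetric int4 weights, and grid over an exponent `ratio` are all assumptions made for illustration.

```python
import random


def dynamic_quant_int8(row):
    """Fake-quantize one activation row (one token) to int8.

    The scale is computed from the row itself at call time, which is what
    makes the quantization "dynamic". The symmetric range is an assumption.
    """
    max_abs = max(abs(v) for v in row)
    if max_abs == 0.0:
        return list(row)
    scale = max_abs / 127.0
    return [max(-128, min(127, round(v / scale))) * scale for v in row]


def quant_weight_int4(row, group_size):
    """Fake-quantize one weight row to symmetric int4, one scale per group."""
    out = []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        max_abs = max(abs(v) for v in group)
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        out.extend(max(-8, min(7, round(v / scale))) * scale for v in group)
    return out


def linear(x, w):
    """Compute x @ w.T for x: [tokens][in_features], w: [out][in_features]."""
    return [[sum(a * b for a, b in zip(xr, wr)) for wr in w] for xr in x]


def mse(a, b):
    """Mean squared error between two equally shaped nested lists."""
    diffs = [(p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def awq_search(x, w, group_size=4, n_grid=20):
    """Grid-search per-input-channel scales s = act_mag ** ratio.

    For each candidate, weights are multiplied by s and quantized, while
    activations are divided by s and then dynamically quantized to int8,
    so the loss reflects both sources of quantization error.
    """
    n_tokens, in_features = len(x), len(w[0])
    # Mean absolute activation per input channel (the AWQ saliency signal).
    act_mag = [sum(abs(row[c]) for row in x) / n_tokens for c in range(in_features)]
    reference = linear(x, w)  # full-precision output to match
    losses, best_ratio, best_scales = [], None, None
    for i in range(n_grid):
        ratio = i / n_grid  # ratio 0 gives s = 1, i.e. no AWQ scaling
        scales = [max(m, 1e-4) ** ratio for m in act_mag]
        w_q = [quant_weight_int4([wv * s for wv, s in zip(row, scales)], group_size)
               for row in w]
        x_q = [dynamic_quant_int8([xv / s for xv, s in zip(row, scales)])
               for row in x]
        loss = mse(linear(x_q, w_q), reference)
        if not losses or loss < min(losses):
            best_ratio, best_scales = ratio, scales
        losses.append(loss)
    return best_ratio, best_scales, losses


# Toy calibration data: channel 0 carries much larger activations.
random.seed(0)
x = [[random.gauss(0, 1) * (8.0 if c == 0 else 1.0) for c in range(8)] for _ in range(16)]
w = [[random.gauss(0, 1) for _ in range(8)] for _ in range(4)]
ratio, scales, losses = awq_search(x, w)
print("best ratio:", ratio, "loss without scaling:", losses[0], "best loss:", min(losses))
```

Because the first grid point uses scales of 1, the selected loss can never be worse than quantizing without AWQ scaling on the calibration data. A real implementation would run this per linear layer on tensors and emit the chosen scales together with the quantized weights in the target layout.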
