diff --git a/docs/en/advanced/low-precision.md b/docs/en/advanced/low-precision.md
index 1341266fc..50affb12f 100644
--- a/docs/en/advanced/low-precision.md
+++ b/docs/en/advanced/low-precision.md
@@ -87,6 +87,7 @@ First, download the PTQ (Post-Training Quantization) calibration dataset from Hu
 Next, use the `tools/convert_hf_to_int4.py` script to convert BF16 weights to INT4 format. Ensure that the `--hf-checkpoint` parameter points to a directory where `config.json` contains the correct `quantization_config`. slime will automatically utilize INT4 quantization during weight updates.
 
 ```bash
+pip install llmcompressor
 python tools/convert_hf_to_int4.py \
     --input-dir /path/to/your/original/models \
     --output-dir /path/to/your/save/models \
diff --git a/docs/zh/advanced/low-precision.md b/docs/zh/advanced/low-precision.md
index e076661a1..208bce7c9 100644
--- a/docs/zh/advanced/low-precision.md
+++ b/docs/zh/advanced/low-precision.md
@@ -75,6 +75,7 @@ bash scripts/low_precision/run-qwen3-30b-a3b-fp8.sh
 [wikitext-2-raw-v1](https://huggingface.co/datasets/Salesforce/wikitext/tree/main/wikitext-2-raw-v1)
 接着，使用 `tools/convert_hf_to_int4.py` 脚本进行转换。确保 `--hf-checkpoint` 指向的目录中 `config.json` 包含正确的 `quantization_config`。
 ```bash
+pip install llmcompressor
 python tools/convert_hf_to_int4.py \
     --input-dir /path/to/your/original/models \
     --output-dir /path/to/your/save/models \
diff --git a/tools/convert_hf_to_int4.py b/tools/convert_hf_to_int4.py
index ba76a987f..2ac27fff1 100644
--- a/tools/convert_hf_to_int4.py
+++ b/tools/convert_hf_to_int4.py
@@ -77,6 +77,7 @@ def main():
         "re:.*embed.*",
         "re:.*self_attn.*",
         "re:.*shared_experts.*",
+        "re:.*mlp\\.gate.*",
         "re:.*mlp\\.(gate|up|gate_up|down)_proj.*",
     ]
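
The added ignore entry keeps the MoE router gate (`mlp.gate`) out of INT4 quantization; llmcompressor treats `re:`-prefixed ignore entries as regular expressions matched against module names. A minimal sketch of the matching behavior — the module names below are illustrative, modeled on typical MoE layer naming, not taken from any specific checkpoint:

```python
import re

# Illustrative module names (assumption: actual names depend on the model).
names = [
    "model.layers.0.mlp.gate",               # MoE router gate
    "model.layers.0.mlp.gate_proj",          # dense MLP projection
    "model.layers.0.mlp.experts.3.up_proj",  # expert projection
]

# The new ignore entry, with the "re:" prefix stripped.
pattern = re.compile(r".*mlp\.gate.*")

matched = [n for n in names if pattern.match(n)]
print(matched)
# ['model.layers.0.mlp.gate', 'model.layers.0.mlp.gate_proj']
```

Note that `.*mlp\.gate.*` also matches `gate_proj`/`gate_up_proj`, which overlaps with the existing `re:.*mlp\.(gate|up|gate_up|down)_proj.*` entry; in an ignore list that overlap is harmless.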