1 change: 1 addition & 0 deletions docs/en/advanced/low-precision.md
@@ -87,6 +87,7 @@ First, download the PTQ (Post-Training Quantization) calibration dataset from Hu
Next, use the `tools/convert_hf_to_int4.py` script to convert BF16 weights to INT4 format. Ensure that the `--hf-checkpoint` parameter points to a directory where `config.json` contains the correct `quantization_config`. slime will automatically utilize INT4 quantization during weight updates.

```bash
pip install llmcompressor
python tools/convert_hf_to_int4.py \
--input-dir /path/to/your/original/models \
--output-dir /path/to/your/save/models \
1 change: 1 addition & 0 deletions docs/zh/advanced/low-precision.md
@@ -75,6 +75,7 @@ bash scripts/low_precision/run-qwen3-30b-a3b-fp8.sh
[wikitext-2-raw-v1](https://huggingface.co/datasets/Salesforce/wikitext/tree/main/wikitext-2-raw-v1)
Next, use the `tools/convert_hf_to_int4.py` script to perform the conversion. Ensure that `config.json` in the directory pointed to by `--hf-checkpoint` contains the correct `quantization_config`.
```bash
pip install llmcompressor
python tools/convert_hf_to_int4.py \
--input-dir /path/to/your/original/models \
--output-dir /path/to/your/save/models \
1 change: 1 addition & 0 deletions tools/convert_hf_to_int4.py
@@ -77,6 +77,7 @@ def main():
"re:.*embed.*",
"re:.*self_attn.*",
"re:.*shared_experts.*",
"re:.*mlp\\.gate.*",
"re:.*mlp\\.(gate|up|gate_up|down)_proj.*",
]
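
To see which modules the ignore entries in this hunk would exclude from INT4 quantization, the sketch below checks a few module names against them. It assumes the `re:` prefix marks a regular expression that is matched from the start of the module name, which is how llmcompressor-style ignore lists are commonly interpreted. The `is_ignored` helper and the sample module names are hypothetical, and the list contains only the entries visible in this hunk, so the real script may ignore more.

```python
import re

# Ignore entries copied from the visible part of the hunk above.
# The full list in tools/convert_hf_to_int4.py may contain more entries.
IGNORE_PATTERNS = [
    "re:.*embed.*",
    "re:.*self_attn.*",
    "re:.*shared_experts.*",
    "re:.*mlp\\.gate.*",
    "re:.*mlp\\.(gate|up|gate_up|down)_proj.*",
]


def is_ignored(module_name, patterns=IGNORE_PATTERNS):
    """Return True if module_name matches any ignore entry.

    Assumption: entries prefixed with "re:" are regexes matched from the
    start of the name; any other entry must equal the name exactly.
    """
    for pattern in patterns:
        if pattern.startswith("re:"):
            if re.match(pattern[len("re:"):], module_name):
                return True
        elif pattern == module_name:
            return True
    return False


# Hypothetical module names for a MoE-style model.
samples = [
    "model.embed_tokens",
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.mlp.gate",                      # MoE router
    "model.layers.0.mlp.gate_proj",                 # dense MLP projection
    "model.layers.0.mlp.shared_experts.down_proj",
    "model.layers.0.mlp.experts.3.up_proj",         # routed expert
]

for name in samples:
    status = "ignored" if is_ignored(name) else "quantized"
    print(f"{name}: {status}")
```

Under these assumptions, only the routed-expert projection (`mlp.experts.3.up_proj`) falls through every entry and would be quantized; the router, dense MLP projections, attention, embeddings, and shared experts are all skipped.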