Add instruction block for converting gguf file to irpa file (#924)
Add a section to `llama_serving` that tells users how to convert from
`gguf` to `irpa`
stbaione authored Feb 6, 2025
1 parent d33ef0b commit 3cccc20
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions docs/shortfin/llm/user/llama_serving.md
@@ -112,6 +112,14 @@ python -m sharktank.utils.hf_datasets llama3_8B_fp16 --local-dir $EXPORT_DIR
> python3 convert_hf_to_gguf.py $WEIGHTS_DIR --outtype f16 --outfile $EXPORT_DIR/<output_gguf_name>.gguf
> ```
> Now this GGUF file can be used in the instructions ahead.
>
> If you would like to convert the model from a [`.gguf`](https://iree.dev/guides/parameters/#gguf)
> file to a [`.irpa`](https://iree.dev/guides/parameters/#irpa) file, you can
> use our [`sharktank.tools.dump_gguf`](https://github.com/nod-ai/shark-ai/blob/main/sharktank/sharktank/tools/dump_gguf.py)
> script:
> ```bash
> python -m sharktank.tools.dump_gguf --gguf-file $EXPORT_DIR/<output_gguf_name>.gguf --save $EXPORT_DIR/<output_irpa_name>.irpa
> ```
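
The conversion above can be wrapped in a small script that derives the `.irpa` path from the `.gguf` path and only runs when the input file exists. This is a minimal sketch, not part of the commit: the helper `irpa_path_for`, the file name `model.gguf`, and the fallback directory `/tmp/export` are hypothetical placeholders, while the `--gguf-file` and `--save` flags are taken from the command above.

```bash
# Hypothetical helper: swap a trailing .gguf extension for .irpa.
irpa_path_for() {
  printf '%s\n' "${1%.gguf}.irpa"
}

# Placeholder input path; substitute your own <output_gguf_name>.gguf.
# /tmp/export is an assumed fallback used only when EXPORT_DIR is unset.
GGUF_FILE="${EXPORT_DIR:-/tmp/export}/model.gguf"
IRPA_FILE="$(irpa_path_for "$GGUF_FILE")"

# Run the conversion only when the GGUF file is actually present.
if [ -f "$GGUF_FILE" ]; then
  python -m sharktank.tools.dump_gguf --gguf-file "$GGUF_FILE" --save "$IRPA_FILE"
else
  echo "skipping conversion: $GGUF_FILE not found" >&2
fi
```

Deriving the output name from the input keeps the two files side by side in `$EXPORT_DIR`, so later steps can refer to either one by the same base name.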
### Define environment variables