Add instruction block for converting gguf file to irpa file (#924)

Add a section to `llama_serving.md` that shows users how to convert a `.gguf` file to an `.irpa` file.
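
For reviewers, here is a minimal shell sketch of the conversion step this section documents, assuming the `.irpa` file should sit next to the `.gguf` file with the same base name. The `--gguf-file` and `--save` flags are taken from the added docs; the helper names `irpa_path_for` and `convert_gguf_to_irpa` and the `DRY_RUN` variable are hypothetical and not part of sharktank.

```shell
# Sketch only: helper names and DRY_RUN are illustrative assumptions.
# The --gguf-file and --save flags are the ones shown in the added docs.

# Map /path/model.gguf -> /path/model.irpa (same directory, same base name).
irpa_path_for() {
  gguf_file="$1"
  printf '%s\n' "${gguf_file%.gguf}.irpa"
}

# Print the conversion command by default; set DRY_RUN=0 to actually run it.
convert_gguf_to_irpa() {
  gguf_file="$1"
  irpa_file="$(irpa_path_for "$gguf_file")"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "python -m sharktank.tools.dump_gguf --gguf-file $gguf_file --save $irpa_file"
  else
    python -m sharktank.tools.dump_gguf --gguf-file "$gguf_file" --save "$irpa_file"
  fi
}
```

With the default `DRY_RUN=1`, `convert_gguf_to_irpa "$EXPORT_DIR/<output_gguf_name>.gguf"` only prints the command, so the paths can be checked before running the real conversion with `DRY_RUN=0`.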
nod-ai · Feb 6, 2025 · 3cccc20
1 parent d33ef0b
Showing 1 changed file with 8 additions and 0 deletions.
diff --git a/docs/shortfin/llm/user/llama_serving.md b/docs/shortfin/llm/user/llama_serving.md
@@ -112,6 +112,14 @@ python -m sharktank.utils.hf_datasets llama3_8B_fp16 --local-dir $EXPORT_DIR
 > python3 convert_hf_to_gguf.py $WEIGHTS_DIR --outtype f16 --outfile $EXPORT_DIR/<output_gguf_name>.gguf
 > ```
 > Now this GGUF file can be used in the instructions ahead.
+>
+> If you would like to convert the model from a [`.gguf`](https://iree.dev/guides/parameters/#gguf)
+> file to a [`.irpa`](https://iree.dev/guides/parameters/#irpa) file, you can
+> use our [`sharktank.tools.dump_gguf`](https://github.com/nod-ai/shark-ai/blob/main/sharktank/sharktank/tools/dump_gguf.py)
+> script:
+> ```bash
+> python -m sharktank.tools.dump_gguf --gguf-file $EXPORT_DIR/<output_gguf_name>.gguf --save $EXPORT_DIR/<output_irpa_name>.irpa
+> ```
 
 ### Define environment variables