
Offline model loading fails when the model was previously downloaded via llama-server #23

1-ashraful-islam opened this issue Feb 4, 2025 · 1 comment


@1-ashraful-islam

Issue Description:

When attempting to launch llama-server without an internet connection, it fails with a timeout error while making a GET request to Hugging Face, despite the model having been previously downloaded using the same command.

Steps to Reproduce:

  1. Run the following command while connected to the internet to download and use the model:

     llama-server \
         -hf ggml-org/Qwen2.5-Coder-3B-Q8_0-GGUF \
         --port 8012 -ngl 99 -fa -ub 1024 -b 1024 \
         --ctx-size 0 --cache-reuse 256

  2. Disconnect from the internet.
  3. Run the same command again.
  4. The command fails, appearing to time out on a GET request to Hugging Face, even though the model was already downloaded by the same command.

Expected Behavior:

If the model was previously downloaded, llama-server should detect it in the local cache and load it without requiring an internet connection.

Actual Behavior:

The command times out while trying to reach Hugging Face, preventing offline usage.

I would appreciate any guidance on how to enforce offline usage, or a workaround to bypass this issue! 🚀

If this is not a solved issue and you would like me to open it in the llama.cpp repository instead, let me know.

@ggerganov
Member

You can use the -m /path/to/model.gguf flag instead of -hf. This will skip the curl queries to the remote server.
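For example, the original command can be rewritten to load the local file directly. This is a sketch under the assumption that the model was cached under `~/.cache/llama.cpp/` (the default download location for `-hf` on Linux/macOS); the exact cached filename will differ, so adjust the glob or path to match what is actually on disk:

```shell
# Pick the previously downloaded GGUF from llama.cpp's cache directory
# (~/.cache/llama.cpp is an assumed default; adjust if your cache lives elsewhere).
MODEL="$(ls ~/.cache/llama.cpp/*.gguf | head -n 1)"

# Load the model from disk with -m instead of -hf, so no HTTP request
# to Hugging Face is attempted and the server starts fully offline.
llama-server \
    -m "$MODEL" \
    --port 8012 -ngl 99 -fa -ub 1024 -b 1024 \
    --ctx-size 0 --cache-reuse 256
```

All other flags (`--port`, `-ngl`, `-fa`, etc.) behave the same; only the model-source flag changes.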
