
Offline model loading fails when the model was previously downloaded via llama-server #23

1-ashraful-islam opened this issue Feb 4, 2025 · 1 comment


@1-ashraful-islam

Issue Description:

When attempting to launch llama-server without an internet connection, it fails with a timeout error while making a GET request to Hugging Face, despite the model having been previously downloaded using the same command.

Steps to Reproduce:

  1. Run the following command while connected to the internet to download and use the model:

     llama-server \
         -hf ggml-org/Qwen2.5-Coder-3B-Q8_0-GGUF \
         --port 8012 -ngl 99 -fa -ub 1024 -b 1024 \
         --ctx-size 0 --cache-reuse 256

  2. Disconnect from the internet.
  3. Run the same command again.
  4. The command fails, appearing to time out on a GET request to Hugging Face, even though the model was already downloaded by the same command.

Expected Behavior:

If the model was previously downloaded, llama-server should detect it in the local cache and load it without requiring an internet connection.

Actual Behavior:

The command times out while trying to reach Hugging Face, preventing offline usage.

I would appreciate any guidance on how to enforce offline usage, or a workaround to bypass this issue! 🚀

If this is not a solved issue and you would like me to open it in the llama.cpp repository instead, let me know.

@ggerganov
Member

You can use the -m /path/to/model.gguf flag instead of -hf. This will skip the curl queries to the remote server.
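For example, the original command can be rewritten to load the local file directly. This is a sketch under the assumption that the model was cached under `~/.cache/llama.cpp/` (the default download location for `-hf` on Linux/macOS); the exact cached filename will differ, so adjust the glob or path to match what is actually on disk:

```shell
# Pick the previously downloaded GGUF from llama.cpp's cache directory
# (~/.cache/llama.cpp is an assumed default; adjust if your cache lives elsewhere).
MODEL="$(ls ~/.cache/llama.cpp/*.gguf | head -n 1)"

# Load the model from disk with -m instead of -hf, so no HTTP request
# to Hugging Face is attempted and the server starts fully offline.
llama-server \
    -m "$MODEL" \
    --port 8012 -ngl 99 -fa -ub 1024 -b 1024 \
    --ctx-size 0 --cache-reuse 256
```

All other flags (`--port`, `-ngl`, `-fa`, etc.) behave the same; only the model-source flag changes.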
