llama.cpp unresponsive for 20 seconds

I'm trying to use gpt-llama.cpp to run Auto-GPT. As a test before hooking it up to Auto-GPT, I tried it with Chatbot-UI. However, gpt-llama.cpp locks up with `LLAMA.CPP UNRESPONSIVE FOR 20 SECS. ATTEMPTING TO RESUME GENERATION` every time the LLM finishes its response. I'm using [gpt4-x-alpaca-13B-GGML](https://huggingface.co/TheBloke/gpt4-x-alpaca-13B-GGML/blob/main/gpt4-x-alpaca-13b.ggmlv3.q3_K_L.bin), which I converted to GGUF with the tools in llama.cpp. Run directly through llama.cpp, the model works fine (albeit not the smartest). What can I do to solve this issue?

llama.cpp unresponsive for 20 seconds #62