ServeurpersoCom commented Oct 26, 2025

  • Add no-cache headers to /props and /slots
  • Throttle slot checks to 30s
  • Prevent concurrent fetches with promise guard
  • Trigger refresh from chat streaming for legacy and ModelSelector
  • Show dynamic serverWarning when using cached data
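
For readers who want a concrete picture of the throttle and the promise guard, here is a minimal sketch. It is not the code from this PR: the class name `ThrottledRefresher`, its methods, and the injectable clock are hypothetical and exist only for illustration. Only the 30 s interval comes from the list above.

```typescript
// Hypothetical helper, not taken from the PR. It combines two ideas from the
// list above: a minimum interval between fetches and a guard that makes
// concurrent callers share a single in-flight request.
type Fetcher<T> = () => Promise<T>;

class ThrottledRefresher<T> {
  private fetcher: Fetcher<T>;
  private minIntervalMs: number;
  private now: () => number;
  private inFlight: Promise<T> | null = null;
  private lastFetchAt: number | null = null;
  private cached: T | null = null;

  constructor(
    fetcher: Fetcher<T>,
    minIntervalMs: number = 30_000, // 30 s, as in the PR description
    now: () => number = Date.now,   // injectable clock (assumption, eases testing)
  ) {
    this.fetcher = fetcher;
    this.minIntervalMs = minIntervalMs;
    this.now = now;
  }

  // True while the last successful fetch is younger than the interval.
  isThrottled(): boolean {
    return (
      this.lastFetchAt !== null &&
      this.now() - this.lastFetchAt < this.minIntervalMs
    );
  }

  refresh(): Promise<T> {
    // Promise guard: reuse the request that is already running.
    if (this.inFlight !== null) return this.inFlight;

    // Throttle: serve the cached value instead of hitting the server again.
    if (this.isThrottled() && this.cached !== null) {
      return Promise.resolve(this.cached);
    }

    const request = this.fetcher()
      .then((value) => {
        this.cached = value;
        this.lastFetchAt = this.now();
        return value;
      })
      .finally(() => {
        // Clear the guard on success and on failure so later calls can retry.
        this.inFlight = null;
      });

    this.inFlight = request;
    return request;
  }
}

// Possible usage (assumed endpoint handling, not the PR's actual code):
//   const props = new ThrottledRefresher(() =>
//     fetch('/props', { cache: 'no-store' }).then((r) => r.json()));
//   await props.refresh();
```

A failed fetch does not update the timestamp, so the next call retries immediately rather than waiting out the interval. Whether the PR behaves the same way on errors is not stated here.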

Closes #16771

Testing video (Raspberry Pi 5, master branch plus this PR):

Legacy.mp4

Command lines used for the legacy test (Raspberry Pi 5):

./build/bin/llama-server \
 -m /root/ia/models/mradermacher/OLMoE-1B-7B-0125-Instruct-i1-GGUF/OLMoE-1B-7B-0125-Instruct.i1-Q6_K.gguf \
 -ctk q8_0 -ctv q8_0 -fa on \
 --jinja --ctx-size 8192 --mlock --port 8081

./build/bin/llama-server \
 -m /root/ia/models/mradermacher/Qwen3-30B-A3B-Instruct-2507-i1-GGUF/Qwen3-30B-A3B-Instruct-2507.i1-Q4_K_M.gguf \
 --temp 0.7 --top-p 0.8 --top-k 20 --min-p 0 \
 -ctk q8_0 -ctv q8_0 -fa on \
 --jinja --ctx-size 4096 --port 8081

./build/bin/llama-server \
 -m /root/ia/models/lmstudio-community/gpt-oss-20b-GGUF/gpt-oss-20b-MXFP4.gguf \
 -ctk q8_0 -ctv q8_0 -fa on \
 --jinja --ctx-size 4096 --port 8081

ModelSelector enabled (same UI refresh behavior on first chunk):

ModelSelector.mp4

ServeurpersoCom force-pushed the webui-props-auto-refresh branch from 2e9e0eb to 9397c4d on October 26, 2025 15:06
Development

Successfully merging this pull request may close these issues.

Misc. bug: WebUI incorrectly displays local model names