
UPSTREAM PR #19374: WebUI hide models in router mode#1156

Open
loci-dev wants to merge 2 commits into main from loci/pr-19374-webui-hide-model

Conversation


@loci-dev loci-dev commented Feb 7, 2026

Note

Source pull request: ggml-org/llama.cpp#19374

When using llama-server with presets in router mode, this adds the ability to completely hide certain models from the WebUI.

Summary

Adding no-webui = true to a model's preset configuration will exclude it from the WebUI.

Why?

I tend to use the preset capability to load a group of models for offline vibe coding, but I don't want my embedding, reranking, and FIM models to be selectable via the WebUI. I also wanted the WebUI to automatically select a model when only one remains after filtering. I chose to reuse the no-webui argument for this, so any model in the preset file with no-webui = true is excluded from the WebUI model selection list. This also means that the excluded models cannot be loaded or unloaded via the WebUI.
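The filtering and auto-select behavior described above can be sketched as follows. This is a minimal illustration, not the actual WebUI code; the field names (`id`, `no_webui`) are assumptions about the model metadata shape.

```python
# Sketch of the described behavior: models flagged no-webui = true are
# excluded from the selection list, and if exactly one model remains it
# is selected automatically. Field names are illustrative assumptions.

def visible_models(models):
    """Exclude any model whose preset sets no-webui = true."""
    return [m for m in models if not m.get("no_webui", False)]

def pick_default(models):
    """Auto-select when exactly one model remains after filtering."""
    shown = visible_models(models)
    return shown[0]["id"] if len(shown) == 1 else None

models = [
    {"id": "gpt-oss-120b", "no_webui": False},
    {"id": "Qwen2.5-Coder-7B-Q8_0", "no_webui": True},
    {"id": "Qwen3-Embedding-0.6B", "no_webui": True},
    {"id": "Qwen3-Reranker-0.6B", "no_webui": True},
]
print([m["id"] for m in visible_models(models)])  # ['gpt-oss-120b']
print(pick_default(models))                       # gpt-oss-120b
```

With the example preset below, only gpt-oss-120b would appear in the selector and would be picked automatically.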

Example:

models.ini

```ini
[*]
ngl = 999
threads = -1

; chat, tools
[gpt-oss-120b]
hf = ggml-org/gpt-oss-120b-GGUF
load-on-startup = true
jinja = true
flash-attn = 1
ubatch-size = 2048
batch-size = 32768
parallel = 2
; ctx-size = 131072*params.n_parallel
ctx-size = 262144
temp = 1.0
;min-p = 0.0
min-p = 0.01
top-p = 1.0
top-k = 0.0
; --- Speculative
spec-type = ngram-mod
spec-ngram-size-n = 24
draft-min = 48
draft-max = 64

; FIM
[Qwen2.5-Coder-7B-Q8_0]
load-on-startup = true
hf = ggml-org/Qwen2.5-Coder-7B-Q8_0-GGUF:Q8_0
flash-attn = 1
ubatch-size = 1024
batch-size = 1024
ctx-size = 0
cache-reuse = 256
parallel = 2
; ctx-size = 32768*params.n_parallel
ctx-size = 65536
no-webui = true

; embedding
[Qwen3-Embedding-0.6B]
load-on-startup = true
hf = Qwen/Qwen3-Embedding-0.6B-GGUF:Q8_0
flash-attn = 1
embedding = true
pooling = last
ubatch-size = 8192
verbose-prompt = true
parallel = 2
; ctx-size = 8192*params.n_parallel
ctx-size = 16384
no-webui = true

; reranking
[Qwen3-Reranker-0.6B]
load-on-startup = true
hf = ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF:Q8_0
flash-attn = 1
rerank = true
parallel = 2
; ctx-size = 4096*params.n_parallel
ctx-size = 8192
no-webui = true
```

And now run it:

```shell
llama-server --offline --log-colors on --log-prefix --log-timestamps --no-models-autoload --models-preset models.ini
```

Note: I haven't tested the ini file as written above. I pre-downloaded the models and use model = modelfile.gguf instead of the listed hf = org/model. Aside from that, this ini file was what I tested against.
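The `[*]` section in the preset above appears to act as shared defaults applied to every model section. Under that assumption, the merge semantics can be sketched with Python's standard-library configparser, which supports a custom default-section name; this is only an illustration of the inheritance idea, not llama-server's actual preset parser.

```python
import configparser

# Treat "[*]" as the default section whose keys are inherited by every
# model section (an assumption about the preset format, for illustration).
ini = """
[*]
ngl = 999
threads = -1

[Qwen3-Embedding-0.6B]
no-webui = true
ctx-size = 16384
"""

cfg = configparser.ConfigParser(default_section="*")
cfg.read_string(ini)

section = cfg["Qwen3-Embedding-0.6B"]
print(section["ngl"])                  # 999, inherited from [*]
print(section.getboolean("no-webui"))  # True, set per model
```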


loci-review bot commented Feb 7, 2026

No meaningful performance changes were detected across 115474 analyzed functions in the following binaries: build.bin.libllama.so, build.bin.llama-cvector-generator, build.bin.llama-tts, build.bin.libmtmd.so, build.bin.llama-tokenize, build.bin.llama-quantize, build.bin.llama-qwen2vl-cli, build.bin.llama-gemma3-cli, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli, build.bin.libggml-base.so, build.bin.libggml-cpu.so, build.bin.libggml.so, build.bin.llama-bench.

🔎 Full breakdown: Loci Inspector.
💬 Questions? Tag @loci-dev.

@loci-dev loci-dev force-pushed the main branch 10 times, most recently from 823244c to bab7d39 Compare February 19, 2026 02:17
@loci-dev loci-dev force-pushed the main branch 10 times, most recently from a92fe2a to 6495042 Compare February 27, 2026 02:17
@loci-dev loci-dev force-pushed the main branch 6 times, most recently from 1d064d0 to 504cad7 Compare March 4, 2026 02:17
@loci-dev loci-dev force-pushed the main branch 2 times, most recently from 9f4f332 to 4298c74 Compare March 6, 2026 02:17