fix: ensure recipe_options.json values are applied at model load time #1540
ramkrishna2910 wants to merge 1 commit into main
Per-model recipe options (e.g. ctx_size) saved in recipe_options.json were not reliably reaching the llama-server subprocess. The existing code in build_cache() merges recipe_options by model name, but due to cache timing the merged values were sometimes not present in model_info.recipe_options when Router::load_model() resolved the effective options.

This fix adds a direct lookup of recipe_options.json in Router::load_model() as a safety net, ensuring saved per-model options always take effect regardless of cache state.

Changes:
- Add ModelManager::get_saved_recipe_options() to expose the raw recipe_options_ map for a given model name
- In Router::load_model(), look up saved options directly and merge them into the options resolution chain
- Promote the effective settings log from DEBUG to INFO for easier debugging of option resolution

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
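As a rough illustration of the direct lookup described above, here is a minimal sketch of a per-model accessor over a raw saved-options map. The `SavedOptionsStore` class and the string-to-string `Options` alias are hypothetical stand-ins, not the project's actual `ModelManager` or `RecipeOptions` types; only the method name `get_saved_recipe_options` and the member name `recipe_options_` come from the description.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical simplification: options as flat string key/value pairs.
using Options = std::map<std::string, std::string>;

// Hypothetical stand-in for the part of ModelManager that holds the
// contents of recipe_options.json, keyed by model name.
class SavedOptionsStore {
public:
    explicit SavedOptionsStore(std::map<std::string, Options> saved)
        : recipe_options_(std::move(saved)) {}

    // Returns the saved options for model_name, or an empty map when the
    // model has no entry, so callers can test the result with .empty().
    Options get_saved_recipe_options(const std::string& model_name) const {
        auto it = recipe_options_.find(model_name);
        return it != recipe_options_.end() ? it->second : Options{};
    }

private:
    std::map<std::string, Options> recipe_options_;
};
```

Returning an empty map for unknown models keeps the call site free of exceptions and null checks.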
Sounds like one of the fixes in #1412.
#1412 should merge soon, so @ramkrishna2910 can you look at whether this is a duplicate?
This PR fixes a timing-specific race condition that PR #1412 doesn't explicitly address: the scenario where the router reads model_info.recipe_options before build_cache() has finished merging. I was running into this during some of the vLLM testing, so this is likely an addition on top of #1412.
Thanks for looking into this. recipe_options reliability has definitely been a pain point. (Claude and Codex working on behalf of Ian) traced through the code paths to understand how this interacts with the changes in #1412, and wanted to share the findings in case there's something missing about the reproduction scenario.

The current merge path (with #1412's In

The same function is called from

The load path:

So by the time the router sees

On the stale-data angle: (Claude and Codex working on behalf of Ian) also considered whether this could be a staleness fix, e.g.,

Potential regression with the replacement logic: There's also a concern with the replacement semantics in the new router code:

```cpp
RecipeOptions saved_opt = model_info.recipe_options;
auto saved_json = model_manager_->get_saved_recipe_options(model_name);
if (!saved_json.empty()) {
    saved_opt = RecipeOptions(model_info.recipe, saved_json);
}
```

When

What (Claude and Codex working on behalf of Ian) might be missing: Is the vllm testing scenario using a code path that bypasses
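To make the replacement-semantics question concrete, the sketch below contrasts replacing the cached options wholesale with merging saved values over them. It only illustrates how the two strategies can differ in general; the `Options` alias, both function names, and the `--example-flag` value in the usage note are hypothetical, and the sketch makes no claim about how the project's `RecipeOptions` constructor actually behaves.

```cpp
#include <map>
#include <string>

// Hypothetical simplification: options as flat string key/value pairs.
using Options = std::map<std::string, std::string>;

// Replacement: if anything was saved, the cached options are discarded
// entirely, including keys that the saved map does not mention.
Options replace_if_saved(const Options& cached, const Options& saved) {
    return saved.empty() ? cached : saved;
}

// Merge: start from the cached options and overwrite only the keys that
// appear in the saved map, so unrelated cached keys survive.
Options merge_saved_over_cached(const Options& cached, const Options& saved) {
    Options merged = cached;
    for (const auto& [key, value] : saved) {
        merged[key] = value;
    }
    return merged;
}
```

With cached options `{ctx_size: 4096, llamacpp_args: --example-flag}` and saved options `{ctx_size: 16384}`, replacement drops `llamacpp_args`, while the merge keeps it and still applies the saved `ctx_size`.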
Summary
Per-model recipe options (e.g. `ctx_size`) saved in `~/.cache/lemonade/recipe_options.json` were not reliably reaching the llama-server subprocess. Models would always launch with the default `ctx_size=4096` even when `recipe_options.json` specified a different value.

Root Cause
The existing code in `ModelManager::build_cache()` merges recipe options by model name (line 890), but due to cache timing the merged values were sometimes not present in `model_info.recipe_options` when `Router::load_model()` resolved the effective options. This meant the llama-server subprocess was always launched with the code default of `ctx_size=4096`.

Reproduction
- Set `ctx_size` in `~/.cache/lemonade/recipe_options.json`: `{"Qwen3.5-35B-A3B-GGUF": {"ctx_size": 16384}}`
- `--ctx-size` will show `4096` instead of `16384`

Fix
Added a direct lookup of `recipe_options.json` in `Router::load_model()` as a safety net. This ensures saved per-model options always take effect regardless of cache state.

Changes
- Add `get_saved_recipe_options(model_name)` to expose the raw recipe_options_ map

The options resolution order remains: load-time overrides > per-model saved options > global defaults.
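The resolution order stated above can be sketched as a layered merge in which later layers overwrite earlier ones. This is a minimal illustration under the assumption that options are flat key/value pairs; the `Options` alias and `resolve_effective_options` are hypothetical names, not the project's API.

```cpp
#include <map>
#include <string>

// Hypothetical simplification: options as flat string key/value pairs.
using Options = std::map<std::string, std::string>;

// Applies the precedence
//   load-time overrides > per-model saved options > global defaults
// by starting from the lowest-priority layer and overwriting upward.
Options resolve_effective_options(const Options& load_time_overrides,
                                  const Options& saved_per_model,
                                  const Options& global_defaults) {
    Options effective = global_defaults;  // lowest priority
    for (const auto& [key, value] : saved_per_model) {
        effective[key] = value;  // saved values override defaults
    }
    for (const auto& [key, value] : load_time_overrides) {
        effective[key] = value;  // explicit load-time values win
    }
    return effective;
}
```

With a default `ctx_size` of 4096 and a saved value of 16384, the function yields 16384 unless a load-time override supplies a different value.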
Verification
After this fix, the llama-server subprocess correctly receives `--ctx-size 16384` when set in `recipe_options.json`:

Test plan
- Set `ctx_size` in `recipe_options.json` for a model, verify it appears in llama-server subprocess args
- `ctx_size=4096` still applies when no override is set
- `llamacpp_args` from recipe_options.json also flows through

🤖 Generated with Claude Code
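One way to automate the first test-plan item is to inspect the argument list that would be passed to the subprocess and read back the value following `--ctx-size`. The helper below is a hypothetical sketch; how the argument vector is captured from the launcher is left as an assumption.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Returns the argument that follows `flag` in `args`, or std::nullopt if
// the flag is absent or is the last element (no value after it).
std::optional<std::string> find_flag_value(const std::vector<std::string>& args,
                                           const std::string& flag) {
    for (std::size_t i = 0; i + 1 < args.size(); ++i) {
        if (args[i] == flag) {
            return args[i + 1];
        }
    }
    return std::nullopt;
}
```

A test could then compare `find_flag_value(args, "--ctx-size")` against the value saved in `recipe_options.json`.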