fix(sessions): hit /api/session/new fast path on cold start (#2518 follow-up)#3410
Conversation
66ca780 to
3727e4a
Compare
|
Reading the client diff at 1. Dead term in the fallback chainreqBody.model_provider=newModelState.model_provider||null||(window._activeProvider||null)||(S.session&&S.session.model_provider)||null;The middle 2. Edge case:
|
…llback Closes the open follow-up from nesquena#2518 - addresses the cross-provider regression flagged in PR nesquena#3410 review: when the persisted state carries a stale foreign-slug model (e.g. "gemini/gemini-2.5") from a session served by a different provider than the now-active one, the window._activeProvider fallback would attach the wrong provider and let _resolve_compatible_session_model_state's fast path pass it through without consulting the catalog - silently re-pointing the new session at the wrong backend. The new client guard wraps the active-provider fallback in a _bareModel ternary (rejects '/' and '@' prefixes) so slash-qualified and @-qualified models keep model_provider=null on the wire and the slow path's cross-provider normalization still runs. Also drops a vestigial mid-chain '||null' no-op. Adds 6 regression tests in test_issue2518_active_provider_fallback.py (test_slash_qualified_model_keeps_active_provider_behind_guard, test_at_qualified_model_also_keeps_active_provider_behind_guard, test_explicit_picker_provider_still_wins, test_no_op_null_terminal_in_fallback_chain, test_slash_slug_keeps_provider_null_in_wire_shape, test_bare_model_uses_active_provider_when_no_picker). Behavior-contract assertions, not source-string pins, so future refactors of the same contract still satisfy them. Builds on nesquena#2528 (in-flight guard) and nesquena#1855 (fast path). PR body draft: docs/pr-media/2518/PR_BODY.md
|
Thanks for the careful read of the client/server contract — both points are well taken. Pushed a fix on top of 1. Dead 2. Cross-provider slash-slug guard — went with the client-side fix as you suggested. The active-provider fallback is now gated behind a const _bareModel = !/[/]/.test(newModelState.model) && !newModelState.model.startsWith('@');
reqBody.model_provider = newModelState.model_provider
|| (_bareModel ? (window._activeProvider || (S.session && S.session.model_provider)) : null)
|| null;Behavior:
Regression coverage. Added a new
Verification: Also re-ran with Python 3.12 (the CI version): same 54/54. Did not touch any of the pre-existing test files. Server-side follow-up, separate PR. I'm leaving the general "slug-prefix provider ≠ requested_provider → fall to slow path" generalization of Happy to adjust if you'd prefer the server-side fix instead, or if there's a third option I missed. |
…#2518 follow-up) The frontend in-flight guard (PR nesquena#2528) made repeated + clicks safe but left a single cold click waiting 3-4s behind get_available_models(): newSession() carried the dropdown's model_provider as reqBody.model_provider. When the dropdown option has no data-provider attribute (or its value is 'default') and the persisted state predates provider tracking, newModelState.model_provider is null. The server's fast path in _resolve_compatible_session_model_state requires both model AND a truthy model_provider; without that, the request falls into the cold catalog rebuild. The catalog warms after the first response, so subsequent clicks are fast. newSession() now falls back through a 3-step chain: 1. newModelState.model_provider (explicit picker) 2. window._activeProvider (boot-hydrated active route) 3. S.session.model_provider (previous session) Whenever a usable default exists, the request hits the server's fast path and stays out of get_available_models() entirely. The slow path remains the safety net for genuinely provider-less clients. Closes the open follow-up from nesquena#2518. Tests: - tests/test_issue2518_active_provider_fallback.py (new, 7 cases): source shape (fallback present, prev-session present, chain order, issue reference) + end-to-end (fast path on real + slow path still fires without provider). - tests/test_new_chat_default_model_frontend.py: rewrote test_new_session_posts_picker_model_before_server_default from a literal-string snapshot into a behavior-contract assertion (chain members + ordering), per AGENTS.md change-detector guidance.
…llback Closes the open follow-up from nesquena#2518 - addresses the cross-provider regression flagged in PR nesquena#3410 review: when the persisted state carries a stale foreign-slug model (e.g. "gemini/gemini-2.5") from a session served by a different provider than the now-active one, the window._activeProvider fallback would attach the wrong provider and let _resolve_compatible_session_model_state's fast path pass it through without consulting the catalog - silently re-pointing the new session at the wrong backend. The new client guard wraps the active-provider fallback in a _bareModel ternary (rejects '/' and '@' prefixes) so slash-qualified and @-qualified models keep model_provider=null on the wire and the slow path's cross-provider normalization still runs. Also drops a vestigial mid-chain '||null' no-op. Adds 6 regression tests in test_issue2518_active_provider_fallback.py (test_slash_qualified_model_keeps_active_provider_behind_guard, test_at_qualified_model_also_keeps_active_provider_behind_guard, test_explicit_picker_provider_still_wins, test_no_op_null_terminal_in_fallback_chain, test_slash_slug_keeps_provider_null_in_wire_shape, test_bare_model_uses_active_provider_when_no_picker). Behavior-contract assertions, not source-string pins, so future refactors of the same contract still satisfy them. Builds on nesquena#2528 (in-flight guard) and nesquena#1855 (fast path). PR body draft: docs/pr-media/2518/PR_BODY.md
67704a1 to
5bdfd2d
Compare
…ior contract The previous literal-string assertion 'reqBody.model_provider=newModelState.model_provider||null' pinned the fallback chain to a single-source shape that no longer matches the source after PR nesquena#3410's cross-provider slash-slug guard. The test's intent is behavioral (verify newSession() sends newModelState.model_provider first, with _activeProvider and prev-session as ordered fallbacks), but the implementation pinned syntax. Per AGENTS.md 'Don't write change-detector tests' and the behavior-contract template, replace the literal with substring + ordering checks that survive future refactors of the same contract (e.g. the _bareModel ternary gate, or a future helper function). This unblocks CI for PR nesquena#3410 (the source change in that PR is correct; the assertion just needed the same behavior-contract upgrade that the existing nesquena#2518 follow-up tests in test_issue2518_active_provider_fallback.py and test_new_chat_default_model_frontend.py already use).
|
Shipped in v0.51.238 (Release HF) via release PR #3493 — thank you @franksong2702! 🙏 Your cold-start fast-path fix (fill One guard added on the way in (Codex pre-release finding): the server fast path passes Both reviewers signed off (Codex re-reviewed → SAFE TO SHIP after the guard; Opus had judged the original acceptable). Authorship preserved via Closing as merged-via-release-stage (recommitted on the stage branch after rebase onto current master). |
Thinking Path
New Conversation button is the most-clicked affordance and must feel
immediate.
get_available_models(); PR fix: guard new conversation cold-start clicks #2528 (b76d698) added the in-flight guardthat prevents rapid duplicate clicks and surfaces a visible "creating…"
state, but the slow click itself was left for follow-up.
model_provider, so_resolve_compatible_session_model_state's fastpath (introduced by bug: /api/chat/start and /api/sessions can wedge for ~125s while health remains responsive #1855) returns immediately and the catalog
rebuild is never triggered on the new-session path.
exactly the slow-path-on-cold / fast-path-on-warm asymmetry: after this
PR the first click takes the fast path too.
What Changed
static/sessions.jsnewSession()now falls back throughwindow._activeProvider(thenS.session.model_provider) when the dropdown'sdata-provideris missing/'default', when the persisted state predates provider tracking, or when the dropdown is unhydrated at boot.tests/test_issue2518_active_provider_fallback.py(new)tests/test_new_chat_default_model_frontend.pytest_new_session_posts_picker_model_before_server_defaultrewritten from a literal-string snapshot into a behavior-contract assertion (per AGENTS.md change-detector guidance): the contract is now "reqBody.model_provider is the explicit picker value, with_activeProviderandS.session.model_provideras ordered fallbacks."CHANGELOG.md[Unreleased]Fixed entry, opening with the d5dcd60/#872 phrase "New conversations now resync…" so the existing CHANGELOG literal-snapshot test keeps passing.docs/pr-media/2518/bench.py(new)PYTHONPATH=. .venv/bin/python docs/pr-media/2518/bench.py.Why It Matters
User-visible behavior: the first + click after server boot (or after
clearing the model catalog cache) is no longer 3-4s slower than subsequent
clicks. State layer touched: the WebUI new-session request path and the
server's
_resolve_compatible_session_model_statefast path are nowactually wired together — the fast path has existed since #1855, but the
client rarely reached it because it sent
model_provider: nullwheneverthe dropdown was unhydrated or the persisted state predated provider
tracking.
The slow path is preserved as the safety net for genuinely provider-less
clients (no
_activeProvider, no previous session). The fix is purelyadditive on the client side and does not change any server contract.
Verification
Bench output (
docs/pr-media/2518/bench.py)Reading the two halves together:
OpenRouter /models, no credential refresh). The catalog rebuild is
near-instant, but the 158x speedup between the slow and fast paths
is the structural gain — fast path skips an entire function call and
the lock dance around it.
time.sleepintoget_available_models()to approximate the production scenario fromthe original New Conversation button appears unresponsive during cold model catalog resolution #2518 triage. First + click goes from ~3060 ms to
~0 ms because the patched client never reaches the catalog call at
all.
get_available_models() invocations: 0in the fast-path blockproves the contract end-to-end: when the client supplies a truthy
model_provider, the server does not touch the model catalog on thenew-session path.
Test suite
The 7 new cases in
test_issue2518_active_provider_fallback.pyare thedirect regression coverage; the other 41 cases confirm the change does
not regress #1855 (fast-path behavior on
/api/chat/startetc.), #2528(in-flight guard), or the d5dcd60/#872 picker-default-provider sync.
Manual smoke
Run
python server.py(or./ctl.sh start), open the UI, click + fivetimes. The cursor takes the
cursor:waithint on the first click only(PR #2528's busy state); subsequent clicks of the + button or Cmd+K
shortcut are deduped through the in-flight promise. The wait behind
get_available_models()is gone for any client that has a hydrated_activeProvider(which is the boot default).Risks / Follow-ups
localStoragecarriesmodel: "gpt-5.5"from a session that wasactually served by a different provider than the currently active
one, the fallback chain could pin the wrong provider on the new
session. The server's
_resolve_compatible_session_model_state(lines 1841-1930 of
api/routes.py) still runs and the slow-pathrepair branch will normalize a stale
openai/gpt-*shape onopenai-codex, so the worst case is a still-fast request thatnormalizes provider to the active route — exactly what
S.session.model_providerpreviously carried. Not a regression.hermes-webui-modellocalStorage key (no provider) now falls backthrough the new chain. The first request from a user who has never
updated their model picker still works because the server's slow path
is intact; the speedup only kicks in once the dropdown has
hydrated (i.e. from the second + click onward). The user's
reported "first slow, then fast" pattern is therefore expected to
become "always fast" from the first click onward once the picker
has been touched at least once on the current profile.
exists for genuinely provider-less clients. A separate PR can
asynchronously warm the model catalog in the background on boot so
even a fully unhydrated client gets sub-second first clicks.
await newSession()doesn't block the composer at all. The newsession is empty by definition, so the user could see a blank
composer the moment they click + while the server still does its
bookkeeping. This is a bigger UX change; deferred.
Model Used
gitfor branch/commit/push; read-only git history traversal (no
delegate_tasksub-agents were used for this change). Theimplementer read
_resolve_compatible_session_model_stateend-to-end before changing the client fallback chain so the server
contract stays intact, and ran
docs/pr-media/2518/bench.pyinboth halves (real isolated env + 3.0s monkeypatched cold rebuild)
to produce the Verification numbers above.
Cross-references
appears unresponsive during cold model catalog resolution).
conversation cold-start clicks) and the fast-path branch introduced
by bug: /api/chat/start and /api/sessions can wedge for ~125s while health remains responsive #1855 (PR bug: /api/chat/start and /api/sessions can wedge for ~125s while health remains responsive #1855 — /api/chat/start wedge on
resolve_model_provider stage).
sync) only insofar as
reqBody.model_provideris now sourced from aricher chain; the picker-→-server contract from bug(settings): Default Model preference shows incomplete list and is not applied to new chats #872 is preserved
and the existing test was upgraded from a literal-snapshot to a
behavior-contract assertion.