feat: always show reasoning effort selector, default off for unrecognized models (#3377)#3431
feat: always show reasoning effort selector, default off for unrecognized models (#3377)#3431b3nw wants to merge 4 commits into
Conversation
0789b94 to
f17400c
Compare
…ized models (nesquena#3377) Instead of hiding the thinking/reasoning effort chip entirely when a model is not recognized as reasoning-capable, always present the selector with the full effort scale available. For recognized models (GPT-5+, Claude 4/3.7+, Qwen-3+, DeepSeek, Kimi, etc.) the chip defaults to "Default" (reasoning active). For unrecognized or ambiguous models the chip defaults to "None" (reasoning off), letting users opt-in on any model. Backend: get_reasoning_status() now always returns the full VALID_REASONING_EFFORTS list in supported_efforts, plus a new reasoning_default_on flag indicating whether the model was positively identified as reasoning-capable. Frontend: _applyReasoningChip() always displays the chip; when reasoning_default_on is false and no effort is persisted, it defaults to "None". _applyReasoningOptions() shows all effort levels when the supported set is empty (error fallback).
f17400c to
14b7c27
Compare
|
Pulled the branch and read The backend change reads correctlymodel_recognized = bool(supported_efforts)
if not supported_efforts:
supported_efforts = list(VALID_REASONING_EFFORTS)
return {
...
"supported_efforts": supported_efforts,
"supports_reasoning_effort": True,
"reasoning_default_on": model_recognized,
}
Frontend matches
var defaultOn=(meta&&meta.reasoning_default_on!==undefined)?meta.reasoning_default_on:true;
if(!defaultOn&&(!effort||effort==='')){ effort='none'; }The error/catch path in One nuance:
|
b2f6201 to
2631940
Compare
|
updated the implementation, summary of changes below, preformed manual testing to validate. @nesquena-hermes Changes Made
Automated TestsVerified the changes locally using: uv run --with pytest --with pyyaml --with cryptography pytest tests/test_reasoning_effort_model_capabilities.py tests/test_reasoning_chip_btw_fixes.py
Manual Verification
|
|
Pulled the updated branch (HEAD The two
|
e0195e6 to
6918580
Compare
Changes Made - @nesquena-hermes
|
|
| Filename | Overview |
|---|---|
| api/config.py | Adds explicitly_unsupported detection and reasoning_default_on flag to get_reasoning_status; also adds a 9-provider fallback loop in _models_dev_reasoning_efforts — the double invocation creates up to 20 provider lookups per call for unrecognized models. |
| static/ui.js | Chip now always visible; adds reasoning_default_on defaulting logic and moves state mutations before the !supports early-return; error handler changed to preserve prior dropdown options on API failure — changes look correct. |
| tests/test_reasoning_effort_model_capabilities.py | Good coverage of new paths; test_get_reasoning_status_unrecognized_model_still_offers_efforts does not mock _models_dev_reasoning_efforts, creating an implicit dependency on the live implementation returning None for an unknown model. |
| tests/test_reasoning_chip_btw_fixes.py | Minor wording update to assertion message — no functional change. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["fetchReasoningChip()"] --> B["GET /api/reasoning"]
B -->|success| C["_applyReasoningChip(effort, st)"]
B -->|error| D["_applyReasoningChip with reasoning_default_on=false"]
C --> E["Update _currentReasoningEffortsSupported"]
E --> F{"reasoning_default_on false AND no effort?"}
F -->|yes| G["effort = none"]
F -->|no| H["effort = normalized value"]
G --> I{"supports_reasoning_effort?"}
H --> I
I -->|false| J["Hide chip"]
I -->|true| K["Show chip with label and dropdown"]
subgraph backend["get_reasoning_status in config.py"]
L["resolve_model_reasoning_efforts()"] --> M{"supported_efforts empty?"}
M -->|no| N["reasoning_default_on = True"]
M -->|yes| O{"ACP provider?"}
O -->|yes| P["explicitly_unsupported = True"]
O -->|no| Q["_models_dev_reasoning_efforts second call"]
Q -->|returns empty| P
Q -->|returns None| R["show chip, full list, default_on=False"]
N --> S["Return JSON"]
P --> S
R --> S
end
Reviews (1): Last reviewed commit: "Gate reasoning status metadata lookup an..." | Re-trigger Greptile
| elif resolve_model and not supported_efforts: | ||
| hinted_model = _strip_provider_hint_for_reasoning(resolve_model) | ||
| metadata_efforts = _models_dev_reasoning_efforts(hinted_model, provider) | ||
| if metadata_efforts == []: | ||
| explicitly_unsupported = True |
There was a problem hiding this comment.
Double metadata lookup on every unrecognized model check
resolve_model_reasoning_efforts already calls _models_dev_reasoning_efforts internally (line 2378) — and that function, when the initial lookup returns None, runs a fallback loop over up to 9 standard providers. When get_reasoning_status then calls _models_dev_reasoning_efforts a second time here (to distinguish "heuristic said no" from "metadata confirmed no"), the entire fallback loop fires again. If get_model_capabilities involves I/O (e.g. HTTP calls to models.dev), every get_reasoning_status invocation on an unrecognized model pays up to 20 provider lookups instead of 10. A simple fix is to pass the metadata result out of resolve_model_reasoning_efforts or memoize _models_dev_reasoning_efforts per (model, provider) within a request.
| def test_get_reasoning_status_unrecognized_model_still_offers_efforts(monkeypatch): | ||
| """Unrecognized models get the full effort list but reasoning_default_on=False.""" | ||
| monkeypatch.setattr( | ||
| cfg, | ||
| "resolve_model_reasoning_efforts", | ||
| lambda *a, **k: [], | ||
| ) | ||
| status = cfg.get_reasoning_status( | ||
| model_id="some-unknown-model", | ||
| provider_id="custom:myproxy", | ||
| ) | ||
| assert len(status["supported_efforts"]) > 0, ( | ||
| "Unrecognized models should still expose effort levels" | ||
| ) | ||
| assert status["supports_reasoning_effort"] is True | ||
| assert status["reasoning_default_on"] is False |
There was a problem hiding this comment.
_models_dev_reasoning_efforts not mocked — test can flip if catalog is reachable
The test patches resolve_model_reasoning_efforts to return [], but get_reasoning_status then calls _models_dev_reasoning_efforts directly (the new explicitly_unsupported check). If agent.models_dev is importable in the test environment and get_model_capabilities returns a capabilities object with supports_reasoning=False for "some-unknown-model", the function returns [], explicitly_unsupported becomes True, and both supports_reasoning_effort and len(supported_efforts) assertions fail. Adding monkeypatch.setattr(cfg, "_models_dev_reasoning_efforts", lambda *a, **k: None) makes the intent explicit and removes the environmental dependency.
|
Pulled the updated branch (HEAD The gate is correctif (provider in {"cursor-acp", "copilot-acp"} or model_prefix in {"cursor-acp", "copilot-acp"}):
explicitly_unsupported = True
elif resolve_model and not supported_efforts: # <-- the new guard
hinted_model = _strip_provider_hint_for_reasoning(resolve_model)
metadata_efforts = _models_dev_reasoning_efforts(hinted_model, provider)
if metadata_efforts == []:
explicitly_unsupported = TrueThe The new test actually proves it
assert status["supported_efforts"] == ["medium", "high"]
assert not called_metadata_check, "Should not query models.dev metadata since resolver returned success"The One small test-isolation note
Contract is now explicit on both sides and the quadrant coverage (recognized / unrecognized / ACP / metadata-unsupported / authoritative-disagreement) is solid. Reads merge-ready to me modulo that one test mock. |
## Release v0.51.247 — Release HO (stage-q19) Backend correctness fix. ### Fixed | Issue | Author | Fix | |-------|--------|-----| | #3505 | @franksong2702 | **Reasoning effort is coerced to a level the active model/provider actually supports** before each request, instead of being sent verbatim and rejected. `openai-codex` `gpt-5` no longer gets `max` (→ `xhigh`); `o1`/`o3`/`o4` clamp to `low`/`medium`/`high`. Coercion only steps *down* (never escalates); `none`/unset preserved. The capability filter is applied across heuristic / models.dev / Copilot / LM Studio paths. | This is the narrow, correct fix for the detection gap that #3431 tried to address by removing the chip-visibility gate (which we shelved). The chip-visibility gate is **untouched** (Codex confirmed) — `get_reasoning_status`/`_applyReasoningChip` still hide the chip for unconfirmed models. ### Review fix absorbed (Codex + self-flagged) The first cut **dropped** a configured effort for *unrecognized* models, because capability detection returns `[]` for both "known-unsupported" and "simply-unknown" (custom providers, aggregator-rewritten ids, new releases) — that's a behavior change vs master (which sent it verbatim) and would silently disable reasoning. Fixed: an **empty** capability set now **preserves** the configured effort (provider stays the final authority; worst case = the same rejected request master already produces, i.e. no regression). Known-bad clamps return *non-empty* filtered sets, so they still degrade correctly. Nathan chose this "preserve-for-unknown" behavior. + regression test. ### Gate - Full pytest suite: **7548 passed, 0 failed** - ruff: CLEAN · 48 reasoning tests pass (incl. preserve-for-unknown + codex-clamp + never-escalate) - Codex (regression): SHIP-ONLY-WITH-FIXES (unknown-model drop) → fixed → **SAFE TO SHIP** - Verified empirically: gpt-5/codex max→xhigh, o3 max/xhigh→high, unknown high→high (preserved), none/unset preserved Co-authored-by: franksong2702 <franksong2702@users.noreply.github.com>
feat: always show reasoning effort selector, default off for unrecognized models (#3377)
Improvement on the #3379 which fixed #3377.
Thinking Path
claude-sonnet-4-6:free), and new model releases would silently hide the selector even though the user might want to enable reasoning.What Changed
1. Backend: Always return full effort list with
reasoning_default_onflagIn
api/config.py(get_reasoning_status):resolve_model_reasoning_effortsreturns an empty list (unrecognized model), fall back to the fullVALID_REASONING_EFFORTSlist instead of[].supports_reasoning_effortis now alwaysTrue.reasoning_default_on:Truewhen the model was positively identified as reasoning-capable,Falseotherwise.2. Frontend: Always show chip, default to "None" for unrecognized models
In
static/ui.js:_applyReasoningChip()no longer hides the chip whensupported_effortsis empty. The chip is always displayed.reasoning_default_onisFalseand no effort has been explicitly set by the user, the chip defaults to "None" (inactive state)._applyReasoningOptions()now shows all effort levels in the dropdown when the supported set is empty (previously hid them all).fetchReasoningChip()error handler defaults toreasoning_default_on: falseso the chip remains functional even on API errors.3. Expanded Test Coverage
In
tests/test_reasoning_effort_model_capabilities.py:test_get_reasoning_status_includes_supported_effortsto assertreasoning_default_onisTrue.test_get_reasoning_status_unrecognized_model_still_offers_efforts: verifies that unrecognized models get the full effort list withreasoning_default_on=False.test_get_reasoning_status_recognized_model_default_on: verifies that recognized models getreasoning_default_on=True.4. Changelog Documentation
CHANGELOG.mdunder## [Unreleased].Why It Matters
Previously, the heuristic-based hide/show logic was a constant source of false negatives: every new model release, custom provider configuration, or aggregator-rewritten model ID risked hiding the reasoning selector. This PR eliminates that class of bugs entirely by never hiding the chip. Users can always opt into reasoning, and the default is informed but not restrictive.
Verification
Automated tests
Ran the pytest suite targeting reasoning effort model capabilities:
Result: 28 passed successfully.
Risks / Follow-ups
agent.reasoning_effortconfig is profile-scoped, not model-scoped.supports_reasoning_effortis now alwaysTrue, which changes the API contract. No known consumers rely on this boolean for chip visibility (the frontend usesreasoning_default_on), but third-party integrations should be aware.AI Usage Disclosure