Fix: ChatEngine never actually throws 503 — every routing failure becomes 404 by mimeding · Pull Request #7 · mimeding/osaurus

mimeding · 2026-05-27T04:12:25Z

Summary

Why this matters (business)

The error model in ChatEngine was already designed with two distinct outcomes — "you asked for a model nobody has" (404) and "we know how to serve this model, but the backend isn't healthy right now" (503). Client SDKs, the WorkView error classifier, and any future status-aware UI rely on those codes to give the right next action ("install the model" vs "retry in a moment" vs "check your network").

In practice, the 503 branch was unreachable code. Every routing failure — even ones where Foundation Models is genuinely available on the user's Mac but isAvailable() is reporting false at this moment — surfaced as 404. Users see "Model 'foundation' is not installed or registered with any provider" and reach for the model picker when they should just be retrying.

What's wrong (technical)

ChatEngine.EngineError declares both cases:

    struct EngineError: Error, LocalizedError {
        enum Kind {
            case modelNotFound(requested: String)
            case noServiceAvailable(requested: String)
        }
        ...
        var httpStatus: Int {
            switch kind {
            case .modelNotFound: return 404
            case .noServiceAvailable: return 503
            }
        }
    }

…but the ModelServiceRouter.resolve it consumes only returns .service or .none. The .none branch in ChatEngine always throws .modelNotFound:

        case .none:
            throw EngineError(kind: .modelNotFound(requested: request.model))

So .noServiceAvailable was dead code: it existed but nothing ever produced it.

Fix

Three small changes, one new test file:

ModelServiceRouter.resolve now returns one of three routes:
- .service — at least one candidate handles the model and reports isAvailable().
- .unavailable(requestedModel:) — at least one candidate handles(model:) but every such service answers isAvailable() == false.
- .none — no candidate handles the model at all.
Both ChatEngine switch sites (streaming and non-streaming) now throw EngineError(.noServiceAvailable) for .unavailable. noServiceAvailable.httpStatus is already wired to 503, and PR #6 ("Fix: extend ChatEngine HTTP-status mapping to Anthropic /messages and Open Responses /responses") propagates that to the wire response on the Anthropic / Open Responses endpoints; /v1/chat/completions already used it.
CoreModelService collapses .unavailable and .none to its existing CoreModelError.modelUnavailable. The public CoreModelError API doesn't currently distinguish them, and the API-facing distinction is already covered via ChatEngine.
New focused unit tests (ModelServiceRouterTests.swift) cover:
- .service returned when the handler is available.
- .none returned when no service claims the model.
- .unavailable returned when a handler exists but reports unavailable.
- Two candidates, one online and one offline — router picks the online one.
- The same outcomes for remote-only services and for the default-model path.
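The three-way routing rule described above can be sketched as follows. This is a minimal illustration, not the actual OsaurusCore code: the `ModelService` protocol shape, the `StubService` type, the free function `resolve(requestedModel:candidates:)`, and the payload of `.service` are all assumptions made for the example. Only the route names and the `handles(model:)` / `isAvailable()` calls come from the description.

```swift
// Hypothetical minimal service abstraction; the real protocol may differ.
protocol ModelService {
    var id: String { get }
    func handles(model: String) -> Bool
    func isAvailable() -> Bool
}

// Route names follow the PR description; the .service payload is an assumption.
enum ModelRoute {
    case service(any ModelService)
    case unavailable(requestedModel: String)
    case none
}

// Hypothetical stub used only to exercise the routing rule.
struct StubService: ModelService {
    let id: String
    let models: Set<String>
    let available: Bool
    func handles(model: String) -> Bool { models.contains(model) }
    func isAvailable() -> Bool { available }
}

func resolve(requestedModel: String, candidates: [any ModelService]) -> ModelRoute {
    // Services that claim the model, regardless of current health.
    let handlers = candidates.filter { $0.handles(model: requestedModel) }
    // Nobody claims the model at all.
    if handlers.isEmpty { return .none }
    // Prefer the first handler that is healthy right now.
    if let online = handlers.first(where: { $0.isAvailable() }) {
        return .service(online)
    }
    // Someone claims the model, but every such service is unavailable.
    return .unavailable(requestedModel: requestedModel)
}
```

Filtering on `handles(model:)` first and checking `isAvailable()` second is what lets the router tell "unknown model" apart from "known model, unhealthy backend".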

Known limitation (called out in the commit message and in `resolve`'s docstring)

This fixes the case where a local service reports !isAvailable(). It does not yet fix the case where a remote provider is configured but disconnected, because callers (e.g. ChatEngine) pass RemoteProviderManager.shared.connectedServices() — disconnected remotes aren't in the array at all, so the router never sees them. Distinguishing that case would need RemoteProviderManager to expose a configured-but-disconnected list, which is a separate, larger change.
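A small sketch may make the limitation concrete. Everything here is hypothetical (the `RemoteProvider` struct, its `isConnected` flag, the simplified `Route` enum, and the free functions); it only mirrors the described behavior that callers hand the router the connected list, so a disconnected remote never reaches it.

```swift
// Hypothetical stand-in for a configured remote provider.
struct RemoteProvider {
    let name: String
    let models: Set<String>
    let isConnected: Bool
}

// Simplified route type for this example only.
enum Route {
    case service(String)
    case unavailable(requestedModel: String)
    case none
}

// Mirrors the call-site behavior: only connected providers are passed on.
func connectedServices(_ configured: [RemoteProvider]) -> [RemoteProvider] {
    configured.filter { $0.isConnected }
}

func resolve(_ model: String, candidates: [RemoteProvider]) -> Route {
    let handlers = candidates.filter { $0.models.contains(model) }
    guard !handlers.isEmpty else { return .none }
    if let first = handlers.first(where: { $0.isConnected }) {
        return .service(first.name)
    }
    return .unavailable(requestedModel: model)
}

func label(_ route: Route) -> String {
    switch route {
    case .service(let name): return "service(\(name))"
    case .unavailable: return "unavailable"
    case .none: return "none"
    }
}

let configured = [
    RemoteProvider(name: "remote-a", models: ["some-remote-model"], isConnected: false)
]

// What the router sees today: the disconnected remote was filtered out first.
let visible = resolve("some-remote-model", candidates: connectedServices(configured))

// What it could report if it were given the full configured list.
let ideal = resolve("some-remote-model", candidates: configured)
```

Here `visible` is `.none` (still a 404), while `ideal` is `.unavailable`, which is the distinction that would require the configured-but-disconnected list.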

Changes

Behavior change (additive — new enum case; existing .service/.none consumers behave identically)
UI change
Refactor / chore
Tests (new ModelServiceRouterTests)
Docs

Test Plan

cd Packages/OsaurusCore && swift test --filter ModelServiceRouterTests should pass.
Manually: simulate Foundation Models unavailability (e.g. on a Mac that doesn't support it) and call /v1/chat/completions with "model":"foundation". Expected: 503 + noServiceAvailable error body. Previously: 404 + modelNotFound.
Call with a genuinely unknown model id ("model":"zzz"). Expected: 404 + modelNotFound. Unchanged.
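The expected statuses in the test plan follow from how a switch site maps routes to errors. The sketch below is an assumption-laden illustration: `serviceName(for:requested:)` and `status(for:requested:)` are hypothetical helpers standing in for one ChatEngine switch site and the HTTP layer, and `ModelRoute` is reduced to a string payload. The `EngineError` cases and their 404/503 mapping are taken from the description.

```swift
// Simplified route type for this example; the real .service payload differs.
enum ModelRoute {
    case service(name: String)
    case unavailable(requestedModel: String)
    case none
}

struct EngineError: Error {
    enum Kind {
        case modelNotFound(requested: String)
        case noServiceAvailable(requested: String)
    }
    let kind: Kind
    var httpStatus: Int {
        switch kind {
        case .modelNotFound: return 404
        case .noServiceAvailable: return 503
        }
    }
}

// Hypothetical helper standing in for one ChatEngine switch site.
func serviceName(for route: ModelRoute, requested: String) throws -> String {
    switch route {
    case .service(let name):
        return name
    case .unavailable:
        // A handler exists but is unhealthy: retryable, so 503.
        throw EngineError(kind: .noServiceAvailable(requested: requested))
    case .none:
        // Nobody claims the model: 404.
        throw EngineError(kind: .modelNotFound(requested: requested))
    }
}

// Hypothetical stand-in for the HTTP layer turning the outcome into a status.
func status(for route: ModelRoute, requested: String) -> Int {
    do {
        _ = try serviceName(for: route, requested: requested)
        return 200
    } catch let error as EngineError {
        return error.httpStatus
    } catch {
        return 500
    }
}
```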

Checklist

I have read CONTRIBUTING.md
I added/updated tests where reasonable
I updated docs/README as needed (n/a — internal routing change)
I verified build on macOS with Xcode 16.4+ (authored in a Linux sandbox; verified each touched file via swiftc -frontend -parse)

ChatEngine.EngineError already declared two cases that map to different HTTP statuses:

    case modelNotFound -> 404
    case noServiceAvailable -> 503

…but the producer (the two switch sites in ChatEngine on ModelServiceRouter.resolve and the one in CoreModelService) only ever threw modelNotFound. The 503 case was dead code, and every routing failure (including ones where a service that handles the model existed but reported isAvailable() == false) surfaced as 404.

This contradicts the audit's expectation set by PR osaurus-ai#863 / issue osaurus-ai#858: the WorkView error classifier and external API consumers use the HTTP status code to give users actionable feedback. 'install the model' vs 'service unavailable, retry' is exactly the distinction we couldn't make.

Extend ModelRoute with a third case, .unavailable, that the router returns when at least one candidate service answers handles(model:) == true but every such service answers isAvailable() == false. The two ChatEngine switch sites now throw .noServiceAvailable for .unavailable, so the existing 503 path actually fires. CoreModelService collapses both to its existing modelUnavailable error since its public API doesn't currently distinguish them (callers needing the distinction go through ChatEngine).

Limitation worth calling out: the router only sees the connected remote-provider list (callers pass connectedServices() at the call site). A configured-but-disconnected remote provider is therefore not visible to the router, and its absence still resolves to .none (404). Fixing that requires RemoteProviderManager to expose the configured-but-disconnected services list, which is a separate change.

New focused unit tests cover all three outcomes (service / unavailable / none) plus the multi-service-some-unavailable case (router prefers the available service).

Co-authored-by: Michael Meding <mimeding@users.noreply.github.com>

ModelManager.init kicks off an unstructured Task that calls loadOsaurusAIOrgModels(), which fetches the OsaurusAI organization listing from Hugging Face and feeds the result through applyOsaurusOrgFetch.

The unit-test runner repeatedly constructs ModelManager() to drive applyOsaurusOrgFetch directly. The background launch-time fetch races with those test calls: whichever finishes last wins, and the merge result is non-deterministic. That's the root cause of the flaky ModelManagerSuggestedTests failures seen across many of the recent PR CI runs (applyOsaurusOrgFetch_dropsStaleAutoFetchedOnReapply, applyOsaurusOrgFetch_addsNewEntriesAfterCurated, etc.).

Gate the launch-time fetch on a small isRunningInTestEnvironment helper that checks for any of XCTestConfigurationFilePath, XCTestBundlePath, or XCTestSessionIdentifier in the process environment. Those variables are only present inside an xctest host process; production app launches still get the HF fetch exactly as before.

This is a network call, so removing it under tests also has the side benefit of making the test suite work offline / on hermetic CI runners.

Co-authored-by: Michael Meding <mimeding@users.noreply.github.com>
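A helper along these lines would implement the gate described in this commit message. This is a sketch under assumptions: the three environment variable names come from the message, but the function signature (an injectable `environment` parameter) and the `LaunchFetchGate` class are hypothetical and exist only to make the example self-contained; the real helper in ModelManager may look different.

```swift
import Foundation

// Variable names are from the commit message; the function shape is an assumption.
func isRunningInTestEnvironment(
    environment: [String: String] = ProcessInfo.processInfo.environment
) -> Bool {
    let markers = [
        "XCTestConfigurationFilePath",
        "XCTestBundlePath",
        "XCTestSessionIdentifier",
    ]
    // True if any one of the marker variables is present.
    return markers.contains { environment[$0] != nil }
}

// Hypothetical stand-in for the launch-time gate in ModelManager.init.
final class LaunchFetchGate {
    private(set) var didScheduleFetch = false

    init(environment: [String: String] = ProcessInfo.processInfo.environment) {
        if !isRunningInTestEnvironment(environment: environment) {
            // The real code would start the background fetch Task here.
            didScheduleFetch = true
        }
    }
}
```

Taking the environment as a parameter with a default keeps production call sites unchanged while letting the check itself be exercised with a plain dictionary.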

mimeding mentioned this pull request May 27, 2026

Fix flaky ModelManagerSuggestedTests: skip launch-time HF fetch under xctest #16


mimeding wants to merge 2 commits into main from cursor/chat-engine-noservice-503-2812


2 participants
