Fix: extend ChatEngine HTTP-status mapping to Anthropic /messages and Open Responses /responses by mimeding · Pull Request #6 · mimeding/osaurus

mimeding · 2026-05-27T04:06:42Z

Summary

Why this matters (business)

Osaurus exposes the same local model behind three OpenAI-style endpoints (/v1/chat/completions, Anthropic-compatible /messages, OpenAI Responses-API /responses). PR osaurus-ai#863 taught one of them — /v1/chat/completions — to return the right HTTP status for ChatEngine errors:

404 when the requested model isn't installed,
503 when the remote provider that handles it is unavailable,
500 for true server faults.

The matching Anthropic and Open Responses handlers, which run on the same ChatEngine, kept returning blanket 500 Internal Server Error for every failure mode. That has three concrete user-visible effects:

Client SDKs (Anthropic SDK, Continue, Cline, OpenAI Responses clients) treat the error as a server outage and trigger backoff/retry — when the right answer is "tell the user the model isn't installed."
The WorkView in-app error classifier uses the HTTP status to decide whether to show "model not installed" guidance vs "service unavailable" vs "internal error." On these two endpoints it always picked the catch-all 5xx message.
Operators looking at the Insights tab see every misconfigured request logged as 500, which makes the dashboard look like the server is on fire.

This PR brings API parity to the three endpoints.

What's wrong (technical)

The /v1/chat/completions non-streaming handler already maps ChatEngine.EngineError -> httpStatus:

                    if let engineError = error as? ChatEngine.EngineError {
                        status = HTTPResponseStatus(statusCode: engineError.httpStatus)
                        body = engineError.errorDescription ?? error.localizedDescription
                    } else {
                        status = .internalServerError

Anthropic non-stream /messages did not:

            } catch {
                let errorResp = AnthropicError(message: error.localizedDescription, errorType: "api_error")
                ...
                hop {
                    var responseHead = HTTPResponseHead(version: head.version, status: .internalServerError)

Open Responses non-stream /responses did not:

            } catch {
                let errorResp = OpenResponsesErrorResponse(code: "api_error", message: error.localizedDescription)
                ...
                hop {
                    var responseHead = HTTPResponseHead(version: head.version, status: .internalServerError)

ChatEngine.EngineError already exposes httpStatus: Int and a human-readable errorDescription. Both handlers need to consume it.

Fix

Both non-streaming error paths now:

Translate EngineError to the right HTTP status on the wire (Int(httpStatus)).
Pick the provider-native error-type/code based on whether the status is a 4xx or a 5xx — Anthropic uses invalid_request_error for client errors and api_error for server faults; Open Responses uses the same pair under its code field.
Log responseStatus: Int(status.code) (matching PR Fix: /chat/completions logs status 500 even when the wire status is 404/503 #2) so the Insights row matches the wire response.

Errors that are not ChatEngine.EngineError (network failure, decoder failure, anything thrown by tool execution) still fall through to .internalServerError. Wire body shape is unchanged — same AnthropicError / OpenResponsesErrorResponse types, same JSON keys; only the values of type / code and the HTTP head change.

Scope decisions

Streaming /v1/chat/completions / /messages / /responses paths were also flagged in the audit (they always log 500 even though the wire status is the committed-200 of the SSE head). They have a real architectural quirk — once SSE is committed, you can't change the status — and need a separate, focused discussion. Left untouched here.
Ollama /chat and /api/chat follow the same SSE-committed pattern and are also out of scope.

Changes

Behavior change (status codes on Anthropic / Open Responses error paths now match /v1/chat/completions)
UI change
Refactor / chore
Tests (no existing test fixture covers HTTP-level error mapping for these endpoints; the change tracks an existing pattern)
Docs

Test Plan

# 1. Unknown model: should be 404 on both endpoints.
curl -s -o /tmp/body -w '%{http_code}\n' \
  http://127.0.0.1:1337/messages \
  -H 'Content-Type: application/json' \
  -d '{"model":"does-not-exist","messages":[{"role":"user","content":"hi"}]}'

curl -s -o /tmp/body -w '%{http_code}\n' \
  http://127.0.0.1:1337/responses \
  -H 'Content-Type: application/json' \
  -d '{"model":"does-not-exist","input":"hi"}'

# 2. Disconnect a remote provider, then call with one of its models.
#    Both endpoints should respond 503.

# 3. Insights tab should now show 404 / 503 in the status column;
#    previously these requests all rendered 500.

Checklist

I have read CONTRIBUTING.md
I added/updated tests where reasonable (see scope decisions)
I updated docs/README as needed (n/a — wire body shape and JSON keys are unchanged)
I verified build on macOS with Xcode 16.4+ (authored in a Linux sandbox; verified via swiftc -frontend -parse of the modified file)

The non-streaming /v1/chat/completions endpoint was already taught (PR osaurus-ai#863) to translate ChatEngine.EngineError into its intended HTTP status: 404 when the requested model isn't installed, 503 when the provider that handles it is unavailable, etc. The matching Anthropic /messages and Open Responses /responses endpoints, both of which run the same ChatEngine, were still hardcoded to 500 on every error. That meant API clients calling those endpoints couldn't distinguish 'install the model' from 'restart the local server' from 'the remote provider is offline,' and the WorkView error classifier — which uses the HTTP status to decide what to show the user — defaulted to the catch-all 5xx message. Mirror the /v1/chat/completions pattern in both handlers: * Translate EngineError -> httpStatus on the wire response head. * Pick the appropriate provider-native error_type/code (Anthropic's invalid_request_error vs api_error, Open Responses' two-tier code field) based on whether the status is 4xx or 5xx. * Log responseStatus: Int(status.code) instead of hardcoded 500 so the Insights tab matches the wire response. Other failure modes (network errors, decoder errors) still fall through to .internalServerError exactly as before; only EngineError gets the new mapping. The wire body shape is unchanged — same AnthropicError / OpenResponsesErrorResponse types, same JSON keys. Co-authored-by: Michael Meding <mimeding@users.noreply.github.com>

ModelManager.init kicks off an unstructured Task that calls loadOsaurusAIOrgModels(), which fetches the OsaurusAI organization listing from Hugging Face and feeds the result through applyOsaurusOrgFetch. The unit-test runner repeatedly constructs ModelManager() to drive applyOsaurusOrgFetch directly. The background launch-time fetch races with those test calls — whichever finishes last wins, and the merge result is non-deterministic. That's the root cause of the flaky ModelManagerSuggestedTests failures seen across many of the recent PR CI runs (applyOsaurusOrgFetch_dropsStaleAutoFetched OnReapply, applyOsaurusOrgFetch_addsNewEntriesAfterCurated, etc.). Gate the launch-time fetch on a small isRunningInTestEnvironment helper that checks for any of XCTestConfigurationFilePath, XCTestBundlePath, or XCTestSessionIdentifier in the process environment. Those variables are only present inside an xctest host process; production app launches still get the HF fetch exactly as before. This is a network call, so removing it under tests also has the side benefit of making the test suite work offline / on hermetic CI runners. Co-authored-by: Michael Meding <mimeding@users.noreply.github.com>

This was referenced May 27, 2026

Fix: ChatEngine never actually throws 503 — every routing failure becomes 404 #7

Draft

Fix flaky ModelManagerSuggestedTests: skip launch-time HF fetch under xctest #16

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix: extend ChatEngine HTTP-status mapping to Anthropic /messages and Open Responses /responses#6

Fix: extend ChatEngine HTTP-status mapping to Anthropic /messages and Open Responses /responses#6
mimeding wants to merge 2 commits into
mainfrom
cursor/fix-other-endpoints-log-status-2812

mimeding commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

mimeding commented May 27, 2026

Summary

Why this matters (business)

What's wrong (technical)

Fix

Scope decisions

Changes

Test Plan

Checklist

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants