Skip to content

Fix: extend ChatEngine HTTP-status mapping to Anthropic /messages and Open Responses /responses#6

Draft
mimeding wants to merge 2 commits into
mainfrom
cursor/fix-other-endpoints-log-status-2812
Draft

Fix: extend ChatEngine HTTP-status mapping to Anthropic /messages and Open Responses /responses#6
mimeding wants to merge 2 commits into
mainfrom
cursor/fix-other-endpoints-log-status-2812

Conversation

@mimeding

Copy link
Copy Markdown
Owner

Summary

Why this matters (business)

Osaurus exposes the same local model behind three OpenAI-style endpoints (/v1/chat/completions, Anthropic-compatible /messages, OpenAI Responses-API /responses). PR osaurus-ai#863 taught one of them — /v1/chat/completions — to return the right HTTP status for ChatEngine errors:

  • 404 when the requested model isn't installed,
  • 503 when the remote provider that handles it is unavailable,
  • 500 for true server faults.

The matching Anthropic and Open Responses handlers, which run on the same ChatEngine, kept returning blanket 500 Internal Server Error for every failure mode. That has three concrete user-visible effects:

  1. Client SDKs (Anthropic SDK, Continue, Cline, OpenAI Responses clients) treat the error as a server outage and trigger backoff/retry — when the right answer is "tell the user the model isn't installed."
  2. The WorkView in-app error classifier uses the HTTP status to decide whether to show "model not installed" guidance vs "service unavailable" vs "internal error." On these two endpoints it always picked the catch-all 5xx message.
  3. Operators looking at the Insights tab see every misconfigured request logged as 500, which makes the dashboard look like the server is on fire.

This PR brings API parity to the three endpoints.

What's wrong (technical)

The /v1/chat/completions non-streaming handler already maps ChatEngine.EngineError -> httpStatus:

                    if let engineError = error as? ChatEngine.EngineError {
                        status = HTTPResponseStatus(statusCode: engineError.httpStatus)
                        body = engineError.errorDescription ?? error.localizedDescription
                    } else {
                        status = .internalServerError

Anthropic non-stream /messages did not:

            } catch {
                let errorResp = AnthropicError(message: error.localizedDescription, errorType: "api_error")
                ...
                hop {
                    var responseHead = HTTPResponseHead(version: head.version, status: .internalServerError)

Open Responses non-stream /responses did not:

            } catch {
                let errorResp = OpenResponsesErrorResponse(code: "api_error", message: error.localizedDescription)
                ...
                hop {
                    var responseHead = HTTPResponseHead(version: head.version, status: .internalServerError)

ChatEngine.EngineError already exposes httpStatus: Int and a human-readable errorDescription. Both handlers need to consume it.

Fix

Both non-streaming error paths now:

  1. Translate EngineError to the right HTTP status on the wire (Int(httpStatus)).
  2. Pick the provider-native error-type/code based on whether the status is a 4xx or a 5xx — Anthropic uses invalid_request_error for client errors and api_error for server faults; Open Responses uses the same pair under its code field.
  3. Log responseStatus: Int(status.code) (matching PR Fix: /chat/completions logs status 500 even when the wire status is 404/503 #2) so the Insights row matches the wire response.

Errors that are not ChatEngine.EngineError (network failure, decoder failure, anything thrown by tool execution) still fall through to .internalServerError. Wire body shape is unchanged — same AnthropicError / OpenResponsesErrorResponse types, same JSON keys; only the values of type / code and the HTTP head change.

Scope decisions

  • Streaming /v1/chat/completions / /messages / /responses paths were also flagged in the audit (they always log 500 even though the wire status is the committed-200 of the SSE head). They have a real architectural quirk — once SSE is committed, you can't change the status — and need a separate, focused discussion. Left untouched here.
  • Ollama /chat and /api/chat follow the same SSE-committed pattern and are also out of scope.

Changes

  • Behavior change (status codes on Anthropic / Open Responses error paths now match /v1/chat/completions)
  • UI change
  • Refactor / chore
  • Tests (no existing test fixture covers HTTP-level error mapping for these endpoints; the change tracks an existing pattern)
  • Docs

Test Plan

# 1. Unknown model: should be 404 on both endpoints.
curl -s -o /tmp/body -w '%{http_code}\n' \
  http://127.0.0.1:1337/messages \
  -H 'Content-Type: application/json' \
  -d '{"model":"does-not-exist","messages":[{"role":"user","content":"hi"}]}'

curl -s -o /tmp/body -w '%{http_code}\n' \
  http://127.0.0.1:1337/responses \
  -H 'Content-Type: application/json' \
  -d '{"model":"does-not-exist","input":"hi"}'

# 2. Disconnect a remote provider, then call with one of its models.
#    Both endpoints should respond 503.

# 3. Insights tab should now show 404 / 503 in the status column;
#    previously these requests all rendered 500.

Checklist

  • I have read CONTRIBUTING.md
  • I added/updated tests where reasonable (see scope decisions)
  • I updated docs/README as needed (n/a — wire body shape and JSON keys are unchanged)
  • I verified build on macOS with Xcode 16.4+ (authored in a Linux sandbox; verified via swiftc -frontend -parse of the modified file)
Open in Web Open in Cursor 

The non-streaming /v1/chat/completions endpoint was already taught
(PR osaurus-ai#863) to translate ChatEngine.EngineError into its intended HTTP
status: 404 when the requested model isn't installed, 503 when the
provider that handles it is unavailable, etc. The matching Anthropic
/messages and Open Responses /responses endpoints, both of which run
the same ChatEngine, were still hardcoded to 500 on every error.

That meant API clients calling those endpoints couldn't distinguish
'install the model' from 'restart the local server' from 'the remote
provider is offline,' and the WorkView error classifier — which uses
the HTTP status to decide what to show the user — defaulted to the
catch-all 5xx message.

Mirror the /v1/chat/completions pattern in both handlers:
  * Translate EngineError -> httpStatus on the wire response head.
  * Pick the appropriate provider-native error_type/code (Anthropic's
    invalid_request_error vs api_error, Open Responses' two-tier
    code field) based on whether the status is 4xx or 5xx.
  * Log responseStatus: Int(status.code) instead of hardcoded 500
    so the Insights tab matches the wire response.

Other failure modes (network errors, decoder errors) still fall
through to .internalServerError exactly as before; only EngineError
gets the new mapping. The wire body shape is unchanged — same
AnthropicError / OpenResponsesErrorResponse types, same JSON keys.

Co-authored-by: Michael Meding <mimeding@users.noreply.github.com>
ModelManager.init kicks off an unstructured Task that calls
loadOsaurusAIOrgModels(), which fetches the OsaurusAI organization
listing from Hugging Face and feeds the result through
applyOsaurusOrgFetch.

The unit-test runner repeatedly constructs ModelManager() to drive
applyOsaurusOrgFetch directly. The background launch-time fetch
races with those test calls — whichever finishes last wins, and
the merge result is non-deterministic. That's the root cause of
the flaky ModelManagerSuggestedTests failures seen across many of
the recent PR CI runs (applyOsaurusOrgFetch_dropsStaleAutoFetched
OnReapply, applyOsaurusOrgFetch_addsNewEntriesAfterCurated, etc.).

Gate the launch-time fetch on a small isRunningInTestEnvironment
helper that checks for any of XCTestConfigurationFilePath,
XCTestBundlePath, or XCTestSessionIdentifier in the process
environment. Those variables are only present inside an xctest host
process; production app launches still get the HF fetch exactly as
before.

This is a network call, so removing it under tests also has the
side benefit of making the test suite work offline / on hermetic
CI runners.

Co-authored-by: Michael Meding <mimeding@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants