feat(ai): AI Engine — provider-agnostic LLM gateway + 8 connectors by thewrz · Pull Request #354 · wrzonance/WrzDJ

thewrz · 2026-05-25T19:56:43Z

Summary

Integration PR for the AI Engine Back-end Redesign epic. Replaces the single hardcoded Anthropic recommendation path with a provider-agnostic, per-DJ LLM gateway.

What lands

Gateway core (services/llm/): connector registry, adapter ABC, tool-translation layer, encrypted credential storage, audit + call logging, gateway dispatch (per-DJ MRU → org default).
8 connector types: openai_apikey, anthropic_apikey, openai_compatible (MVP feat(ai): provider-agnostic LLM gateway + 3 connectors + recommendation migration #348) + gemini_apikey (feat(ai): add Gemini provider adapter #351), xai_apikey (feat(ai): add xAI Grok provider adapter #350), openrouter_apikey (feat(ai): add OpenRouter provider adapter #352), azure_openai (feat(ai): add Azure OpenAI provider adapter #349), bedrock (feat(ai): add AWS Bedrock provider adapter #353).
Recommendation migration: recommendation engine now routes through the gateway (legacy env-var Anthropic path kept as fallback).
Admin UI (/admin/ai) + DJ UI (/settings/ai) for connector policy / connect / test / rotate / delete.
Migration 046_admin_ai_oauth (no per-provider columns added by the 5 adapter PRs — all gated by the generic llm_apikey_connectors_enabled flag; provider config in the encrypted credentials blob).

Provider notes

xAI / OpenRouter subclass the OpenAI-compatible adapter (fixed base URLs).
Gemini uses the native function_declarations tool schema.
Azure OpenAI uses per-deployment URLs + api-key header (with URL-injection hardening).
Bedrock signs SigV4 manually over httpx (no boto3), per-family tool translation (Anthropic / Llama).

Closes #329 #330 #331 #332 #334

Test plan

Full GitHub CI green (this PR finally triggers ci.yml — epic sub-PRs did not)
Local: backend ruff/format/bandit/pytest (~87% cov), frontend lint/tsc/vitest, alembic single-head + no drift
Manual: connect + test each connector type from /settings/ai; run a recommendation through a non-Anthropic connector
Admin: toggle llm_apikey_connectors_enabled, confirm gating

Summary by CodeRabbit

Release Notes

New Features
- Added LLM Gateway system supporting multiple AI providers (OpenAI, Anthropic, Gemini, Azure OpenAI, AWS Bedrock, OpenRouter, xAI) with per-DJ connector management.
- Introduced per-feature AI model preferences for recommendations and set builder.
- Added admin oversight dashboard with LLM policy controls, usage reporting, and audit trails.
- Implemented monthly token caps to control per-DJ AI usage.
- Added background health monitoring for AI connectors with DJ notifications.
- Moved DJ AI provider settings from /settings/ai to account page.
Bug Fixes
- Fixed SSE connection pool leak preventing proper resource cleanup.
Documentation
- Updated configuration guides and added LLM plugin adapter documentation.

Verification status (added on review)

OpenAI (openai_apikey) — verified live end-to-end on the testing server: connector test green, recommendation generated via gpt-5.4-mini, confirmed in llm_call_log.
Gemini / xAI / OpenRouter / Bedrock — implemented + full unit matrix (happy/401/429/5xx/timeout/malformed; Bedrock SigV4 checked against AWS's published test vector), reviewed, merged. Not yet exercised against live provider endpoints, but their acceptance criteria are met at the unit/integration level.
Azure OpenAI ([AI Engine] Add Azure OpenAI provider adapter #333) — implemented + unit-tested, but its acceptance criterion is "works against a real Azure OpenAI test deployment", which has not been performed. [AI Engine] Add Azure OpenAI provider adapter #333 intentionally left OUT of the Closes list until live Azure verification is done.

Follow-ups (not blocking this PR)

[AI Engine] DJ-readable LLM connector policy endpoint (fail-closed settings UI) #355 — DJ-readable LLM policy endpoint so /settings/ai can fail closed (from CodeRabbit review thread).
SSE event stream pins a pooled DB connection for the whole stream lifetime → pool exhaustion under guest load #356 — SSE stream pins a pooled DB connection for its lifetime → connection-pool exhaustion under guest load.

Sub-PRs target the long-lived epic/ai-engine branch, not main. Final epic->main merge happens once after end-to-end validation.

…on migration (#348) * feat(ai): provider-agnostic LLM gateway + 3 connectors + recommendation migration Implements the MVP of the AI Engine back-end redesign per docs/superpowers/specs/2026-05-24-admin-ai-oauth-design.md. Ships: - services/llm/ gateway with per-DJ MRU connector resolution + org default fallback - Adapters: openai_apikey (httpx), openai_compatible (Hermes/Ollama/etc), anthropic_apikey (SDK) - Models: LlmConnector (encrypted credentials), LlmCallLog (counts-only telemetry), LlmAuditEvent (credential lifecycle audit trail) - Alembic migration 046 — creates 3 tables, adds 3 system_settings columns, data-migrates ANTHROPIC_API_KEY env var into an "anthropic_apikey" connector owned by the first admin - REST API: /api/llm/connectors (per-DJ CRUD + test) + /api/admin/llm/* (policy, force-revoke, usage) - DJ UI: /settings/ai (connect/test/delete + Hermes ChatGPT-subscription onboarding link) - Admin UI: /admin/ai gains connector policy + per-DJ table + usage rollup - Recommendation engine routes through the gateway (actor=event.created_by, purpose="recommendation"); legacy direct-Anthropic path retained as fallback for callers without db/actor Security: - All credentials encrypted at rest via EncryptedText (Fernet) - Custom OpenAI-compatible URLs validated (HTTPS any host, HTTP loopback + RFC1918 only, no embedded creds, no query/fragment) - Sanitised error responses — no provider error bodies, stack traces, or bearer values leak - Audit events for every credential lifecycle event (create, rotate, delete, admin-revoke, auth-invalid) - Rate limited: create 5/min, test 10/min, rotate credentials 5/min - No new dependencies (httpx + anthropic already in tree, no openai SDK, no litellm) Tests: 104 new backend tests (gateway, adapters, tool translation, URL validation, API CRUD, recommendation regression), frontend tests for both new pages, +4 api client tests. Total backend coverage: 86.95% (above 85% threshold). Closes #329 * fix(llm): address CodeRabbit review on PR #348 Backend correctness: - llm_audit_event.target_connector_id FK now ondelete=SET NULL (migration + model) so audited connectors can be deleted without FK violations - gateway MRU resolver uses nulls_last() so never-used connectors sort after recently-used ones on PostgreSQL - tool_translation rejects Anthropic tool_use blocks missing id/name instead of emitting "None" tool calls - _httpx_openai flattens dict-style content blocks (was dropping their text) - connector_storage.update_metadata enforces the same 80-char model_hint bound as create - llm.py update path maps unique-constraint rename collisions to 409 (was 500) - admin_llm PATCH treats explicit null llm_default_connector_id as clear - registry.register_adapter raises on conflicting duplicate registration - is_llm_available(db, actor) mirrors gateway resolver semantics; callers in events.py pass the event owner as actor - schemas enforce per-type credential requirements (ConnectorCreate) and a non-empty rotation payload (ConnectorCredentialsRotate); openapi.json regenerated Frontend: - DJ AI settings: block "Add provider" when policy disables all types; render create failures with error styling instead of success - admin AI page: Promise.allSettled so one failing gateway request no longer blanks the others - api-types derives LlmConnectorType/Status from generated schema Docs + tests: - CLAUDE.md / .env.example clarify ANTHROPIC_API_KEY deprecation + org-default connector semantics - add regression tests for tool_use validation and URL validator IPv6/edge cases Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds an xAI Grok connector type (xai_apikey) to the LLM gateway. xAI exposes an OpenAI-compatible /chat/completions surface, so the adapter subclasses OpenAICompatibleAdapter with a fixed base_url (https://api.x.ai/v1) and api_key credential extraction, inheriting the OpenAI-compatible request/response and tool-use translation logic. - Reuses the generic llm_apikey_connectors_enabled admin flag (no per-provider column, no Alembic migration). - Stores the api_key in the encrypted credentials blob. - xAI key-format validation (xai- prefix). - Adapter test matrix: happy path, 401, 429, 5xx, timeout, malformed response, tool-use via inherited path, plus connector CRUD + policy gating through the API. Closes #331

* feat(ai): add OpenRouter LLM provider adapter Add an openrouter_apikey connector type that routes to many upstream models via a single OpenRouter API key. Subclasses OpenAICompatibleAdapter with a fixed base_url (https://openrouter.ai/api/v1) and surfaces the stored api_key as the Bearer token. Adds a model-hint dropdown backed by a cached (hourly TTL) fetch of OpenRouter's public /models catalogue, exposed via a minimal GET /api/llm/openrouter/models endpoint. Reuses the generic llm_apikey_connectors_enabled admin flag (no per-provider flag) and stores credentials in the encrypted creds blob — no DB columns and no Alembic migration. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ai): include OpenRouter in admin connector-policy copy The admin "Allow API-key connectors" label still listed only "OpenAI, Anthropic" after openrouter_apikey was added as a supported connector type. Update the copy to include OpenRouter so the policy toggle messaging matches the available connectors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ai): add AWS Bedrock LLM provider adapter Adds a "bedrock" connector type to the provider-agnostic LLM gateway so DJs can route AI features to their own AWS Bedrock account (AWS-billed inference). - New adapter (services/llm/adapters/bedrock.py) signs InvokeModel requests with AWS SigV4 implemented manually over the existing httpx client — no boto3 or any new dependency (per the CLAUDE.md CVE/dependency rule). - Dependency-free SigV4 signer (services/llm/sigv4.py), pinned against AWS's published signing-key test vector. - Per-family tool translation keyed off aws_model_id: anthropic.* reuses the Anthropic tool schema/parser; meta.*/llama* uses prompt-embedded tools with JSON tool-call parsing (added to tool_translation.py). - Config (aws_region, aws_model_id) + creds (aws_access_key_id, aws_secret_access_key) live in the encrypted credentials blob — NO new DB columns and NO Alembic migration. - Reuses the generic llm_apikey_connectors_enabled admin flag (no per-provider flag). Creds rotate via the existing PUT /credentials route (partial rotation supported). - Frontend /settings/ai gains a Bedrock option (gated like other api-key connectors) with the four AWS credential inputs. Closes #334 * fix(llm): address CodeRabbit review on Bedrock adapter - bedrock: pass empty set (not None) for tool_names when no tools are configured, so generated {"name","input"} JSON is never misclassified as a tool call (would clear text + flip stop_reason to tool_use) - ai settings page: null model_hint for Bedrock so a stale hint left over from a prior connector-type selection is never posted - tests: assert aws_secret_access_key round-trips on create and aws_model_id is preserved during partial Bedrock credential rotation

* feat(ai): add Azure OpenAI LLM provider adapter Closes #333. Adds an azure_openai connector type to the LLM gateway. Azure uses a per-deployment endpoint and the api-key auth header (not Authorization: Bearer), so the adapter builds its own URL + headers while reusing the shared OpenAI request-body shaping and response parsing. All Azure config (resource name, deployment name, api version) plus the api key live in the encrypted credentials blob — no new DB columns, no migration. Admins can rotate any subset (resource/deployment/version or just the key) via the existing PUT /credentials route without recreating the connector. Gated by the generic llm_apikey_connectors_enabled policy flag (no per-provider flag). * fix(ai): harden Azure connector input validation (CodeRabbit) Address three CodeRabbit findings on the Azure OpenAI adapter: - _build_azure_endpoint now validates resource/deployment/api-version and URL-encodes path/query components, preventing URL-component injection (authority/path/query rewrite) from a malformed blob. - Pydantic create/rotate validators reject whitespace-only credential values via a shared _provided() helper. - Azure rotation distinguishes None (field omitted, keep current) from "" (explicit empty), passing empties to _build_azure_creds so they are rejected instead of silently preserving the old value. Adds regression tests for endpoint hardening, whitespace rejection, and empty-string rotation rejection.

* feat(ai): add Gemini provider adapter Add a native Google Gemini (generativelanguage API) connector to the LLM gateway alongside OpenAI/Anthropic. Implements the LlmAdapter ABC with function_declarations tool translation, header-based API-key auth, and the full error matrix (401/403, 429, 5xx, timeout, malformed). Reuses the generic llm_apikey_connectors_enabled admin flag — no per-provider flag, no migration. Credentials stored in the encrypted blob. Frontend surfaces the Gemini option when api-key connectors are enabled. Closes #330 * fix(llm): harden Gemini adapter + drop key-shaped test literals Address CodeRabbit review on PR #351: - Flatten dict-based content blocks in Gemini message translation so list content like {"text": "..."} is no longer silently dropped. - Normalize malformed Gemini functionCall args to ToolTranslationError instead of raw TypeError/ValueError from dict(...). - Replace AIza-shaped GCP key literals in tests with non-secret placeholders / runtime-built valid-shape keys to stop secret scanners. - Document gemini_apikey credential semantics in ConnectorCreate docstring and regenerate openapi.json. - Make admin AI policy checkbox label generic (provider-agnostic). - Add regression tests for list-content flattening and non-object args. * fix(llm): guard non-object Gemini functionCall payload CodeRabbit follow-up on PR #351: a truthy non-dict functionCall value (e.g. a string/list) would reach fn.get("name") and raise a raw AttributeError, bubbling as a 500. Normalize to ToolTranslationError, matching the args-handling pattern. Regression test added.

coderabbitai · 2026-05-25T19:57:49Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds an LLM gateway with adapters, DB models/migrations, admin/DJ APIs, streaming tests, health monitor, quotas, and moves recommendations to use the gateway. Frontend gains API types/clients and admin/DJ UIs. OpenAPI regenerated, SSE endpoint pool-safety fixed, and docs/env updated.

Changes

LLM Gateway + Admin/DJ Management

Layer / File(s)	Summary
End-to-end gateway, adapters, APIs, UI, tests `server/app/services/llm/...`, `server/app/api/...`, `server/app/models/...`, `dashboard/lib/...`, `dashboard/components/...`, `dashboard/app/...`, `server/alembic/...`, `server/openapi.json`, `server/tests/`, `docs/`	Implements provider-agnostic LLM gateway and adapters, schema/migrations, admin/DJ endpoints and UIs, streaming and health monitoring, monthly caps, and routes recommendations via the gateway; updates OpenAPI, tests, and docs.

Sequence Diagram(s)

sequenceDiagram
  participant DJ Client
  participant Dashboard API
  participant Gateway
  participant Adapter
  participant Provider
  participant DB
  DJ Client->>Dashboard API: Create/Test connector / Request recommendation
  Dashboard API->>Gateway: dispatch(ChatRequest, actor, purpose)
  Gateway->>DB: resolve connector (prefs/default/MRU), enforce monthly cap
  Gateway->>Adapter: chat()/stream()
  Adapter->>Provider: Provider-specific call
  Provider-->>Adapter: Response/Stream chunks
  Adapter-->>Gateway: ChatResponse/Chunks (+usage)
  Gateway->>DB: log_call, audit, status/health updates
  Gateway-->>Dashboard API: Result/stream
  Dashboard API-->>DJ Client: UI updates/CSV/usage

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

wrzonance/WrzDJ#348: Earlier gateway/connector + recommendation migration foundation touched the same gateway, migrations, and endpoints.
wrzonance/WrzDJ#379: Shares the streaming gateway path and /api/llm/connectors/{id}/stream-test SSE client/server wiring.
wrzonance/WrzDJ#377: Implements monthly token caps that this PR enforces/admin-manages across backend and admin UI.

Poem

A rabbit taps code in moonlit byte,
Gateways hum, models take flight.
Keys kept safe, audits aglow,
Streams like rivers softly flow.
DJs pick paths, defaults align—
Caps guard costs, all is fine.
Thump-thump: shipped on rabbit time! 🐇✨

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch epic/ai-engine

coderabbitai

Actionable comments posted: 16

🧹 Nitpick comments (5)

server/tests/test_llm_gateway.py (1)

194-231: ⚡ Quick win

This test currently validates system-context fallback, not admin-actor fallback.

Line 230 calls Gateway.dispatch(..., None, ...), so the admin-actor path described by the test isn’t exercised. Consider dispatching with admin_user_actor (or splitting into two tests).

🎯 Minimal adjustment

-        resp = await Gateway.dispatch(db, None, gateway_request, purpose="test")
+        resp = await Gateway.dispatch(db, admin_user_actor, gateway_request, purpose="test")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/tests/test_llm_gateway.py` around lines 194 - 231, The test
test_falls_back_to_system_default is calling Gateway.dispatch with None as the
actor, so it exercises system-default fallback rather than the admin-actor
fallback described; update the call to Gateway.dispatch to pass admin_user_actor
(or split into two tests: one that calls Gateway.dispatch(db, admin_user_actor,
gateway_request, ...) to verify admin-specific connector behavior and another
that calls Gateway.dispatch(db, None, gateway_request, ...) to verify
system-default fallback) so the admin-actor path is actually exercised.

server/tests/test_llm_tool_translation.py (1)

66-69: ⚡ Quick win

Add missing invalid-force test for Anthropic tool translation.

to_anthropic_tools has a guarded error branch for unknown forced tools, but this suite doesn’t currently assert it (unlike OpenAI/Gemini). Add a pytest.raises(ToolTranslationError) case for "does_not_exist" to prevent silent regressions.

Proposed test addition

 class TestAnthropicTools:
@@
     def test_force_tool(self):
         tools, choice = to_anthropic_tools([TOOL], "rank_recommendations")
         assert choice == {"type": "tool", "name": "rank_recommendations"}
+
+    def test_force_tool_not_in_list(self):
+        with pytest.raises(ToolTranslationError):
+            to_anthropic_tools([TOOL], "does_not_exist")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/tests/test_llm_tool_translation.py` around lines 66 - 69, Add a
negative test inside the test_force_tool case to exercise the guarded error
branch in to_anthropic_tools: call to_anthropic_tools([TOOL], "does_not_exist")
inside a pytest.raises(ToolTranslationError) context and assert that it raises;
keep the existing valid forced-tool assertion (to_anthropic_tools([TOOL],
"rank_recommendations") -> choice ==
{"type":"tool","name":"rank_recommendations"}) and add the raises case
immediately after to prevent silent regressions.

dashboard/app/(dj)/settings/ai/__tests__/page.test.tsx (1)

68-85: ⚡ Quick win

Align policy-filter test with the DJ auth boundary

This test currently mocks getAdminLlmPolicy() success while the page is running in DJ auth mode. Please add a case that models the real non-admin policy-fetch path so provider filtering behavior is validated under actual DJ constraints.

As per coding guidelines server/app/**/*.py: “Use get_current_admin dependency for all endpoints under /api/admin/*.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dashboard/app/`(dj)/settings/ai/__tests__/page.test.tsx around lines 68 - 85,
The test currently only mocks api.getAdminLlmPolicy for an admin path but the
page runs in DJ auth mode; add a second case that models the non-admin
policy-fetch flow by having api.getAdminLlmPolicy fail (e.g., reject with a 403)
and mock the non-admin endpoint api.getLlmPolicy (or the equivalent non-admin
policy getter used by SettingsAIPage) to return the policy {
llm_apikey_connectors_enabled: false, llm_compatible_connector_enabled: true,
llm_default_connector_id: null }, while keeping api.listLlmConnectors mocked;
then render SettingsAIPage, open the Add provider dropdown and assert the
provider select options equal ['openai_compatible'] so provider filtering is
validated under the DJ (non-admin) auth boundary.

server/app/services/llm/openrouter_models.py (1)

71-75: ⚡ Quick win

Return a copy of cached model lists to avoid accidental cache mutation by callers.

The function currently returns the cache list by reference, so any downstream mutation can poison the process-wide cache.

💡 Proposed fix

@@
         fetched_at, models = _cache
         if _now() - fetched_at < CACHE_TTL_SECONDS:
-            return models
+            return list(models)
@@
         logger.warning("Failed to fetch OpenRouter model catalogue")
         if _cache is not None:
-            return _cache[1]
+            return list(_cache[1])
         return []
@@
     elif _cache is not None:
         # Empty parse but we had a prior good list — keep serving it.
-        return _cache[1]
-    return models
+        return list(_cache[1])
+    return list(models)

Also applies to: 83-93

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/app/services/llm/openrouter_models.py` around lines 71 - 75, The
cached model list is returned by reference, allowing callers to mutate
process-wide _cache; update the cache-returning paths in the function that reads
_cache (the block checking "if not force_refresh and _cache is not None" and the
later branch that sets/returns _cache after fetching) to return a shallow copy
of the models (e.g., list(models) or models.copy()) instead of the original list
so callers cannot mutate the cached list stored in _cache; ensure both the early
return and the post-fetch return use the same copying approach.

server/openapi.json (1)

1904-2193: 🏗️ Heavy lift

Encode connector credential requirements in the OpenAPI schema (not just runtime validators).

ConnectorCreate / ConnectorCredentialsRotate in server/openapi.json make all credential fields optional/nullable, so generated TS clients can submit invalid payloads (e.g., an empty rotation object), even though the source Pydantic models enforce these rules at runtime (_require_credentials_for_type and _require_at_least_one). Consider expressing these constraints in the emitted schema (e.g., discriminated oneOf/anyOf variants and/or minProperties) so the OpenAPI contract matches the validation behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/openapi.json` around lines 1904 - 2193, The OpenAPI schema currently
allows all credential fields optional which mismatches runtime validators;
update the ConnectorCreate and ConnectorCredentialsRotate definitions in
server/openapi.json to encode the credential requirements enforced by
_require_credentials_for_type and _require_at_least_one: for ConnectorCreate
replace the loose object with a discriminated oneOf (or anyOf) keyed by
connector_type that lists per-connector required properties (e.g., for
"openai_apikey"/"anthropic_apikey"/"openrouter_apikey"/"xai_apikey"/"gemini_apikey"
require api_key; for "openai_compatible" require base_url; for "bedrock" require
aws_access_key_id/aws_secret_access_key/aws_region/aws_model_id; for
"azure_openai" require
api_key/azure_resource_name/azure_deployment_name/azure_api_version) matching
build_create_payload; for ConnectorCredentialsRotate add minProperties:1 (or an
anyOf that requires at least one credential field) so generated TS clients
cannot send empty rotation objects and so rotate_credentials validation is
reflected in the contract.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dashboard/app/`(dj)/settings/ai/page.tsx:
- Around line 95-100: The Promise.all handler uses fetchPolicySoft() which
returns null on failure and the caller (in this file) treats null as “allow
all,” letting DJs see restricted connectors; change the client-side handling in
the Promise.all().then(([rows, p]) => { ... }) so that if fetchPolicySoft()
returns null you setPolicy to a conservative fallback (deny/disallow connectors
or an explicit empty/forbidden policy) instead of treating null as permissive;
alternatively make fetchPolicySoft() return an explicit restrictive policy on
error. Also ensure the server-side policy endpoint is protected with the
admin-only dependency (use get_current_admin) so non-admin DJ requests cannot
retrieve admin policy data. Apply changes around functions/api calls:
api.listLlmConnectors(), fetchPolicySoft(), and the setPolicy usage referenced
in this file.

In `@dashboard/app/admin/ai/page.tsx`:
- Around line 19-25: TYPE_LABELS is missing mappings for some supported provider
enum values so admin views show raw enum keys; update the TYPE_LABELS object(s)
to include entries for "xai_apikey" => "xAI", "bedrock" => "AWS Bedrock" and
"azure_openai" => "Azure OpenAI" (and any duplicate TYPE_LABELS declarations in
this file: the declarations around the other referenced ranges) so those labels
render human-readable names in the admin UI.

In `@server/app/api/events.py`:
- Around line 943-947: The LLM availability check uses is_llm_available(db,
actor=user) but the response later is built without actor (so llm_available may
be incorrect for connector-specific setups) and the 503 message is stale; change
the availability check and any later llm_available evaluation to use the same
actor context used for the response (make both calls pass the same actor
variable, not hardcoded user), and update the HTTPException detail string to a
generic message that mentions connector/API key configuration rather than only
ANTHROPIC_API_KEY; reference is_llm_available and the
llm_available/HTTPException usage to locate and fix the code paths.

In `@server/app/api/llm.py`:
- Around line 317-324: Before calling delete_connector and committing, check
whether the system-wide default llm_default_connector_id equals the connector
being deleted and, if so, clear that field (set to None/empty) on the
system/settings record and persist the change; perform this update prior to
delete_connector/db.commit() so the global default isn't left stale (this
mirrors the logic used in the admin revoke path).

In `@server/app/models/llm_connector.py`:
- Line 88: The model's Mapped field "status" defines an ORM default but lacks
the DB server default present in the migration; update the mapped_column call
for status to include a matching server_default (e.g.,
server_default=sqlalchemy.text("'<STATUS>'") using the STATUS_ACTIVE value) so
the model exactly matches the migration (type String(20), nullable=False,
default=STATUS_ACTIVE and server_default present), and add the necessary import
for sqlalchemy.text if not already imported.

In `@server/app/models/system_settings.py`:
- Around line 31-33: The two boolean Mapped columns
llm_apikey_connectors_enabled and llm_compatible_connector_enabled currently
only set ORM defaults; update their mapped_column calls to include matching DB
server defaults (server_default) so the model exactly mirrors the migration—add
a server_default representing true (matching the migration's DB-side default)
for both columns in the SystemSettings model (symbols:
llm_apikey_connectors_enabled, llm_compatible_connector_enabled) and ensure the
column type and nullable settings remain identical to the migration.

In `@server/app/schemas/llm.py`:
- Around line 204-210: The AdminPolicyPatch model allows contradictory payloads
where clear_default=True and llm_default_connector_id is non-null; add
validation on AdminPolicyPatch (e.g., a pydantic `@root_validator` or validator)
that checks if clear_default is True and llm_default_connector_id is not None
and raises a ValueError (or ValidationError) to reject such payloads,
referencing the AdminPolicyPatch class and the clear_default and
llm_default_connector_id fields so clients cannot submit both simultaneously.

In `@server/app/services/llm/adapters/anthropic_apikey.py`:
- Around line 118-122: The list-content branch in anthropic_apikey.py flattens
only objects with a .text attribute, so dict-backed blocks like
{"type":"text","text":"..."} become empty; update the logic where content =
m.content and the list comprehension (the expression joining getattr(b, "text",
"") or "") to handle dicts by checking isinstance(b, dict) and using
b.get("text", "") (falling back to getattr(b, "text", "") when not a dict), so
any element with a 'text' key is included when computing text.
- Around line 81-94: The AsyncAnthropic client created by
self._client(timeout=timeout) in AnthropicApiKeyAdapter.chat() is never closed;
update chat() to use an async context (async with AsyncAnthropic(...) as
client:) or ensure await client.close() in a finally block so the per-request
HTTP resources are released; make the same change pathing through health_check()
which calls chat(), and keep existing exception handling (APITimeoutError,
APIConnectionError, APIStatusError, APIError) and return via
parse_anthropic_response(message).

In `@server/app/services/llm/adapters/bedrock.py`:
- Around line 109-117: The SigV4 signing call in the Bedrock adapter is missing
the temporary session token: update the sign_request(...) invocation in
bedrock.py to forward creds["aws_session_token"] (when present) so temporary STS
credentials are included; specifically add an
aws_session_token=creds.get("aws_session_token") argument to the
sign_request(...) call (function name: sign_request, variable: creds) so
connectors using session credentials authenticate correctly.
- Around line 193-197: The current Bedrock translators iterate m.content and
only read attribute .text, which drops dict-backed blocks; update the logic that
sets the local variable text (the branches handling m.content/list and the
similar block at the other translator) to handle dict items as well—when
iterating content, for each element b use getattr(b, "text", None) first and if
that is None and isinstance(b, dict) then use b.get("text") (or b.get("content")
if that is the canonical key your callers use), concatenating those values so
dict-backed blocks are preserved instead of collapsing to empty strings; apply
the same fix to both occurrences (the m.content handling around the text
assignment and the second occurrence mentioned at the other block).

In `@server/app/services/llm/adapters/openai_compatible.py`:
- Around line 24-57: The extracted base_url from _extract_credentials must be
validated with the URL validator before any outbound HTTP call: after calling
self._extract_credentials() in both chat and health_check (or perform the
validation inside _extract_credentials itself), pass the base_url through the
validator from url_validator.py (using the same policy: HTTPS any host; HTTP
only loopback+RFC1918) and if validation fails raise AuthInvalid (or similar) to
stop the request; only call call_openai_chat when the validator approves the
base_url and keep passing bearer as before.

In `@server/app/services/llm/tool_translation.py`:
- Around line 61-70: The current parsing loop over raw_tool_calls may produce
non-dict inputs (or other exception types) that escape as raw exceptions; update
the parsing logic in the loop that handles raw_args (the block building
input_obj from json.loads or raw_args) to explicitly validate that input_obj is
a dict and, if not, raise ToolTranslationError with a clear message; also catch
JSONDecodeError as already done but ensure any other unexpected types (e.g., if
raw_args is None or a primitive) are converted into ToolTranslationError instead
of letting TypeError/ValueError propagate; apply the same validation and error
normalization to the analogous parsing block around the code referenced at lines
150-163 so both ToolCall creation sites (using variables raw_args, input_obj,
ToolCall(...)) consistently raise ToolTranslationError for malformed/non-dict
tool inputs.

In `@server/app/services/recommendation/llm_client.py`:
- Around line 287-305: The current path always calls Gateway.dispatch when db is
present which skips the legacy Anthropic fallback; modify the logic in the block
that builds ChatRequest and calls Gateway.dispatch (symbols: ChatRequest,
Gateway.dispatch, _parse_tool_response, _legacy_call, db, actor, user_message)
so that after awaiting Gateway.dispatch you detect a failed/no-connector or
non-actionable response (e.g., dispatch returns None, an error flag, or a
response without tool output) and in that case call and await
_legacy_call(user_message) and set result to that value; otherwise keep using
_parse_tool_response(response) as now. Ensure the fallback path preserves
async/await semantics and error propagation.

In `@server/openapi.json`:
- Around line 2304-2335: Update the source Pydantic/FastAPI model that generates
ConnectorPatch so that display_name is optional but not nullable (remove
Optional[...]=None or default=None usage and instead use a sentinel/unset
default), e.g., change the field declaration used to produce ConnectorPatch so
its type is just str with a missing-field default rather than allowing None;
then regenerate the OpenAPI (server/openapi.json) so ConnectorPatch.display_name
no longer includes "type": "null". This aligns the schema with update_metadata()
in server/app/services/llm/connector_storage.py (which treats None as "do not
update") and with ConnectorOut.display_name being always a string. Ensure the
model change propagates to whatever model/class produces ConnectorPatch (the
source model class that defines display_name).

In `@server/tests/test_llm_api.py`:
- Around line 436-444: The test test_create_rejects_unknown_type is using a
supported connector_type ("gemini_apikey"), so change the payload sent to
client.post("/api/llm/connectors") to use a truly unknown connector_type (e.g.,
"unknown_connector" or "not_a_connector") so the request exercises the
server-side unknown-type handling and the assertion resp.status_code in (400,
422) is meaningful.

---

Nitpick comments:
In `@dashboard/app/`(dj)/settings/ai/__tests__/page.test.tsx:
- Around line 68-85: The test currently only mocks api.getAdminLlmPolicy for an
admin path but the page runs in DJ auth mode; add a second case that models the
non-admin policy-fetch flow by having api.getAdminLlmPolicy fail (e.g., reject
with a 403) and mock the non-admin endpoint api.getLlmPolicy (or the equivalent
non-admin policy getter used by SettingsAIPage) to return the policy {
llm_apikey_connectors_enabled: false, llm_compatible_connector_enabled: true,
llm_default_connector_id: null }, while keeping api.listLlmConnectors mocked;
then render SettingsAIPage, open the Add provider dropdown and assert the
provider select options equal ['openai_compatible'] so provider filtering is
validated under the DJ (non-admin) auth boundary.

In `@server/app/services/llm/openrouter_models.py`:
- Around line 71-75: The cached model list is returned by reference, allowing
callers to mutate process-wide _cache; update the cache-returning paths in the
function that reads _cache (the block checking "if not force_refresh and _cache
is not None" and the later branch that sets/returns _cache after fetching) to
return a shallow copy of the models (e.g., list(models) or models.copy())
instead of the original list so callers cannot mutate the cached list stored in
_cache; ensure both the early return and the post-fetch return use the same
copying approach.

In `@server/openapi.json`:
- Around line 1904-2193: The OpenAPI schema currently allows all credential
fields optional which mismatches runtime validators; update the ConnectorCreate
and ConnectorCredentialsRotate definitions in server/openapi.json to encode the
credential requirements enforced by _require_credentials_for_type and
_require_at_least_one: for ConnectorCreate replace the loose object with a
discriminated oneOf (or anyOf) keyed by connector_type that lists per-connector
required properties (e.g., for
"openai_apikey"/"anthropic_apikey"/"openrouter_apikey"/"xai_apikey"/"gemini_apikey"
require api_key; for "openai_compatible" require base_url; for "bedrock" require
aws_access_key_id/aws_secret_access_key/aws_region/aws_model_id; for
"azure_openai" require
api_key/azure_resource_name/azure_deployment_name/azure_api_version) matching
build_create_payload; for ConnectorCredentialsRotate add minProperties:1 (or an
anyOf that requires at least one credential field) so generated TS clients
cannot send empty rotation objects and so rotate_credentials validation is
reflected in the contract.

In `@server/tests/test_llm_gateway.py`:
- Around line 194-231: The test test_falls_back_to_system_default is calling
Gateway.dispatch with None as the actor, so it exercises system-default fallback
rather than the admin-actor fallback described; update the call to
Gateway.dispatch to pass admin_user_actor (or split into two tests: one that
calls Gateway.dispatch(db, admin_user_actor, gateway_request, ...) to verify
admin-specific connector behavior and another that calls Gateway.dispatch(db,
None, gateway_request, ...) to verify system-default fallback) so the
admin-actor path is actually exercised.

In `@server/tests/test_llm_tool_translation.py`:
- Around line 66-69: Add a negative test inside the test_force_tool case to
exercise the guarded error branch in to_anthropic_tools: call
to_anthropic_tools([TOOL], "does_not_exist") inside a
pytest.raises(ToolTranslationError) context and assert that it raises; keep the
existing valid forced-tool assertion (to_anthropic_tools([TOOL],
"rank_recommendations") -> choice ==
{"type":"tool","name":"rank_recommendations"}) and add the raises case
immediately after to prevent silent regressions.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f51edd89-6135-47f9-98a6-0da1165ba44f

📥 Commits

Reviewing files that changed from the base of the PR and between e095380 and 4d36057.

📒 Files selected for processing (56)

.env.example
CLAUDE.md
dashboard/app/(dj)/settings/ai/__tests__/page.test.tsx
dashboard/app/(dj)/settings/ai/page.tsx
dashboard/app/admin/ai/__tests__/page.test.tsx
dashboard/app/admin/ai/page.tsx
dashboard/lib/__tests__/api.test.ts
dashboard/lib/api-types.generated.ts
dashboard/lib/api-types.ts
dashboard/lib/api.ts
docs/superpowers/specs/2026-05-24-admin-ai-oauth-design.md
server/alembic/versions/046_admin_ai_oauth.py
server/app/api/__init__.py
server/app/api/admin_llm.py
server/app/api/events.py
server/app/api/llm.py
server/app/models/__init__.py
server/app/models/llm_connector.py
server/app/models/system_settings.py
server/app/schemas/llm.py
server/app/services/llm/__init__.py
server/app/services/llm/adapters/__init__.py
server/app/services/llm/adapters/_httpx_openai.py
server/app/services/llm/adapters/anthropic_apikey.py
server/app/services/llm/adapters/azure_openai.py
server/app/services/llm/adapters/bedrock.py
server/app/services/llm/adapters/gemini_apikey.py
server/app/services/llm/adapters/openai_apikey.py
server/app/services/llm/adapters/openai_compatible.py
server/app/services/llm/adapters/openrouter_apikey.py
server/app/services/llm/adapters/xai_apikey.py
server/app/services/llm/base.py
server/app/services/llm/connector_storage.py
server/app/services/llm/exceptions.py
server/app/services/llm/gateway.py
server/app/services/llm/openrouter_models.py
server/app/services/llm/registry.py
server/app/services/llm/sigv4.py
server/app/services/llm/tool_translation.py
server/app/services/llm/url_validator.py
server/app/services/recommendation/llm_client.py
server/app/services/recommendation/llm_hooks.py
server/app/services/recommendation/service.py
server/app/services/system_settings.py
server/openapi.json
server/scripts/export_openapi.py
server/tests/test_llm_adapters.py
server/tests/test_llm_api.py
server/tests/test_llm_bedrock_adapter.py
server/tests/test_llm_gateway.py
server/tests/test_llm_hooks.py
server/tests/test_llm_openrouter_models.py
server/tests/test_llm_recommendation_via_gateway.py
server/tests/test_llm_sigv4.py
server/tests/test_llm_tool_translation.py
server/tests/test_llm_url_validator.py

…endation model OpenAI Platform GPT-5/o-series models reject the legacy `max_tokens` field (HTTP 400 unsupported_parameter) and require `max_completion_tokens`; Azure OpenAI serves the same models. Thread a `max_tokens_field` through the shared OpenAI caller so the Platform + Azure adapters opt into `max_completion_tokens`, while third-party compatible / xAI / OpenRouter endpoints keep legacy `max_tokens`. Drop the 1-token cap from the shared connector healthcheck: reasoning models spend their completion budget on internal reasoning tokens before producing any visible output, so a 1-token budget yields zero output and an HTTP 400. Propagate the gateway's actual ChatResponse.model through LLMSuggestionResult → LLMRecommendationResult → the /recommendations/llm response, replacing the hardcoded anthropic_model default. The recommendation badge now reflects the connector that actually ran (e.g. gpt-5.4-mini) instead of always "Haiku".

coderabbitai

🧹 Nitpick comments (1)

server/app/services/recommendation/llm_hooks.py (1)
43-46: 💤 Low value

Consider adding type hints for db and actor parameters.

The parameters lack type annotations, unlike the equivalent signature in llm_client.py:call_llm which has db: Session | None = None, actor: User | None = None. Adding them improves IDE support and consistency.
     *,
-    db=None,
-    actor=None,
+    db: "Session | None" = None,
+    actor: "User | None" = None,
(Use string literals or add imports at top of file.)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/app/services/recommendation/llm_hooks.py` around lines 43 - 46, Add
explicit type annotations for the db and actor parameters in the function
signature in llm_hooks.py to match the equivalent in llm_client.py: use db:
Session | None = None and actor: User | None = None (either import Session and
User at the top of llm_hooks.py or use string-literal forward refs like
"Session" and "User") so IDEs and type checkers get the same hints as call_llm;
update the function signature where db and actor appear to include these types.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@server/app/services/recommendation/llm_hooks.py`:
- Around line 43-46: Add explicit type annotations for the db and actor
parameters in the function signature in llm_hooks.py to match the equivalent in
llm_client.py: use db: Session | None = None and actor: User | None = None
(either import Session and User at the top of llm_hooks.py or use string-literal
forward refs like "Session" and "User") so IDEs and type checkers get the same
hints as call_llm; update the function signature where db and actor appear to
include these types.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 858bc04f-dd67-42ff-9cc8-b7d4815e735b

📥 Commits

Reviewing files that changed from the base of the PR and between 4d36057 and 3baf828.

📒 Files selected for processing (10)

dashboard/app/(dj)/events/[code]/components/RecommendationsCard.tsx
server/app/api/events.py
server/app/services/llm/adapters/_httpx_openai.py
server/app/services/llm/adapters/azure_openai.py
server/app/services/llm/adapters/openai_apikey.py
server/app/services/recommendation/llm_client.py
server/app/services/recommendation/llm_hooks.py
server/app/services/recommendation/service.py
server/tests/test_llm_adapters.py
server/tests/test_llm_recommendation_via_gateway.py

…session token, admin labels, policy validation, SSRF revalidation, server defaults) - admin AI page: add xAI/Bedrock/Azure OpenAI to TYPE_LABELS - events.py: refresh 503 LLM message for connector gateway; pass actor=user - llm.py: DJ delete clears system default connector (parity with admin revoke) - models: server_default on connector.status and settings LLM policy bools - schemas: AdminPolicyPatch rejects clear_default + non-null default id - anthropic adapter: close client via async with; handle dict text blocks - bedrock adapter: forward aws_session_token to SigV4; handle dict text blocks - openai_compatible: re-validate base_url at call time (SSRF boundary) - tool_translation: malformed OpenAI tool calls raise ToolTranslationError - tests: cover delete-clears-default, policy validator, unsupported type

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

dashboard/app/admin/ai/page.tsx (1)
19-28: ⚡ Quick win

Type this map from the shared connector union.

Record<string, string> lets future backend connector additions compile while this label table silently falls out of date again. Key it from LlmAdminConnector['connector_type'] so missing labels fail at type-check time.
♻️ Proposed fix
-const TYPE_LABELS: Record<string, string> = {
+const TYPE_LABELS: Record<LlmAdminConnector['connector_type'], string> = {
Based on learnings Pydantic models should mirror backend response schemas with TypeScript types in frontend.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dashboard/app/admin/ai/page.tsx` around lines 19 - 28, The TYPE_LABELS map is
currently typed as Record<string, string>, which allows missing labels to pass;
change its type to use the shared connector union so missing labels cause a type
error—replace Record<string, string> with a type keyed by
LlmAdminConnector['connector_type'] (or an equivalent mapped union) and ensure
TYPE_LABELS includes an entry for every member of
LlmAdminConnector['connector_type'] so the compiler enforces label coverage;
update usages if needed to satisfy the stricter type.
server/app/services/llm/adapters/bedrock.py (1)
109-119: ⚡ Quick win

Add regression coverage for the Bedrock-specific fixes.

These paths were already fragile enough to need follow-up fixes, but there still isn't Bedrock adapter coverage here for STS session credentials or dict-backed content blocks. A small adapter test that asserts aws_session_token reaches SigV4 signing and that mixed block shapes flatten correctly would keep both regressions from slipping back in.

Based on learnings: For each bug fix, pin a regression test that references the commit SHA to prevent the bug from recurring.

Also applies to: 190-258
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/app/services/llm/adapters/bedrock.py` around lines 109 - 119, Add
regression tests for the Bedrock adapter to prevent the STS session-token and
mixed/block-shape regressions from returning: write a test that calls the
Bedrock request-building/send flow (the code path that calls sign_request with
utcnow()) and asserts that creds["aws_session_token"] is forwarded into
sign_request (i.e., X-Amz-Security-Token is included in the SigV4-signed
headers) and a separate test that exercises dict-backed/mixed content blocks
through the adapter's payload flattening to ensure the shape is normalized the
same way as in the fix; make the tests reference the commit SHA that introduced
the fixes and target the Bedrock adapter functions that build payloads and call
sign_request so future changes will fail if the token forwarding or block
flattening regresses.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/app/services/llm/adapters/openai_compatible.py`:
- Around line 34-45: The code currently falls back to
self.connector.base_url_plain when blob lacks "base_url", which exposes the
plaintext admin mirror as an active source of truth; change the logic in the
routine that reads the blob so it requires the encrypted blob to contain
"base_url" (do not use self.connector.base_url_plain as a fallback), and if
missing raise AuthInvalid("Connector is missing a base_url"); keep the existing
validate_compatible_base_url(str(base_url)) call and the InvalidBaseUrlError ->
AuthInvalid conversion logic unchanged so validation still occurs at call time.

In `@server/app/services/llm/tool_translation.py`:
- Around line 71-76: The current parsing in tool_translation.py wrongly coerces
falsy non-object inputs (e.g., [] or False) into {} which lets invalid tool
arguments pass; in the block handling raw_args (inside the try/except that
raises ToolTranslationError) change the logic so only None becomes an empty dict
and all other non-string values are left as-is (e.g., use “if raw_args is None
then {} else raw_args” or explicit branches), then validate that the resulting
input_obj is a dict and raise ToolTranslationError for anything else; update the
code paths around json.loads(raw_args) and the subsequent isinstance check to
reflect this behavior (refer to the try/except, json.loads usage and the
ToolTranslationError raises).

---

Nitpick comments:
In `@dashboard/app/admin/ai/page.tsx`:
- Around line 19-28: The TYPE_LABELS map is currently typed as Record<string,
string>, which allows missing labels to pass; change its type to use the shared
connector union so missing labels cause a type error—replace Record<string,
string> with a type keyed by LlmAdminConnector['connector_type'] (or an
equivalent mapped union) and ensure TYPE_LABELS includes an entry for every
member of LlmAdminConnector['connector_type'] so the compiler enforces label
coverage; update usages if needed to satisfy the stricter type.

In `@server/app/services/llm/adapters/bedrock.py`:
- Around line 109-119: Add regression tests for the Bedrock adapter to prevent
the STS session-token and mixed/block-shape regressions from returning: write a
test that calls the Bedrock request-building/send flow (the code path that calls
sign_request with utcnow()) and asserts that creds["aws_session_token"] is
forwarded into sign_request (i.e., X-Amz-Security-Token is included in the
SigV4-signed headers) and a separate test that exercises dict-backed/mixed
content blocks through the adapter's payload flattening to ensure the shape is
normalized the same way as in the fix; make the tests reference the commit SHA
that introduced the fixes and target the Bedrock adapter functions that build
payloads and call sign_request so future changes will fail if the token
forwarding or block flattening regresses.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 4e5cd8e8-c20a-4c64-aa74-555a03852671

📥 Commits

Reviewing files that changed from the base of the PR and between 3baf828 and 27a9dd7.

📒 Files selected for processing (12)

dashboard/app/admin/ai/page.tsx
server/app/api/events.py
server/app/api/llm.py
server/app/models/llm_connector.py
server/app/models/system_settings.py
server/app/schemas/llm.py
server/app/services/llm/adapters/anthropic_apikey.py
server/app/services/llm/adapters/bedrock.py
server/app/services/llm/adapters/openai_compatible.py
server/app/services/llm/tool_translation.py
server/tests/test_llm_adapters.py
server/tests/test_llm_api.py

* feat(ai): add admin LLM audit-trail browse + CSV export endpoints (#341) Add read-only admin endpoints over the existing llm_audit_event table: - GET /api/admin/llm/audit: paginated, filterable JSON browse (event_type, actor_user_id, target_connector_id, days) with joined actor username and target connector display name. - GET /api/admin/llm/audit.csv: streaming CSV export honoring the same filters, capped at 10k rows. Both join the connector display name only — credential material is never read or returned. Adds frontend API client methods + types (no migration). * feat(ai): add Audit trail card to admin AI page (#341) Add a filterable, paginated audit-trail section to /admin/ai that browses llm_audit_event via the new admin endpoints. Filters: event type, actor user ID, target connector, date range. Columns: timestamp, actor, event type, connector display name, notes. Includes CSV export honoring active filters and prev/next pagination. * fix(ai): address CodeRabbit review on PR #360 (CSV formula injection, text/csv contract, test coverage) - Neutralize spreadsheet-formula injection in audit CSV export by prefixing cells starting with =, +, -, @, tab, or CR with a single quote - Declare text/csv response media type on GET /audit.csv so the OpenAPI contract + regenerated TS types model it as a CSV download (was application/json) - Add unauthenticated 401 guard test + formula-injection regression test for the CSV endpoint; assert CSV export honors the active event-type filter * refactor(ai): dedupe CSV formula-injection guard into core/csv_safe The audit CSV export hand-rolled a private _sanitize_csv_cell that duplicated services/export.py:sanitize_csv_value (same defang, less robust — no None handling). Promote the shared primitive to core/csv_safe.py (alongside the other cross-cutting primitives) and point both the audit export and the event exports at it. stdlib csv.writer still handles serialization; only the formula-injection guard is app-layer. No behavior change. * fix(ai): escape leading LF in CSV sanitizer to close formula-injection bypass sanitize_csv_value guarded = + - @ tab CR but not LF. A leading newline can be stripped by spreadsheet importers, exposing a trailing formula. Add "\n" to the prefix set (now covers both the audit and event CSV exports) + regression test.

Add ChatRequest.fallback_policy ("none" | "org_default" | "retry_then_org_default") so a DJ's connector that fails mid-event with a transient/credential error can optionally fall back to the org-default connector instead of failing the call. - Gateway extracts a single-attempt helper (_attempt) and a bounded fallback chain: at most one same-connector retry (retry_then_org_default) plus one org-default attempt. Never loops. - Fallback-eligible triggers: RateLimited, AuthInvalid, ProviderUnavailable, QuotaExceeded. ToolTranslationError / generic LlmError are NOT eligible. - Writes a fallback_triggered:<trigger> LlmAuditEvent (reusing the existing audit_event helper) targeting the fallback connector id. Trigger encoded in the event_type string to fit the existing String(60) column — no migration. - Default "none" preserves the current fail-fast behavior; callers opt in per-call.

* feat(ai): DJ-readable LLM connector policy endpoint (#355) Add GET /api/llm/policy (get_current_active_user) exposing the non-sensitive connector-policy subset DJs need: the two enablement toggles plus a server-computed allowed_connector_types set. Admin-only fields (llm_default_connector_id) are intentionally excluded. The /settings/ai page now reads this DJ endpoint instead of the admin-only /api/admin/llm/policy (which 403s for non-admin DJs). It fails *closed*: when the policy can't be read it offers no providers, rather than the old behavior of treating a 403 as "all types allowed" and surfacing providers the admin disabled (which the create call would reject with a 403). allowed_connector_types is computed server-side so the picker exactly matches what the create endpoint accepts, eliminating UX/enforcement drift. No schema/migration — reads existing SystemSettings flags. * fix(ai): document 401/403 on DJ policy route, harden DJ-scoped policy test Add explicit 401/403 responses to GET /api/llm/policy so the regenerated OpenAPI contract and TS client cover the unauthenticated and pending-DJ failure modes the route already enforces via get_current_active_user. Regenerate openapi.json + api-types.generated.ts. Harden the "reads the DJ-scoped policy endpoint" test to assert the admin policy endpoint is never called, fully enforcing the regression guard.

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dashboard/app/`(dj)/settings/ai/page.tsx:
- Around line 128-134: The current useMemo treats policy === null as “disabled”
even while the policy is still loading; change the guard to only return an empty
list when policy === null AND loading is false (i.e. if (policy === null &&
!loading) return []), otherwise allow the memo to defer to the actual policy
value (or a loading/fallback) so the UI doesn't show “disabled by admin” during
initial load; apply the same change to the other occurrence referencing policy
(lines ~286-290) so both spots gate on the loading flag.

In `@dashboard/app/admin/ai/__tests__/page.test.tsx`:
- Around line 258-283: The test currently stubs the global URL with
vi.stubGlobal(...) and calls vi.unstubAllGlobals() at the end, which can leak if
an assertion fails; update the test that renders AdminAISettingsPage so that the
vi.stubGlobal('URL', { ...URL, createObjectURL, revokeObjectURL }) is paired
with a try/finally block that calls vi.unstubAllGlobals() in the finally clause
to guarantee restoration even on test failure, keeping the rest of the test
logic (render, waitFor assertions, fireEvent calls, auditSpy/exportSpy checks)
unchanged.

In `@dashboard/app/admin/ai/page.tsx`:
- Around line 33-41: The AUDIT_EVENT_TYPES list is missing gateway-emitted
fallback_triggered events; update the array (variable AUDIT_EVENT_TYPES) to
include a fallback entry such as { value: 'fallback_triggered', label: 'Fallback
triggered (all)' } (and add any specific fallback flavors if you expect typed
variants, e.g. 'fallback_triggered:connector') so the admin dropdown can filter
those audit entries; make the same addition to the duplicate list used later in
the file (the other AUDIT_EVENT_TYPES occurrence around the bottom) so both
dropdowns include fallback-triggered options.

In `@server/app/core/csv_safe.py`:
- Around line 9-24: sanitize_csv_value currently only checks the very first
character and fails to handle values with leading normal spaces (e.g. "
=HYPERLINK(...)"), and also misses a final return for non-formula values; update
sanitize_csv_value (and keep _FORMULA_PREFIXES) to locate the first non-space
character in value (e.g. find the index of the first char not equal to ' '),
check that character against _FORMULA_PREFIXES, and if it matches return "'" +
value, otherwise return the original value (still return "" for falsy input).

In `@server/tests/test_export.py`:
- Around line 323-327: Add a regression case to the CSV sanitization tests to
cover formulas that start with a leading space before a dangerous payload;
update the test_sanitizes_line_feed (or add a new test) to call
sanitize_csv_value with the string " =cmd|' /C calc'!A0" and assert the returned
value begins with a single-quote (e.g., result.startswith("'")), ensuring
sanitize_csv_value handles the space-prefix bypass path the same as a leading
newline.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 3198dcc6-8dd9-47d1-8f2c-551adef1adc0

📥 Commits

Reviewing files that changed from the base of the PR and between 27a9dd7 and fd7e650.

📒 Files selected for processing (20)

dashboard/app/(dj)/settings/ai/__tests__/page.test.tsx
dashboard/app/(dj)/settings/ai/page.tsx
dashboard/app/admin/ai/__tests__/page.test.tsx
dashboard/app/admin/ai/page.tsx
dashboard/lib/api-types.generated.ts
dashboard/lib/api-types.ts
dashboard/lib/api.ts
docs/superpowers/plans/2026-05-25-llm-audit-trail-admin-ui.md
server/app/api/admin_llm.py
server/app/api/llm.py
server/app/core/csv_safe.py
server/app/schemas/llm.py
server/app/services/export.py
server/app/services/llm/base.py
server/app/services/llm/gateway.py
server/openapi.json
server/tests/test_export.py
server/tests/test_llm_admin_audit.py
server/tests/test_llm_api.py
server/tests/test_llm_gateway.py

✅ Files skipped from review due to trivial changes (2)

docs/superpowers/plans/2026-05-25-llm-audit-trail-admin-ui.md
dashboard/lib/api-types.generated.ts

…review) - openai_compatible adapter: drop the base_url_plain fallback. The encrypted blob is the sole routing source of truth (storage always writes base_url into it on create/rotate); base_url_plain is a display-only admin mirror that could be stale. Fail closed if the blob lacks base_url. - tool_translation: stop coercing falsy non-objects ([], false, 0) into {} — only None (and empty-string, sent by some compatible servers for no-arg calls) falls back to {}; everything else flows to the existing dict check and raises instead of silently running a tool with empty input. - Regression tests for both.

…back filter, UX/test hygiene) - csv_safe: add leading space to formula-prefix set (" =FORMULA" bypasses after importer trims), + regression test. - admin/ai audit filter: add the four fallback_triggered:<trigger> event types the gateway emits (#338) so admins can filter auto-fallback incidents — the audit query is an exact event_type match, so each variant needs its own option. - settings/ai: gate the "disabled by admin policy" message on !loading to avoid a false disabled flash during initial policy load. - admin/ai test: wrap the URL global stub in try/finally so it can't leak into later tests if an assertion fails early.

Relocate the DJ-facing AI connector UI (connect/test/delete, model hint, Hermes onboarding) from the standalone /settings/ai route into the existing /account page as an inline "AI / Model providers" card section. - Extract the connector UI into a reusable AiProvidersSection component; the parent page owns auth gating so the component does none of its own. - Add a permanent (308) redirect /settings/ai -> /account in next.config.js so existing bookmarks/links don't 404. - Delete the old /settings/ai route + test; port the connector tests to the component and assert the section renders inside /account. - Fail-closed policy behavior is preserved verbatim: when the DJ-scoped policy endpoint can't be read, no provider types are offered. - Admin /admin/ai UI is untouched.

…362) * chore(ai): remove deprecated ANTHROPIC_API_KEY env-var reads (#343) The LLM Gateway connector system has been the source of truth for credentials since the MVP, and every production caller of the recommendation engine passes db + actor (so the gateway path always runs). The legacy direct-Anthropic env-var fallback was therefore dead code. Removed: - llm_client._legacy_call() + the `db is None` branch + the AsyncAnthropic import; call_llm now requires a db session. - _resolve_max_tokens()'s settings read (inlined DEFAULT_MAX_TOKENS = 1024). - is_llm_available()'s final ANTHROPIC_API_KEY env-var fallback; without db it now returns False (connector-backed only). - config.anthropic_max_tokens / anthropic_timeout_seconds (only the deleted fallback used them). - Deprecated ANTHROPIC_* lines from .env.example; refreshed CLAUDE.md. Kept (documented in PR): config.anthropic_api_key + anthropic_model — still read by the admin AI-settings/model-listing endpoints (live observability + API contract) and the recommendation response model-name default. The historical data migration 046_admin_ai_oauth (which reads ANTHROPIC_API_KEY once to seed a connector) is left intact for replay. Tests updated to drive the gateway path; full backend suite green (2302 passed, 87% coverage). * fix(ai): harden LLM tool-payload parsing against malformed provider output (#362 review) Defensively validate the forced-tool `queries` payload in _parse_tool_response before constructing LLMSuggestionQuery objects. Custom OpenAI-compatible endpoints (Ollama, vLLM) may not enforce the tool JSON schema, so one malformed item should not crash the whole recommendation flow. - Extract shared _parse_query_items() used by both the canonical ChatResponse path and the legacy Anthropic Message-like path. - Skip non-dict payloads/items, missing or blank/non-string search_query, and coerce non-string reasoning to "". - Add regression tests for malformed payloads (CodeRabbit out-of-diff finding).

* feat(ai): configurable llm_call_log retention (#342) Make the previously-hardcoded 30-day llm_call_log retention configurable by admins (7-365 days) and wire up the daily cleanup job that the model docstring referenced but which did not yet exist. Backend: - Add SystemSettings.llm_call_log_retention_days (int, default 30, NOT NULL) with matching alembic migration 047 (server_default 30). - Surface it on the LLM policy endpoint (/api/admin/llm/policy). GET returns the value; PATCH accepts it with Pydantic bounds (ge=7, le=365) so out-of-range values are rejected at the API level (422). - Add purge_call_log_older_than() helper to connector_storage. - Add a daily (24h) cleanup loop in the FastAPI lifespan that reads the retention window from settings on each run, so an admin change takes effect within 24h without a restart (no hardcoded constant). Frontend: - Add a "Call log retention (days)" number input to the Connector policy card on /admin/ai; clamps and patches on blur. Regenerated OpenAPI TS types. Tests: - API bounds (min/max/boundaries) + persistence; purge helper window behavior; cleanup-reads-settings (honors admin change, keeps in-window rows); frontend retention input persist + clamp. * fix(ai): address PR #363 review (retention bounds guard, blur fallback) - Guard purge_call_log_older_than against out-of-bounds retention_days (fail closed on values outside 7-365 so a corrupt/tampered persisted setting cannot push the cutoff to now/future and wipe call-log history; the daily cleanup loop catches the ValueError and retries next pass). - Admin /admin/ai retention input: on invalid blur (NaN), fall back to the current policy value instead of a hardcoded 30, so clearing the field no longer silently overwrites a previously saved window (e.g. 90 -> 30). - Add regression tests: backend bounds-guard rejection (no deletion) and a frontend below-min clamp test locking the 7-365 contract symmetrically.

OSV advisory MAL-2026-4750 flags fastapi 0.136.3 for an undocumented 'fastar' dependency. It is a false positive (withdrawn by OSV 2026-05-26): 'fastar' is a legitimate Rust-tar-bindings package predating the release, and it is only pulled via fastapi's [standard] extra, which WrzDJ does not install (we use plain fastapi + uvicorn[standard]) - so it never enters our dependency tree. We stay on 0.136.3 to keep its underscore-header rejection (PR #15589) and SSE field validation (PR #15588). Remove this ignore once pip-audit's feed drops the withdrawn entry.

test_tampered_signature_returns_false flipped the last base64url char of the signature. base64url's final char encodes only the high bits of the last decoded byte; the low bits are discarded on decode and _b64decode() re-pads, so the mutated char frequently round-trips to the SAME sig_bytes. When it did, hmac.compare_digest passed, the cookie validated, and the `verified is False` assertion failed (~2/5 runs). Tamper at the decoded-bytes level instead: decode the signature with the module's own _b64decode, flip a whole byte (sig[0] ^= 0xFF), re-encode with _b64encode. This guarantees sig_bytes differs from the expected signature on every run. Production human_verification.py logic is unchanged. Closes #364

…#366) * fix(sse): don't pin a pooled DB connection for the SSE stream lifetime (#356) event_stream declared `db: Session = Depends(get_db)`, so FastAPI kept that session (and its checked-out QueuePool connection) open until the request finished. An EventSource request never finishes while the browser holds it open, so the connection sat idle but pinned for the whole stream. ~15 concurrent guest viewers exhausted the pool (size 5 + overflow 10), after which every request 500'd with QueuePool TimeoutError — a low-effort DoS on an unauthenticated guest-facing endpoint. Fix: run the one-shot existence/auth check inside a short-lived `with SessionLocal() as db:` block that returns its connection to the pool BEFORE the EventSourceResponse is returned. The async generator holds no DB session (and documents that any future per-tick DB access must open its own short-lived session). Existence/auth behavior (404 unknown, 410 archived/expired) is preserved exactly. Adds tests/test_sse_pool.py driving a real QueuePool engine and asserting 25 concurrent idle streams hold 0 checked-out connections. * test(sse): fix _make_request helper to model live idle stream (CodeRabbit PR #366) The helper's nested receive() previously returned {"type": "http.disconnect"} immediately, which made StarletteRequest.is_disconnected() true on the first generator iteration. The SSE generator exited before it could ever await queue.get(), so test_n_concurrent_idle_streams_hold_zero_pool_connections was asserting that instantly-disconnected streams hold zero pool connections — trivially true, useless as a regression for issue #356. Fix receive() to suspend forever on a never-set asyncio.Event, matching a real live, idle SSE client. The handler now suspends on queue.get() as intended, and checked_out == 0 is a meaningful assertion across 25 concurrent idle streams. Also: - Add sanity assertion that primer tasks remain not-done (per CodeRabbit's suggested follow-up) so the test fails loudly if receive() ever regresses back to instant-disconnect semantics. - Await primer-task cancellation before aclose() to avoid the "asynchronous generator is already running" RuntimeError that occurs when cancelling a generator suspended on queue.get(). Verification: - 10/10 loop runs pass with the new helper. - Reverted server/app/api/sse.py to the pre-fix shape (generator holds a pooled connection); test failed with "Expected 0 pooled connections held by 25 idle SSE streams, got 15" — confirming the helper now actually exercises the pinned-connection path and catches the original regression.

…oader (#370) * feat(ai): LLM adapter plug-in SDK with docs, skeleton, and optional loader Defines the public extension surface for third-party LLM provider adapters so forks and community contributors can add new providers without modifying any file under server/app/services/llm/. Deliverables: - docs/LLM-PLUGIN.md — full contract: LlmAdapter ABC, ChatRequest/Response fields, exception hierarchy (when to raise AuthInvalid / RateLimited / QuotaExceeded / ProviderUnavailable / ToolTranslationError), stable-vs- internal API table, tool-translation delegation, registration mechanics, test matrix link, 5-minute quick-start. - docs/examples/echo_adapter.py — minimal LlmAdapter skeleton that registers via the public registry and round-trips through chat() with no network call. Exercised by the contract test so it can't silently rot. - server/app/services/llm/plugin_loader.py — optional LLM_PLUGIN_DIR env-var loader. Imports every *.py in the dir (non-recursive, skips _-prefixed files), fails open on broken plug-ins with a logged stack trace, never mutates sys.path. Wired into registry._bootstrap so built-ins always load first (a plug-in cannot shadow a built-in — registry refuses double-registration). - server/tests/test_llm_adapter_contract.py — parametrised contract test over every registered adapter (8 built-ins + the docs skeleton). Covers subclassing, connector_type non-empty + matches registration key, async chat/health_check, constructor accepts a connector row, and the malformed-credentials boundary that prevents JSONDecodeError leakage. - server/tests/test_llm_plugin_loader.py — 6 tests pinning loader behaviour (no env var, missing dir, file-selection rules, broken-plug-in resilience, sys.modules cleanup, sys.path isolation). Security posture documented in docs/LLM-PLUGIN.md: LLM_PLUGIN_DIR loading grants the plug-in the full backend process — same trust boundary as `pip install`. Production deployments are encouraged to keep the env-var unset and ship trusted plug-ins as ordinary Python modules. * chore(ai): address CodeRabbit nitpicks on #370 - Remove tautological isinstance(exc_info.value, AuthInvalid | LlmError) assertion in test_adapter_raises_typed_error_on_malformed_credentials. pytest.raises(LlmError) already guarantees the exception type; the union reduces to LlmError since AuthInvalid is a subclass. The docstring already documents the intent (accept any LlmError subclass). - Add 'text' language specifier to the ASCII architecture diagram fence in docs/LLM-PLUGIN.md for Markdown-linter compliance.

* feat(ai): per-DJ explicit default LLM connector (#336) Adds a per-DJ "pinned default" so DJs can force routing through a specific connector instead of relying on the gateway's MRU heuristic. - New column LlmConnector.is_default (bool, default false, NOT NULL) - Partial unique index per user_id where is_default = true (Postgres / SQLite) - Service helpers set_default_for_user / unset_default_for_user with atomic sibling-clear semantics - Gateway resolution: prefer pinned default; fall through to MRU when no default is set or the pinned default is inactive - API: POST /api/llm/connectors/{id}/default sets, DELETE clears; audit events connector_default_set / connector_default_unset - Setting an inactive connector as default returns 400 (footgun guard) - Alembic migration 048 backfills each DJ's MRU active connector as default - Frontend: radio + "Default" badge on each card; optimistic update * docs(ai): document 400/404 responses on connector default endpoints Address CodeRabbit threads on PR #371: the POST /api/llm/connectors/{id}/default endpoint can return 400 (inactive connector) and 404 (non-owned connector), and the DELETE counterpart can return 404. Only 200/422 were declared in the FastAPI route metadata, which left the generated OpenAPI schema and dashboard api-types.generated.ts out of sync with runtime behavior and forced frontend error-handling to fall back to generic catches. Add `responses=` entries to both routes, regenerate server/openapi.json, and re-run openapi-typescript so dashboard/lib/api-types.generated.ts reflects the new error contracts.

…in (#340, #346) (#372) Adds a background task that periodically runs `health_check` on every active LLM connector to catch expired/revoked credentials before a DJ hits one mid-event, and surfaces the last result in the admin per-DJ connectors table. - New columns: `llm_connectors.last_health_check_at` (timestamp) + `last_health_check_status` (one of: ok / auth_invalid / rate_limited / quota_exceeded / provider_unavailable / error). - Shared `services/llm/health_check.py` helper used by both the DJ-triggered test button (`POST /api/llm/connectors/{id}/test`) AND the background monitor — guarantees identical observability semantics regardless of trigger source. - Background loop wakes every 5 minutes, picks active connectors whose last check is older than `LLM_HEALTH_CHECK_INTERVAL_HOURS` (default 6, clamped to 1..168h) times a per-connector deterministic jitter factor (0.7..1.3). - Sequential per-pass execution with a 1s sleep between calls — respects per-provider rate limits even when many DJs share one upstream account. - On an active->auth_invalid transition: flips status, writes a `connector_health_check_failed` audit row (distinct from per-check noise), and attempts an email notification to the DJ via Resend; falls back to logging when the DJ has no email or Resend isn't configured. - Admin per-DJ connectors table gains two sortable columns (Last health check / Result) with a colour-coded status badge; default sort is `last_health_check_at DESC` for fastest triage. Both issues add the same two columns to `llm_connectors`. Splitting them across two PRs would mean two clashing Alembic revisions (047->048 from each PR with conflicting downgrade ops). One PR ships the schema + monitor + admin UI atomically so there's no half-state on the epic branch. Closes #340 Closes #346

The SSE endpoint (`app.api.sse`) opens its own `with SessionLocal()` block for the existence check, bypassing `get_db`. Tests that exercise the endpoint hit the real `DATABASE_URL` (Postgres in CI) instead of the in-memory SQLite test DB, failing with "relation does not exist". Scoped rebind in conftest: swap only `app.api.sse.SessionLocal`, not the production sessionmaker, so other modules importing the production `SessionLocal` aren't disturbed. Caught by 4 `test_sse_security.py` failures in PR #354 backend CI.

Consolidate byte-identical logic copy-pasted across the LLM adapters into single sources of truth: - new adapters/_shared.py: parse_retry_after, raise_for_status (with optional throttle_detector for Bedrock's 400-throttle path), extract_api_key, extract_fixed_base_credentials - tool_translation.py: content_to_text + to_anthropic_messages (shared by the Anthropic SDK adapter and Bedrock's anthropic family), _validate_force, and one table-driven _normalise_finish_reason replacing four near-identical per-provider normalizers - removed dead _normalise_role passthrough in _httpx_openai Behavior-preserving: error messages, exception types, retry-after values, and JSON shapes unchanged. Anthropic's SDK-based _raise_for_status and Gemini's str()-wrapping message translator left intact (distinct semantics). Net -106 LOC across the llm tree; 2429 backend tests pass, coverage 87.78%.

Collapse repeated boilerplate across the connector API + storage layer into shared helpers — single source of truth, behavior-preserving: - api/llm.py: _get_owned_connector_or_404 (6 call sites), _audit_and_return (the audit -> commit -> refresh -> ConnectorOut epilogue, 4 endpoints), _raise_if_duplicate_label (the uq_dj_connector_label -> 409 mapping, 2 sites) - admin_llm.py: _connector_to_admin_out for the AdminConnectorOut column reflection (list + revoke) - connector_storage.py: _APIKEY_CONNECTOR_TYPES constant, _normalise_model_hint and _validate_api_key_blob shared by build_create_payload + rotate_credentials + update_metadata Every ValueError/HTTPException message preserved verbatim (the create/rotate/ update paths phrase model_hint + api_key errors differently, so those strings stay caller-supplied). Skipped the ConnectorCreate/Rotate credential-field mixin: Pydantic v2 reorders inherited fields, which would churn the generated OpenAPI schema + api-types.generated.ts. Net -5 LOC; api/llm.py -16. 2429 backend tests pass, coverage 87.31%.

) Behavior-preserving DRY cleanup of the two AI dashboard surfaces: - AiProvidersSection.tsx: replace two nested-ternary placeholder chains with API_KEY_PLACEHOLDERS / MODEL_HINT_PLACEHOLDERS lookup tables; collapse 15 per-field onChange closures into a single handleField(key) factory (connector_type stays inline — it needs the LlmConnectorType cast). - admin/ai/page.tsx: add formatTimestamp() for the 3 toLocaleString cells; reuse the existing PlainHeader component in the Usage + Audit tables instead of repeating the inline <th> style; extract onAuditFilterChange() for the 4 filter handlers that all reset pagination to page 0. Rendered DOM + behavior unchanged. Skipped the PROVIDER_FIELDS form-block data-driving: the bedrock/azure/apikey/compatible blocks genuinely diverge (help text, Hermes <details>, select-vs-input model hint), so a descriptor system would cost as much as it saves and risk shifting the DOM the component tests query. 966 frontend tests pass; lint + tsc clean; coverage above thresholds.

Behavior-preserving test contraction — same case counts, no assertions dropped: - test_llm_url_validator.py: 26 single-assert methods → 2 @parametrize tables (accept/normalise + reject) + keep the case-normalization test standalone (its startswith/endswith assertions don't pin netloc casing). - test_llm_bedrock_adapter.py: TestBedrockErrorMapping's 8 methods → one parametrized status→exception test (7 cases) + keep the 429 test separate (it additionally asserts retry_after_seconds). _err lifted to module scope so parametrize args resolve at collection time. - test_llm_gateway.py: _patch_chat() helper replaces the verbose patch.object(__import__("...openai_apikey", ...).OpenAIApiKeyAdapter, "chat") block (×15) via a plain top-level import; _make_org_default() dedupes the org-owner + connector + wire-default setup (×6). Deliberately NOT collapsed (verifier-flagged coverage risk): the cross-adapter error matrix in test_llm_adapters.py, the api happy-path/policy tests (two distinct setup mechanisms), and the per-provider tool-translation tests — those are assertion-divergent and parametrizing would drop distinct cases. Net -226 LOC. 2429 backend tests pass (unchanged count), coverage 87.94%.

* feat(llm): add ChatResponseChunk + streaming ABC default Adds ToolCallDelta / ChatResponseChunk canonical models alongside ChatResponse, a StreamingUnsupported typed error, and a non-abstract LlmAdapter.stream() default that raises it so non-streaming adapters degrade gracefully. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(llm): OpenAI-wire streaming generator + event parser Adds streaming.parse_openai_stream_event (text + partial tool-call JSON fragments + terminal usage) and _httpx_openai.stream_openai_chat, an async generator that streams Chat Completions SSE and maps non-2xx statuses to canonical typed errors. Consumer disconnect closes the httpx stream context, cancelling the upstream request. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(llm): streaming for OpenAI platform + compatible adapters Both adapters delegate to stream_openai_chat, preserving the max_completion_tokens field override on the Platform adapter and the SSRF-validated base URL on the compatible adapter. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(llm): Anthropic provider-native streaming (text + tool_use deltas) Consumes the anthropic SDK messages.stream() typed event stream: text_delta blocks become text chunks, tool_use content_block_start + input_json_delta blocks become partial tool-call fragments grouped by content-block index. The terminal chunk carries the canonical stop_reason and completion token count. Shared _build_kwargs keeps the chat and stream request translation in lockstep. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(llm): Gateway.stream with counts-only logging + cancellation Adds Gateway.stream mirroring dispatch connector resolution. A single counts-only llm_call_log row (plus auth audit on 401) is written in _attempt_stream's finally, so it fires on success, typed error, AND consumer cancellation. The outer generator explicitly aclose()s the inner one so client disconnect synchronously closes the upstream connection. Auto-fallback is intentionally disabled for streaming (chunks already delivered — a restart would corrupt output). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * chore(llm): backend lint/format fix-ups for streaming endpoint Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(llm): cover unknown OpenAI finish_reason → error mapping Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(ai-ui): SSE stream consumer for connector stream-test apiClient.streamConnectorTest uses fetch + ReadableStream (EventSource cannot send the Bearer header), parses text/event-stream data frames, and invokes onChunk per frame. Accepts an AbortSignal so callers can cancel — aborting closes the connection, which the backend treats as a client disconnect and cancels the upstream provider request. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(ai-ui): live Stream test button on DJ AI providers section Wires the streamConnectorTest consumer into the DJ-owned connector rows (AiProvidersSection) — the correct surface, since the stream-test endpoint is DJ-scoped and 404s for connectors the caller doesn't own. Streamed text_delta accumulates into an italic live-text line under the connector. The admin /admin/ai page is intentionally NOT wired: it lists other DJs' connectors the admin doesn't own. Recommendation UI migration to SSE is deferred (background pipeline returns a final JSON payload, not a token feed) — the reusable SSE plumbing lands here per issue #335's 'reasonable judgment on scope'. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(llm-stream): address CodeRabbit review on PR #379 - api.ts: parse SSE `event:` field so `event: error` frames surface as a thrown ApiError instead of passing through as an inert chunk. - _httpx_openai: raise ToolTranslationError on a malformed `data:` SSE JSON frame instead of silently dropping it (keepalive/comment frames start with ":" and never reach this branch). - anthropic stream: canonicalise stop_reason via the shared normalise_anthropic_stop_reason() helper so stream and chat agree on pause_turn / refusal / unknown values (-> "error", not "end_turn"). - streaming: guard int() coercion of provider-supplied index / token counts via _as_int() so null/non-numeric fields don't abort an otherwise-usable stream. - Regression tests for all four (referencing #379). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(llm): add LlmFeaturePreference model + migration 050 New table maps (user_id, feature) -> connector_id with a UNIQUE constraint so each DJ has at most one pinned connector per agentic feature. Both FKs cascade on delete. Feature allowlist (recommendation, set_builder) lives in the model module. Refs #337 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(llm): feature-preference CRUD helpers in connector_storage set_feature_preference upserts (replace-in-place), clear returns whether a row was removed, plus list/get helpers — all caller-commits. Refs #337 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(llm): gateway resolves per-feature pin first, falls back gracefully _resolve_connector now takes the dispatch purpose and checks the DJ's LlmFeaturePreference for that feature before the per-DJ default / MRU / org-default chain. A pinned connector that was deleted or is no longer active is skipped so a stale/broken pin never silently breaks the DJ. Refs #337 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(llm): feature-preference API schemas FeatureKey Literal (OpenAPI enum), FeaturePreferenceOut / ...ListOut / ...Set request+response models. Refs #337 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(llm): set/clear/list feature-preference endpoints GET/POST /api/llm/feature-preferences and DELETE /api/llm/feature-preferences/{feature}, scoped to the current DJ and rate-limited like the existing connector endpoints. POST validates connector ownership (404, not leaking another DJ's connectors) and rejects pinning a non-active connector (400). The {feature} path param is the FeatureKey Literal, so unknown features 422 at the boundary. Adds a guard test that FeatureKey stays in sync with KNOWN_FEATURES. Refs #337 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(ai-ui): api client methods + types for feature preferences Regenerated OpenAPI types (FeaturePreferenceOut/...ListOut/...Set), added LlmFeaturePreference* re-exports, and listLlmFeaturePreferences / setLlmFeaturePreference / clearLlmFeaturePreference on the ApiClient. Refs #337 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(ai-ui): per-feature defaults section on DJ AI settings Adds a 'Per-feature defaults' section to AiProvidersSection: one connector picker per known feature (Recommendations, Set builder). Selecting a connector pins it; 'Use account default' clears the pin. Only active connectors are offered. The section fetches soft and hides itself if the preferences endpoint is unavailable. Refs #337 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(llm): make set_feature_preference a conflict-safe upsert Two concurrent pins for the same (user_id, feature) could both miss the existing-row check and race to INSERT; the loser tripped the uq_llm_feature_pref_user_feature UNIQUE constraint, bubbling a 500. Replace the read-then-insert with a DB-native ON CONFLICT DO UPDATE upsert (dialect-aware: postgresql in prod, sqlite in tests) so a concurrent re-pin resolves deterministically (last writer wins) in a single atomic statement. Adds a regression test pinning the behavior. Addresses CodeRabbit review on PR #378. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(llm): per-DJ monthly token cap + gateway pre-flight enforcement Add monthly_token_cap column (NULL = unlimited) on LlmConnector with migration 050, a current-month token-usage aggregation helper, and a gateway dispatch pre-flight check that raises the new QuotaCapReached exception before any provider call. Cap edits never disrupt in-flight calls. Direct aggregation over the indexed llm_call_log.created_at column — no rollup table needed at current volume. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(llm): admin cap endpoint + expose usage + 429 DJ-facing error Add PATCH /api/admin/llm/connectors/{id}/cap (admin-only) to set/clear a connector's monthly token cap, with Pydantic ge=0 validation and a defensive service-layer guard. Expose monthly_token_cap + current_month_tokens on the admin connector listing for the usage-vs-cap bar. Map QuotaCapReached to a 429 with the fixed DJ-facing message in the /recommendations/llm endpoint (no internal leakage). Cap changes reuse the policy_changed audit event. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(ai-ui): admin per-DJ monthly cap input + usage-vs-cap progress bar Add a Monthly cap column to the admin AI per-DJ connectors table: an editable cap input (empty = unlimited), current-month usage text, and a colour-escalating progress bar (green < 80% < amber < 100% red). Wire setAdminLlmConnectorCap into the api client + regenerate OpenAPI types. Update test fixtures for the new monthly_token_cap / current_month_tokens fields. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(llm): batch month-usage query + require cap field in patch (#377) Address CodeRabbit review on PR #377: - Batch current-month usage lookup in the admin connector list endpoint: add current_month_token_usage_bulk() — one grouped aggregate over llm_call_log instead of an N+1 per-connector query. - Make monthly_token_cap required in AdminConnectorCapPatch (Field(...)): an empty {} body is now rejected with 422 rather than silently treated as null, which would wipe a configured cap. Frontend already sends the field explicitly (null to clear), so no client change needed. - Regenerate openapi.json + dashboard api-types.generated.ts. - Add regression tests: bulk-usage aggregation/parity/empty-input, and empty-body PATCH rejection leaving the cap untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(merge): regenerate openapi + api-types after epic merge git auto-merge of the generated openapi.json / api-types.generated.ts dropped endpoints from one side (streaming stream-test + per-feature preference). Regenerated from the FastAPI app to restore the full spec. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(merge): add monthly_token_cap to per-feature test fixture #339 added monthly_token_cap (required) to ConnectorOut; the per-feature test fixture from #337 predated it. current_month_tokens stays admin-only (AdminConnectorOut), so it's not added here. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Resolve 4 conflicts + linearize the alembic migration fork: - server/app/api/sse.py: keep epic's pool-safety `with SessionLocal()` block (#356/#366 — don't pin a pooled connection for the stream lifetime) while adopting main's dual-code resolver `get_event_by_public_code_with_status` (#382/#383 — accepts collection OR join code, fixes SSE channel-key bug). - account/page.tsx: both sides appended a card; keep BOTH as separate cards — AiProvidersSection (#357) and the Guest Experience frictionless toggle (#380). - account/page.test.tsx: hang the AI-section stubs on the shared hoisted `mockApi` object instead of spreading, so `vi.spyOn(mockApi, ...)` keeps rebinding the same reference the page calls. - api.test.ts: keep both appended suites — `streamConnectorTest` (#379) and `frictionless join api` (#369). - a11334c031bb migration: re-parent down_revision 045 -> 051. main branched the frictionless migration off 045 while epic built 046..051 on 045, creating two alembic heads. Columns don't overlap the AI tables, so ordering after 051 is safe. `alembic upgrade head` + `alembic check` pass on Postgres.

thewrz · 2026-06-03T05:10:29Z

@coderabbitai review

Heads-up on what changed since your last review: main was merged into epic/ai-engine (merge commit). Please focus on the 4 conflict resolutions and the migration re-parent:

server/app/api/sse.py — kept the pool-safety with SessionLocal() block (SSE event stream pins a pooled DB connection for the whole stream lifetime → pool exhaustion under guest load #356/fix(sse): release pooled DB connection for SSE stream lifetime (#356) #366) and switched to the dual-code resolver get_event_by_public_code_with_status (Dual-code routing breaks frictionless /join (collection-code vs join_code split) #382/fix(routing): dual-code routing for frictionless /join + close get_event leak (#382) #383).
dashboard/app/(dj)/account/page.tsx — kept both the AiProvidersSection card ([AI Engine] Move DJ AI connector/model settings into the account page #357) and the Guest Experience frictionless toggle (feat(join): frictionless live-event guest flow (#369) #380) as separate cards.
test mocks in account/page.test.tsx and lib/__tests__/api.test.ts.
server/alembic/versions/a11334c031bb_... — re-parented down_revision 045 → 051 to collapse a two-head migration fork.

coderabbitai · 2026-06-03T05:12:11Z

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find and read the relevant files
echo "=== server/app/api/sse.py ==="
cat server/app/api/sse.py

echo ""
echo "=== dashboard/app/(dj)/account/page.tsx ==="
cat "dashboard/app/(dj)/account/page.tsx"

echo ""
echo "=== Alembic migration a11334c ==="
fd -e py . server/alembic/versions/ | xargs grep -l "a11334c\|down_revision.*051\|down_revision.*045" 2>/dev/null | head -5

echo ""
echo "=== All recent alembic migrations ==="
ls -la server/alembic/versions/ | tail -20

Length of output: 16424

🏁 Script executed:

#!/bin/bash
echo "=== a11334c migration (re-parented) ==="
cat server/alembic/versions/a11334c031bb_add_frictionless_join_flags.py

echo ""
echo "=== 5bb46508476b migration ==="
cat "server/alembic/versions/5bb46508476b_add_now_playing_and_play_history_tables.py"

echo ""
echo "=== account page test ==="
cat "dashboard/app/(dj)/account/__tests__/page.test.tsx"

echo ""
echo "=== lib/__tests__/api.test.ts (last 150 lines to see LLM additions) ==="
tail -150 "dashboard/lib/__tests__/api.test.ts"

echo ""
echo "=== alembic env.py to check head config ==="
grep -n "head\|run_migrations\|include_schemas" server/alembic/env.py | head -20

Length of output: 18113

@thewrz All four conflict resolutions and the migration re-parent look good. Details below.

✅ `server/app/api/sse.py`

Clean merge. The pool-safety and dual-code resolver changes coexist correctly:

with SessionLocal() as db: is entered and exited before EventSourceResponse is returned — the pooled connection is released before the long-lived stream begins. ✅
event_code = event.code is extracted as a plain string inside the block, so no ORM object escapes the session (no risk of DetachedInstanceError). ✅
get_event_by_public_code_with_status covers all three result variants (NOT_FOUND → 404, ARCHIVED/EXPIRED → 410). ✅
The docstring in _event_generator documenting the no-session-held contract is a nice guard for future contributors.

✅ `dashboard/app/(dj)/account/page.tsx`

Both cards are present and ordered correctly: Change Password → Change Email → AiProvidersSection → Guest Experience. No bleed between the two branches.

✅ Test mocks (`account/page.test.tsx` + `lib/tests/api.test.ts`)

mockApi correctly stubs listLlmConnectors (→ []) and getLlmPolicy (→ rejects), giving the AI providers section a fail-closed render in all account-page tests. The new "renders the relocated AI / Model providers section" test confirms the heading survives. ✅
api.test.ts — streamConnectorTest covers happy-path SSE, non-OK HTTP error, and the event: error frame case (#379 — verified it rejects the promise and still delivers leading chunks). frictionless join api block covers getJoinConfig, both ensureGuestName variants, and updateMyPreferences PATCH. ✅

✅ `server/alembic/versions/a11334c031bb_…` re-parent

down_revision = "051" — correctly placed at the tip of the AI engine chain (045 → 046 → … → 051 → a11334c031bb). The two-head fork is collapsed. ✅

⚠️ Pre-existing two-head risk: `5bb46508476b_add_now_playing_and_play_history_tables.py`

This file (likely brought in from main via the merge commit) has:

down_revision: str | None = "003"

That branches off revision 003 — far back in the chain — and is never re-joined, creating a second Alembic head alongside the a11334c031bb tip. Running alembic upgrade head will raise an error; you'd need alembic upgrade heads or a merge migration (alembic merge). This isn't introduced by this PR's resolutions, but it's now surfaced in the branch and will block clean deploys if left unaddressed. Worth a quick alembic heads check and a merge migration before landing.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

thewrz · 2026-06-03T05:25:37Z

@coderabbitai Thanks for the thorough pass — the four conflict resolutions and the migration re-parent confirmations are appreciated.

On the ⚠️ 5bb46508476b second-head concern: this is a false positive. That revision is not a head — it's consumed by the very next migration:

$ alembic heads
a11334c031bb (head)        # single head

# 5bb46508476b is re-joined immediately:
003           -> 5bb46508476b   (add now_playing + play_history)
5bb46508476b  -> 004            (004_add_now_playing_visibility.py: down_revision = "5bb46508476b")
004           -> ...            (linear from here)

004_add_now_playing_visibility.py sets down_revision = "5bb46508476b", so the chain is linear 003 → 5bb46508476b → 004 → … → 051 → a11334c031bb. It's a long-standing rev-003-era migration, not introduced by this merge.

Empirically confirmed against Postgres before pushing:

alembic upgrade head   # ran cleanly through 051 -> a11334c031bb
alembic check          # "No new upgrade operations detected" (no model drift)

So alembic upgrade head does not error and there is a single head. No merge migration needed. Resolving as not-an-issue.

coderabbitai · 2026-06-03T05:25:57Z

Tip

For best results, initiate chat on the files or code changes.

@thewrz Thanks for the clear correction and for backing it up with the empirical alembic heads / alembic check output. You're right — I only inspected 5bb46508476b in isolation and missed that 004_add_now_playing_visibility.py immediately consumes it as down_revision, keeping the chain linear. That was an analysis gap on my end. The single-head / no-merge-migration conclusion stands. Marking this resolved.

(ᵕ—ᴗ—)

✏️ Learnings added

Learnt from: thewrz
URL: https://github.com/wrzonance/WrzDJ/pull/354

Timestamp: 2026-06-03T05:25:56.696Z
Learning: In the wrzonance/WrzDJ repo, `server/alembic/versions/5bb46508476b_add_now_playing_and_play_history_tables.py` has `down_revision = "003"` but is NOT a second Alembic head. It is immediately consumed by `004_add_now_playing_visibility.py` which sets `down_revision = "5bb46508476b"`, making the full chain linear: `003 → 5bb46508476b → 004 → … → 051 → a11334c031bb` (single head). Do not flag `5bb46508476b` as a dangling or second-head migration in future reviews.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

🧠 Learnings used

Learnt from: thewrz
Repo: wrzonance/WrzDJ PR: 377
File: server/app/services/llm/gateway.py:99-101
Timestamp: 2026-05-29T05:53:37.669Z
Learning: In `server/app/services/llm/gateway.py`, the monthly-token-cap pre-flight check in `_enforce_monthly_cap` is deliberately a read-then-dispatch pattern (not atomic) by design. The cap is a *soft monthly budget guardrail* (not a hard per-request billing limit): token counts (`tokens_in`/`tokens_out`) are only known post-hoc after the provider responds and are logged to `llm_call_log`, so there is no quantity to atomically reserve before dispatch. Concurrency exposure is negligible because enforcement is per-DJ-connector and the only caller is the recommendation engine (one LLM call per DJ-initiated request, not a fan-out). The worst case is a small one-call overshoot that self-corrects on the next call. The pre-flight-only pattern also satisfies the issue `#339` acceptance criterion: "editing caps must not disrupt in-flight calls." Do not flag this as a TOCTOU issue in future reviews — it is a tracked and intentional design tradeoff.

Learnt from: thewrz
Repo: wrzonance/WrzDJ PR: 354
File: server/app/services/recommendation/llm_client.py:289-307
Timestamp: 2026-05-25T23:00:13.086Z
Learning: In `server/app/services/recommendation/llm_client.py`, when `db` is present the `call_llm` function intentionally does NOT fall back to the legacy env-var Anthropic key if `Gateway.dispatch` raises `NoLlmConfigured`. The `is_llm_available(db, actor)` check upstream already gates the path and returns an HTTP 503 when no connector is configured, prompting the DJ/admin to set one up. This is a deliberate design decision: per-DJ/opt-in LLM keys are a core goal; silently reusing a shared org Anthropic env-var key would contradict it. The `_legacy_call` path is reserved exclusively for `db=None` callers (e.g., unit tests that mock `AsyncAnthropic` directly).

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

dashboard/app/(dj)/account/page.tsx (1)
46-54: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Prevent the initial getMe() response from clobbering an edited preference.

If the user flips the checkbox before this request settles, the late setFrictionlessDefault(user.frictionless_join_default) can revert the UI back to the stale server value. Track whether the toggle has been edited locally, or wait for the initial preference load before enabling the control.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dashboard/app/`(dj)/account/page.tsx around lines 46 - 54, The initial
api.getMe() handler can overwrite a user toggle change made locally; add a local
flag (e.g., hasEditedFrictionless or frictionlessInitialized) and update
useEffect to only call setFrictionlessDefault(user.frictionless_join_default)
when the control has not been edited yet (or when not initialized), and update
that flag in the onChange handler for the toggle to prevent later API responses
from clobbering the local change; locate the logic around useEffect, api.getMe,
setFrictionlessDefault and the toggle's onChange to implement this guard.
dashboard/app/(dj)/account/__tests__/page.test.tsx (1)
171-179: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Keep this getMe fixture aligned with the current response shape.

This override drops frictionless_join_default, so this test is now exercising a payload shape AccountPage no longer gets from api.getMe(). Add the field here too so the pending-email path still covers the real contract.
💡 Suggested fix
     mockGetMe.mockResolvedValue({
       id: 1,
       username: 'testuser',
       role: 'dj',
       help_pages_seen: [],
       pending_email: 'pending@example.com',
       email: null,
+      frictionless_join_default: false,
     });
Based on learnings: When adding fields to shared TypeScript interfaces (e.g., PublicRequestInfo), update test fixtures in test files that construct those types.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dashboard/app/`(dj)/account/__tests__/page.test.tsx around lines 171 - 179,
The test overrides mockGetMe to return a payload missing the new field; update
the fixture returned by mockGetMe in the test ('shows pending email from getMe
on load') to include frictionless_join_default (matching the shape returned by
api.getMe/PublicRequestInfo) so AccountPage receives the real contract; locate
the mockGetMe.mockResolvedValue call and add a frictionless_join_default
property with an appropriate value alongside pending_email and other fields.
server/app/services/llm/adapters/gemini_apikey.py (1)
135-146: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Fix Gemini tool response mapping (functionResponse.name must be the function name, not the call id)

Gemini requires functionResponse.name to match the declared function name (FunctionDeclaration/FunctionCall name), and uses a separate functionResponse.id for the call id. The current adapter sets "name": m.tool_call_id, but the canonical Message only provides an opaque tool_call_id (no tool/function name), so follow-up turns are likely to break when id ≠ function name. Adjust translation to use the tool/function name for name and keep the opaque id in functionResponse.id (using ToolCall.name/ToolCall.id in the canonical model). [critical mismatch with Gemini contract]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/app/services/llm/adapters/gemini_apikey.py` around lines 135 - 146,
The adapter currently maps tool messages by setting functionResponse.name to the
opaque tool_call_id (see m.role == "tool" branch and m.tool_call_id), but Gemini
expects functionResponse.name to be the declared function name and
functionResponse.id to hold the call id; update the translation to use the
canonical ToolCall.name for functionResponse.name and place m.tool_call_id into
functionResponse.id (or use ToolCall.id if present) so the output uses function
name for name and the opaque call id for id (adjust any references in the m ->
ToolCall mapping logic accordingly).

♻️ Duplicate comments (1)

server/app/services/llm/tool_translation.py (1)

247-259: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate tool_use.input before coercing it.

dict(input_obj or {}) still lets malformed Anthropic payloads slip through: [] becomes {}, while primitives can leak a raw TypeError. That breaks the typed error contract and can silently run a tool with missing arguments.

💡 Proposed fix

         elif btype == "tool_use":
             if isinstance(block, dict):
                 name = block.get("name")
                 tool_id = block.get("id") or name
                 input_obj = block.get("input")
             else:
                 name = getattr(block, "name", None)
                 tool_id = getattr(block, "id", None) or name
                 input_obj = getattr(block, "input", None)
             if not name or not tool_id:
                 raise ToolTranslationError("Anthropic tool_use block missing id/name")
+            if input_obj is None:
+                input_dict: dict = {}
+            elif isinstance(input_obj, dict):
+                input_dict = input_obj
+            else:
+                raise ToolTranslationError("Anthropic tool_use input must be an object")
             tool_calls.append(
-                ToolCall(id=str(tool_id), name=str(name), input=dict(input_obj or {}))
+                ToolCall(id=str(tool_id), name=str(name), input=input_dict)
             )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/app/services/llm/tool_translation.py` around lines 247 - 259, The code
currently coerces tool inputs with dict(input_obj or {}), which lets
lists/primitives slip through; update the block handling around input_obj (used
to construct ToolCall) to explicitly validate that input_obj is either None or a
mapping before coercion: if input_obj is None treat as {}, if
isinstance(input_obj, dict) use dict(input_obj), otherwise raise
ToolTranslationError with a clear message (mention the offending tool id/name)
so malformed Anthropic payloads (e.g. lists/primitives) fail fast instead of
producing TypeError or silently becoming {} when constructing
ToolCall(id=str(tool_id), name=str(name), input=...).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dashboard/lib/api.ts`:
- Around line 1272-1297: The SSE parser in streamConnectorTest currently only
handles LF-only framing (uses buffer.indexOf('\n\n') and frame.split('\n')), so
CRLF-delimited frames (`\r\n` / `\r\n\r\n`) can be missed; fix by normalizing
line endings and boundary detection: when reading into buffer (and before
splitting frames) replace CRLF with LF (e.g., buffer = buffer.replace(/\r\n/g,
'\n')) or detect both separators (check for '\r\n\r\n' and '\n\n'), and split
frame lines with a CRLF-tolerant pattern (e.g., frame.split(/\r?\n/)); update
references in streamConnectorTest where buffer.indexOf, frame.split, eventType,
and dataLines are used so `event: error` frames are correctly parsed.

In `@server/alembic/versions/051_llm_connector_monthly_token_cap.py`:
- Around line 26-34: The migration allows negative monthly_token_cap values; add
a DB-level check constraint in upgrade() to enforce monthly_token_cap >= 0
(e.g., create a constraint named like
ck_llm_connectors_monthly_token_cap_nonnegative via op.create_check_constraint
on table "llm_connectors"), and in downgrade() drop that constraint
(op.drop_constraint) before dropping the monthly_token_cap column so the DB
won't accept negative caps and the rollback removes the constraint cleanly.

In `@server/app/services/llm/adapters/_httpx_openai.py`:
- Around line 206-223: The loop currently JSON-decodes each "data:" line
individually which breaks SSE events that span multiple data lines; change the
logic in the resp.aiter_lines iteration (the block around the async for line in
resp.aiter_lines()) to accumulate successive "data:" lines into a single event
buffer (concatenate their payloads preserving newlines) until you hit the SSE
event delimiter (a blank line) or a non-"data:" line, then handle the special
"[DONE]" sentinel and only call json.loads + parse_openai_stream_event on the
fully assembled event string; on JSON decode errors raise ToolTranslationError
as before but only for complete events so partial fragments don't prematurely
raise.

In `@server/app/services/llm/adapters/anthropic_apikey.py`:
- Around line 183-240: _translate_anthropic_event currently only uses
getattr(...) so plain-dict events drop through; update its attribute accesses to
fall back to dict-key lookup when getattr returns None (e.g., read etype =
getattr(event, "type", None) or event.get("type") if event is a dict), do the
same for content_block/block fields (block = getattr(event, "content_block",
None) or block = block.get(...) ), delta, usage, index (accept
event.get("index") too) and the inner fields like delta["text"] /
delta.get("partial_json") as fallbacks; preserve existing behavior for SDK
objects and keep conversions (int(...) for index/output_tokens) and the same
ChatResponseChunk and ToolCallDelta construction.

In `@server/app/services/llm/streaming.py`:
- Around line 70-85: The parser currently returns None for events lacking
text/tool deltas even when they are terminal usage-only events; move or compute
usage_payload before the early-return and change the guard so it only returns
None when there is no text_delta, no tool_call_deltas, not done, and no
usage_payload. Specifically, keep the existing logic that reads finish_reason
and sets done/stop_reason via _FINISH_REASON_OPENAI, but ensure usage_payload =
payload.get("usage") or {} is evaluated before the if-not-text/tool/done check
and include a check for usage_payload (and populate usage via TokenUsage and
_as_int as shown) so usage-only terminal events are not discarded.

---

Outside diff comments:
In `@dashboard/app/`(dj)/account/__tests__/page.test.tsx:
- Around line 171-179: The test overrides mockGetMe to return a payload missing
the new field; update the fixture returned by mockGetMe in the test ('shows
pending email from getMe on load') to include frictionless_join_default
(matching the shape returned by api.getMe/PublicRequestInfo) so AccountPage
receives the real contract; locate the mockGetMe.mockResolvedValue call and add
a frictionless_join_default property with an appropriate value alongside
pending_email and other fields.

In `@dashboard/app/`(dj)/account/page.tsx:
- Around line 46-54: The initial api.getMe() handler can overwrite a user toggle
change made locally; add a local flag (e.g., hasEditedFrictionless or
frictionlessInitialized) and update useEffect to only call
setFrictionlessDefault(user.frictionless_join_default) when the control has not
been edited yet (or when not initialized), and update that flag in the onChange
handler for the toggle to prevent later API responses from clobbering the local
change; locate the logic around useEffect, api.getMe, setFrictionlessDefault and
the toggle's onChange to implement this guard.

In `@server/app/services/llm/adapters/gemini_apikey.py`:
- Around line 135-146: The adapter currently maps tool messages by setting
functionResponse.name to the opaque tool_call_id (see m.role == "tool" branch
and m.tool_call_id), but Gemini expects functionResponse.name to be the declared
function name and functionResponse.id to hold the call id; update the
translation to use the canonical ToolCall.name for functionResponse.name and
place m.tool_call_id into functionResponse.id (or use ToolCall.id if present) so
the output uses function name for name and the opaque call id for id (adjust any
references in the m -> ToolCall mapping logic accordingly).

---

Duplicate comments:
In `@server/app/services/llm/tool_translation.py`:
- Around line 247-259: The code currently coerces tool inputs with
dict(input_obj or {}), which lets lists/primitives slip through; update the
block handling around input_obj (used to construct ToolCall) to explicitly
validate that input_obj is either None or a mapping before coercion: if
input_obj is None treat as {}, if isinstance(input_obj, dict) use
dict(input_obj), otherwise raise ToolTranslationError with a clear message
(mention the offending tool id/name) so malformed Anthropic payloads (e.g.
lists/primitives) fail fast instead of producing TypeError or silently becoming
{} when constructing ToolCall(id=str(tool_id), name=str(name), input=...).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: e1e3f4b7-6317-4f69-8688-fc95c528980c

📥 Commits

Reviewing files that changed from the base of the PR and between fd7e650 and 5a8b858.

📒 Files selected for processing (91)

.env.example
.github/workflows/ci.yml
CLAUDE.md
dashboard/app/(dj)/account/__tests__/page.test.tsx
dashboard/app/(dj)/account/page.tsx
dashboard/app/admin/ai/__tests__/page.test.tsx
dashboard/app/admin/ai/page.tsx
dashboard/components/AiProvidersSection.tsx
dashboard/components/__tests__/AiProvidersSection.featurePrefs.test.tsx
dashboard/components/__tests__/AiProvidersSection.test.tsx
dashboard/lib/__tests__/api.test.ts
dashboard/lib/api-types.generated.ts
dashboard/lib/api-types.ts
dashboard/lib/api.ts
dashboard/next.config.js
docs/LLM-PLUGIN.md
docs/examples/echo_adapter.py
docs/superpowers/plans/2026-05-26-move-dj-ai-settings-to-account.md
docs/superpowers/plans/2026-05-26-remove-deprecated-anthropic-env-reads.md
docs/superpowers/plans/2026-05-26-sse-stream-pooled-db-connection-leak.md
docs/superpowers/plans/2026-05-28-llm-cost-quota-caps.md
docs/superpowers/plans/2026-05-28-llm-gateway-streaming.md
docs/superpowers/plans/2026-05-28-per-feature-connector-preference.md
server/alembic/versions/047_llm_call_log_retention.py
server/alembic/versions/048_llm_connector_is_default.py
server/alembic/versions/049_llm_connector_health_check_columns.py
server/alembic/versions/050_llm_feature_preference.py
server/alembic/versions/051_llm_connector_monthly_token_cap.py
server/alembic/versions/a11334c031bb_add_frictionless_join_flags.py
server/app/api/admin_llm.py
server/app/api/events.py
server/app/api/llm.py
server/app/api/sse.py
server/app/core/config.py
server/app/core/csv_safe.py
server/app/main.py
server/app/models/__init__.py
server/app/models/llm_connector.py
server/app/models/llm_feature_preference.py
server/app/models/system_settings.py
server/app/schemas/llm.py
server/app/services/email_sender.py
server/app/services/llm/__init__.py
server/app/services/llm/adapters/_httpx_openai.py
server/app/services/llm/adapters/_shared.py
server/app/services/llm/adapters/anthropic_apikey.py
server/app/services/llm/adapters/azure_openai.py
server/app/services/llm/adapters/bedrock.py
server/app/services/llm/adapters/gemini_apikey.py
server/app/services/llm/adapters/openai_apikey.py
server/app/services/llm/adapters/openai_compatible.py
server/app/services/llm/adapters/openrouter_apikey.py
server/app/services/llm/adapters/xai_apikey.py
server/app/services/llm/base.py
server/app/services/llm/connector_storage.py
server/app/services/llm/exceptions.py
server/app/services/llm/gateway.py
server/app/services/llm/health_check.py
server/app/services/llm/health_monitor.py
server/app/services/llm/plugin_loader.py
server/app/services/llm/registry.py
server/app/services/llm/streaming.py
server/app/services/llm/tool_translation.py
server/app/services/recommendation/llm_client.py
server/app/services/recommendation/llm_hooks.py
server/app/services/system_settings.py
server/openapi.json
server/tests/conftest.py
server/tests/test_export.py
server/tests/test_llm_adapter_contract.py
server/tests/test_llm_adapters.py
server/tests/test_llm_api.py
server/tests/test_llm_bedrock_adapter.py
server/tests/test_llm_call_log_retention.py
server/tests/test_llm_client.py
server/tests/test_llm_default_connector.py
server/tests/test_llm_feature_preference.py
server/tests/test_llm_gateway.py
server/tests/test_llm_gateway_stream.py
server/tests/test_llm_health_check.py
server/tests/test_llm_health_monitor.py
server/tests/test_llm_hooks.py
server/tests/test_llm_plugin_loader.py
server/tests/test_llm_quota_cap.py
server/tests/test_llm_recommendation_via_gateway.py
server/tests/test_llm_stream_endpoint.py
server/tests/test_llm_streaming.py
server/tests/test_llm_tool_translation.py
server/tests/test_llm_url_validator.py
server/tests/test_sse_pool.py
server/tests/test_verify_status_endpoint.py

✅ Files skipped from review due to trivial changes (3)

server/alembic/versions/049_llm_connector_health_check_columns.py
docs/superpowers/plans/2026-05-26-remove-deprecated-anthropic-env-reads.md
dashboard/lib/api-types.generated.ts

🚧 Files skipped from review as they are similar to previous changes (12)

server/app/services/llm/exceptions.py
server/app/services/llm/init.py
server/app/models/init.py
server/app/services/system_settings.py
server/app/models/system_settings.py
server/app/services/llm/registry.py
CLAUDE.md
server/app/services/llm/adapters/azure_openai.py
server/app/services/llm/adapters/xai_apikey.py
server/app/core/csv_safe.py
server/app/api/events.py
dashboard/app/admin/ai/page.tsx

coderabbitai · 2026-06-03T05:26:48Z

+    const reader = response.body.getReader();
+    const decoder = new TextDecoder();
+    let buffer = '';
+    try {
+      for (;;) {
+        const { done, value } = await reader.read();
+        if (done) break;
+        buffer += decoder.decode(value, { stream: true });
+        // SSE frames are separated by a blank line.
+        let sep: number;
+        while ((sep = buffer.indexOf('\n\n')) !== -1) {
+          const frame = buffer.slice(0, sep);
+          buffer = buffer.slice(sep + 2);
+          // A frame may carry an `event:` name plus one or more `data:` lines.
+          // The backend emits `event: error` for typed gateway failures, so we
+          // must inspect the event type — not just blindly parse `data:`.
+          let eventType = 'message';
+          const dataLines: string[] = [];
+          for (const line of frame.split('\n')) {
+            if (line.startsWith('event:')) {
+              eventType = line.slice('event:'.length).trim();
+            } else if (line.startsWith('data:')) {
+              dataLines.push(line.slice('data:'.length).trim());
+            }
+          }
+          const data = dataLines.join('\n').trim();


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail # Locate the relevant function/snippet rg -n "streamConnectorTest|SSE|event:|data:" dashboard/lib/api.ts # Print the exact region around the referenced lines sed -n '1220,1360p' dashboard/lib/api.ts | nl -ba # Also check for any other SSE parsing helpers/usages in the file rg -n "getReader\\(\\)|TextDecoder\\(|indexOf\\(\"\\\\n\\\\n\"\\)|\\\\r\\\\n" dashboard/lib/api.ts

Repository: wrzonance/WrzDJ

Length of output: 1960

🏁 Script executed:

#!/bin/bash set -euo pipefail # Show the exact region around streamConnectorTest sed -n '1210,1335p' dashboard/lib/api.ts | cat -n echo "----" # Search for CRLF-related code in this file rg -n "\\\\r\\\\n|\\\\r\\\\r|\\r\\n|\\r" dashboard/lib/api.ts || true

Repository: wrzonance/WrzDJ

Length of output: 5685

🏁 Script executed:

#!/bin/bash set -euo pipefail rg -n "indexOf\\('\\\\n\\\\n'\\)|split\\(/\\\\r\\?\\\\n/\\)|split\\('\\\\n'\\)" dashboard/lib/api.ts dashboard -S || true

Repository: wrzonance/WrzDJ

Length of output: 375

Handle CRLF-delimited SSE frames in streamConnectorTest()

The parser only detects frame boundaries via buffer.indexOf('\n\n') and splits lines with frame.split('\n'). For standard SSE \r\n\r\n / \r\n framing, the boundary match can fail, leaving frames unparsed (including event: error).

💡 Suggested fix

const reader = response.body.getReader(); const decoder = new TextDecoder(); let buffer = ''; + const frameBoundary = /\r?\n\r?\n/; try { for (;;) { const { done, value } = await reader.read(); if (done) break; buffer += decoder.decode(value, { stream: true }); // SSE frames are separated by a blank line. let sep: number; - while ((sep = buffer.indexOf('\n\n')) !== -1) { + while ((sep = buffer.search(frameBoundary)) !== -1) { const frame = buffer.slice(0, sep); - buffer = buffer.slice(sep + 2); + const boundary = buffer.slice(sep).match(frameBoundary)?.[0] ?? '\n\n'; + buffer = buffer.slice(sep + boundary.length); // A frame may carry an `event:` name plus one or more `data:` lines. // The backend emits `event: error` for typed gateway failures, so we // must inspect the event type — not just blindly parse `data:`. let eventType = 'message'; const dataLines: string[] = []; - for (const line of frame.split('\n')) { + for (const line of frame.split(/\r?\n/)) { if (line.startsWith('event:')) { eventType = line.slice('event:'.length).trim(); } else if (line.startsWith('data:')) { dataLines.push(line.slice('data:'.length).trim()); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const reader = response.body.getReader();

const decoder = new TextDecoder();

let buffer = '';

try {

for (;;) {

const { done, value } = await reader.read();

if (done) break;

buffer += decoder.decode(value, { stream: true });

// SSE frames are separated by a blank line.

let sep: number;

while ((sep = buffer.indexOf('\n\n')) !== -1) {

const frame = buffer.slice(0, sep);

buffer = buffer.slice(sep + 2);

// A frame may carry an `event:` name plus one or more `data:` lines.

// The backend emits `event: error` for typed gateway failures, so we

// must inspect the event type — not just blindly parse `data:`.

let eventType = 'message';

const dataLines: string[] = [];

for (const line of frame.split('\n')) {

if (line.startsWith('event:')) {

eventType = line.slice('event:'.length).trim();

} else if (line.startsWith('data:')) {

dataLines.push(line.slice('data:'.length).trim());

}

}

const data = dataLines.join('\n').trim();

const reader = response.body.getReader();

const decoder = new TextDecoder();

let buffer = '';

const frameBoundary = /\r?\n\r?\n/;

try {

for (;;) {

const { done, value } = await reader.read();

if (done) break;

buffer += decoder.decode(value, { stream: true });

// SSE frames are separated by a blank line.

let sep: number;

while ((sep = buffer.search(frameBoundary)) !== -1) {

const frame = buffer.slice(0, sep);

const boundary = buffer.slice(sep).match(frameBoundary)?.[0] ?? '\n\n';

buffer = buffer.slice(sep + boundary.length);

// A frame may carry an `event:` name plus one or more `data:` lines.

// The backend emits `event: error` for typed gateway failures, so we

// must inspect the event type — not just blindly parse `data:`.

let eventType = 'message';

const dataLines: string[] = [];

for (const line of frame.split(/\r?\n/)) {

if (line.startsWith('event:')) {

eventType = line.slice('event:'.length).trim();

} else if (line.startsWith('data:')) {

dataLines.push(line.slice('data:'.length).trim());

}

}

const data = dataLines.join('\n').trim();

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dashboard/lib/api.ts` around lines 1272 - 1297, The SSE parser in streamConnectorTest currently only handles LF-only framing (uses buffer.indexOf('\n\n') and frame.split('\n')), so CRLF-delimited frames (`\r\n` / `\r\n\r\n`) can be missed; fix by normalizing line endings and boundary detection: when reading into buffer (and before splitting frames) replace CRLF with LF (e.g., buffer = buffer.replace(/\r\n/g, '\n')) or detect both separators (check for '\r\n\r\n' and '\n\n'), and split frame lines with a CRLF-tolerant pattern (e.g., frame.split(/\r?\n/)); update references in streamConnectorTest where buffer.indexOf, frame.split, eventType, and dataLines are used so `event: error` frames are correctly parsed.

coderabbitai · 2026-06-03T05:26:48Z

+def upgrade() -> None:
+    op.add_column(
+        "llm_connectors",
+        sa.Column("monthly_token_cap", sa.Integer(), nullable=True),
+    )
+
+
+def downgrade() -> None:
+    op.drop_column("llm_connectors", "monthly_token_cap")


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Enforce non-negative caps in the migration.

monthly_token_cap currently accepts negative integers. A bad write here will make the connector effectively "always over budget" once the gateway compares current usage against the stored cap. Add a DB check so invalid values cannot persist.

💡 Proposed fix

def upgrade() -> None: op.add_column( "llm_connectors", sa.Column("monthly_token_cap", sa.Integer(), nullable=True), ) + op.create_check_constraint( + "ck_llm_connectors_monthly_token_cap_nonnegative", + "llm_connectors", + "monthly_token_cap IS NULL OR monthly_token_cap >= 0", + ) def downgrade() -> None: + op.drop_constraint( + "ck_llm_connectors_monthly_token_cap_nonnegative", + "llm_connectors", + type_="check", + ) op.drop_column("llm_connectors", "monthly_token_cap")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/alembic/versions/051_llm_connector_monthly_token_cap.py` around lines 26 - 34, The migration allows negative monthly_token_cap values; add a DB-level check constraint in upgrade() to enforce monthly_token_cap >= 0 (e.g., create a constraint named like ck_llm_connectors_monthly_token_cap_nonnegative via op.create_check_constraint on table "llm_connectors"), and in downgrade() drop that constraint (op.drop_constraint) before dropping the monthly_token_cap column so the DB won't accept negative caps and the rollback removes the constraint cleanly.

coderabbitai · 2026-06-03T05:26:48Z

+                async for line in resp.aiter_lines():
+                    if not line or not line.startswith("data:"):
+                        continue
+                    data = line[len("data:") :].strip()
+                    if data == "[DONE]":
+                        break
+                    try:
+                        obj = json.loads(data)
+                    except json.JSONDecodeError as exc:
+                        # SSE comment / keepalive frames start with ":" (or are
+                        # blank) and never reach here — they fail the "data:"
+                        # prefix check above. A "data:" line that isn't "[DONE]"
+                        # yet won't parse is a genuine protocol fault; surface it
+                        # rather than silently truncating the stream.
+                        raise ToolTranslationError("Upstream returned malformed SSE JSON") from exc
+                    chunk = parse_openai_stream_event(obj)
+                    if chunk is not None:
+                        yield chunk


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Parse SSE events instead of individual data: lines.

This loop JSON-decodes every data: line separately. SSE frames are event-delimited, and a compliant OpenAI-compatible server may split one event across multiple data: lines. In that case this will raise ToolTranslationError on a partial fragment and truncate an otherwise valid stream.

💡 Proposed fix

- async for line in resp.aiter_lines(): - if not line or not line.startswith("data:"): - continue - data = line[len("data:") :].strip() - if data == "[DONE]": - break - try: - obj = json.loads(data) - except json.JSONDecodeError as exc: - # SSE comment / keepalive frames start with ":" (or are - # blank) and never reach here — they fail the "data:" - # prefix check above. A "data:" line that isn't "[DONE]" - # yet won't parse is a genuine protocol fault; surface it - # rather than silently truncating the stream. - raise ToolTranslationError("Upstream returned malformed SSE JSON") from exc - chunk = parse_openai_stream_event(obj) - if chunk is not None: - yield chunk + event_data: list[str] = [] + async for line in resp.aiter_lines(): + if line.startswith(":"): + continue + if line.startswith("data:"): + event_data.append(line[len("data:") :].lstrip()) + continue + if line != "": + continue + + if not event_data: + continue + data = "\n".join(event_data).strip() + event_data.clear() + if data == "[DONE]": + break + try: + obj = json.loads(data) + except json.JSONDecodeError as exc: + raise ToolTranslationError("Upstream returned malformed SSE JSON") from exc + chunk = parse_openai_stream_event(obj) + if chunk is not None: + yield chunk

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async for line in resp.aiter_lines():

if not line or not line.startswith("data:"):

continue

data = line[len("data:") :].strip()

if data == "[DONE]":

break

try:

obj = json.loads(data)

except json.JSONDecodeError as exc:

# SSE comment / keepalive frames start with ":" (or are

# blank) and never reach here — they fail the "data:"

# prefix check above. A "data:" line that isn't "[DONE]"

# yet won't parse is a genuine protocol fault; surface it

# rather than silently truncating the stream.

raise ToolTranslationError("Upstream returned malformed SSE JSON") from exc

chunk = parse_openai_stream_event(obj)

if chunk is not None:

yield chunk

event_data: list[str] = []

async for line in resp.aiter_lines():

if line.startswith(":"):

continue

if line.startswith("data:"):

event_data.append(line[len("data:") :].lstrip())

continue

if line != "":

continue

if not event_data:

continue

data = "\n".join(event_data).strip()

event_data.clear()

if data == "[DONE]":

break

try:

obj = json.loads(data)

except json.JSONDecodeError as exc:

raise ToolTranslationError("Upstream returned malformed SSE JSON") from exc

chunk = parse_openai_stream_event(obj)

if chunk is not None:

yield chunk

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/app/services/llm/adapters/_httpx_openai.py` around lines 206 - 223, The loop currently JSON-decodes each "data:" line individually which breaks SSE events that span multiple data lines; change the logic in the resp.aiter_lines iteration (the block around the async for line in resp.aiter_lines()) to accumulate successive "data:" lines into a single event buffer (concatenate their payloads preserving newlines) until you hit the SSE event delimiter (a blank line) or a non-"data:" line, then handle the special "[DONE]" sentinel and only call json.loads + parse_openai_stream_event on the fully assembled event string; on JSON decode errors raise ToolTranslationError as before but only for complete events so partial fragments don't prematurely raise.

coderabbitai · 2026-06-03T05:26:48Z

+def _translate_anthropic_event(
+    event: Any,
+) -> tuple[ChatResponseChunk | None, bool, str | None, int | None]:
+    """Translate one Anthropic SDK stream event into stream state.
+
+    Returns ``(chunk_or_None, saw_tool_use, stop_reason_or_None,
+    output_tokens_or_None)``. Mirrors the dual dict/object access style of
+    ``tool_translation.parse_anthropic_response`` so it tolerates either the
+    typed SDK events or plain dicts (used in tests).
+    """
+    etype = getattr(event, "type", None)
+
+    if etype == "content_block_start":
+        block = getattr(event, "content_block", None)
+        if getattr(block, "type", None) == "tool_use":
+            idx = int(getattr(event, "index", 0))
+            chunk = ChatResponseChunk(
+                tool_call_deltas=[
+                    ToolCallDelta(
+                        index=idx,
+                        id=getattr(block, "id", None),
+                        name=getattr(block, "name", None),
+                    )
+                ]
+            )
+            return chunk, True, None, None
+        return None, False, None, None
+
+    if etype == "content_block_delta":
+        delta = getattr(event, "delta", None)
+        dtype = getattr(delta, "type", None)
+        if dtype == "text_delta":
+            return ChatResponseChunk(text_delta=getattr(delta, "text", "") or ""), False, None, None
+        if dtype == "input_json_delta":
+            idx = int(getattr(event, "index", 0))
+            chunk = ChatResponseChunk(
+                tool_call_deltas=[
+                    ToolCallDelta(
+                        index=idx,
+                        input_json_fragment=getattr(delta, "partial_json", "") or "",
+                    )
+                ]
+            )
+            return chunk, False, None, None
+        return None, False, None, None
+
+    if etype == "message_delta":
+        delta = getattr(event, "delta", None)
+        stop_reason = getattr(delta, "stop_reason", None)
+        usage = getattr(event, "usage", None)
+        output_tokens = None
+        if usage is not None:
+            ot = getattr(usage, "output_tokens", None)
+            if ot is not None:
+                output_tokens = int(ot)
+        return None, False, stop_reason, output_tokens
+
+    return None, False, None, None


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Handle dict-backed stream events here too.

_translate_anthropic_event() says it accepts SDK objects or plain dicts, but this branch only uses getattr(...). For dict fixtures, etype, delta, content_block, and usage all resolve to None, so text/tool deltas and stop metadata get dropped silently.

💡 Minimal fix

+def _field(obj: Any, name: str) -> Any: + return obj.get(name) if isinstance(obj, dict) else getattr(obj, name, None) + def _translate_anthropic_event( event: Any, ) -> tuple[ChatResponseChunk | None, bool, str | None, int | None]: @@ - etype = getattr(event, "type", None) + etype = _field(event, "type") @@ - block = getattr(event, "content_block", None) - if getattr(block, "type", None) == "tool_use": - idx = int(getattr(event, "index", 0)) + block = _field(event, "content_block") + if _field(block, "type") == "tool_use": + idx = int(_field(event, "index") or 0) chunk = ChatResponseChunk( tool_call_deltas=[ ToolCallDelta( index=idx, - id=getattr(block, "id", None), - name=getattr(block, "name", None), + id=_field(block, "id"), + name=_field(block, "name"), ) ] ) @@ - delta = getattr(event, "delta", None) - dtype = getattr(delta, "type", None) + delta = _field(event, "delta") + dtype = _field(delta, "type") if dtype == "text_delta": - return ChatResponseChunk(text_delta=getattr(delta, "text", "") or ""), False, None, None + return ChatResponseChunk(text_delta=_field(delta, "text") or ""), False, None, None if dtype == "input_json_delta": - idx = int(getattr(event, "index", 0)) + idx = int(_field(event, "index") or 0) chunk = ChatResponseChunk( tool_call_deltas=[ ToolCallDelta( index=idx, - input_json_fragment=getattr(delta, "partial_json", "") or "", + input_json_fragment=_field(delta, "partial_json") or "", ) ] ) @@ - delta = getattr(event, "delta", None) - stop_reason = getattr(delta, "stop_reason", None) - usage = getattr(event, "usage", None) + delta = _field(event, "delta") + stop_reason = _field(delta, "stop_reason") + usage = _field(event, "usage") output_tokens = None if usage is not None: - ot = getattr(usage, "output_tokens", None) + ot = _field(usage, "output_tokens")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/app/services/llm/adapters/anthropic_apikey.py` around lines 183 - 240, _translate_anthropic_event currently only uses getattr(...) so plain-dict events drop through; update its attribute accesses to fall back to dict-key lookup when getattr returns None (e.g., read etype = getattr(event, "type", None) or event.get("type") if event is a dict), do the same for content_block/block fields (block = getattr(event, "content_block", None) or block = block.get(...) ), delta, usage, index (accept event.get("index") too) and the inner fields like delta["text"] / delta.get("partial_json") as fallbacks; preserve existing behavior for SDK objects and keep conversions (int(...) for index/output_tokens) and the same ChatResponseChunk and ToolCallDelta construction.

coderabbitai · 2026-06-03T05:26:48Z

+    finish_reason = choice.get("finish_reason")
+    done = finish_reason is not None
+
+    stop_reason: CanonicalStopReason | None = None
+    usage: TokenUsage | None = None
+    if done:
+        stop_reason = _FINISH_REASON_OPENAI.get(finish_reason, "error")
+        usage_payload = payload.get("usage") or {}
+        if usage_payload:
+            usage = TokenUsage(
+                prompt=_as_int(usage_payload.get("prompt_tokens", 0)),
+                completion=_as_int(usage_payload.get("completion_tokens", 0)),
+            )
+
+    if not text_delta and not tool_call_deltas and not done:
+        return None


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't discard usage-only terminal events.

When include_usage is enabled, the final OpenAI-style stream payload can carry only usage metadata and no text/tool delta. This parser returns None for that event because done is driven only by finish_reason, so streamed calls lose token accounting.

💡 Proposed fix

- stop_reason: CanonicalStopReason | None = None - usage: TokenUsage | None = None - if done: - stop_reason = _FINISH_REASON_OPENAI.get(finish_reason, "error") - usage_payload = payload.get("usage") or {} - if usage_payload: - usage = TokenUsage( - prompt=_as_int(usage_payload.get("prompt_tokens", 0)), - completion=_as_int(usage_payload.get("completion_tokens", 0)), - ) + stop_reason: CanonicalStopReason | None = None + usage: TokenUsage | None = None + if done: + stop_reason = _FINISH_REASON_OPENAI.get(finish_reason, "error") + usage_payload = payload.get("usage") or {} + if usage_payload: + usage = TokenUsage( + prompt=_as_int(usage_payload.get("prompt_tokens", 0)), + completion=_as_int(usage_payload.get("completion_tokens", 0)), + ) - if not text_delta and not tool_call_deltas and not done: + if not text_delta and not tool_call_deltas and not done and usage is None: return None

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/app/services/llm/streaming.py` around lines 70 - 85, The parser currently returns None for events lacking text/tool deltas even when they are terminal usage-only events; move or compute usage_payload before the early-return and change the guard so it only returns None when there is no text_delta, no tool_call_deltas, not done, and no usage_payload. Specifically, keep the existing logic that reads finish_reason and sets done/stop_reason via _FINISH_REASON_OPENAI, but ensure usage_payload = payload.get("usage") or {} is evaluated before the if-not-text/tool/done check and include a check for usage_payload (and populate usage via TokenUsage and _as_int as shown) so usage-only terminal events are not discarded.

…igrations The PR #354 conflict resolution anchored a11334c031bb (frictionless join, already applied to prod via main) ABOVE 046-051. Prod's alembic_version reads a11334c031bb, so after merge 'alembic upgrade head' would see current == head and silently skip all six AI-engine migrations — no llm tables would ever be created in production. Fix: restore a11334c031bb to down_revision=045 (byte-identical to main's copy) and re-anchor 046 onto a11334c031bb. New chain: 045 -> a11334c031bb -> 046 -> ... -> 051 (head). Validated on PostgreSQL 16: - prod-path simulation: main chain to a11334c031bb, then epic chain applies 046-051 and creates llm_connectors/llm_call_log/ llm_audit_event/llm_feature_preferences - fresh-DB full chain passes, single head (051), alembic check clean

thewrz and others added 7 commits May 25, 2026 08:33

docs(ai): add epic/ai-engine integration-branch strategy to spec

2000abc

Sub-PRs target the long-lived epic/ai-engine branch, not main. Final epic->main merge happens once after end-to-end validation.

coderabbitai Bot reviewed May 25, 2026

View reviewed changes

Comment thread server/app/services/llm/adapters/openai_compatible.py Outdated

Comment thread server/app/services/llm/tool_translation.py

This was referenced May 26, 2026

[AI Engine] DJ-readable LLM connector policy endpoint (fail-closed settings UI) #355

Closed

SSE event stream pins a pooled DB connection for the whole stream lifetime → pool exhaustion under guest load #356

Closed

thewrz added 3 commits May 25, 2026 22:22

coderabbitai Bot reviewed May 26, 2026

View reviewed changes

Comment thread dashboard/components/AiProvidersSection.tsx

Comment thread dashboard/app/admin/ai/__tests__/page.test.tsx Outdated

Comment thread dashboard/app/admin/ai/page.tsx

Comment thread server/app/core/csv_safe.py Outdated

Comment thread server/tests/test_export.py

thewrz and others added 8 commits May 25, 2026 22:39

thewrz mentioned this pull request May 28, 2026

[AI Engine] MVP — provider-agnostic LLM gateway + 3 connectors + recommendation migration #329

Closed

thewrz added 2 commits May 28, 2026 15:14

thewrz and others added 10 commits May 28, 2026 15:26

coderabbitai Bot reviewed Jun 3, 2026

View reviewed changes

thewrz and others added 2 commits June 2, 2026 23:00

Merge branch 'main' into epic/ai-engine

b674ad8

Conversation

thewrz commented May 25, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What lands

Provider notes

Test plan

Summary by CodeRabbit

Release Notes

Verification status (added on review)

Follow-ups (not blocking this PR)

Uh oh!

coderabbitai Bot commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

thewrz commented Jun 3, 2026

Uh oh!

coderabbitai Bot commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ server/app/api/sse.py

✅ dashboard/app/(dj)/account/page.tsx

✅ Test mocks (account/page.test.tsx + lib/__tests__/api.test.ts)

✅ server/alembic/versions/a11334c031bb_… re-parent

⚠️ Pre-existing two-head risk: 5bb46508476b_add_now_playing_and_play_history_tables.py

Uh oh!

thewrz commented Jun 3, 2026

Uh oh!

coderabbitai Bot commented Jun 3, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

thewrz commented May 25, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 25, 2026 •

edited

Loading

coderabbitai Bot commented Jun 3, 2026 •

edited

Loading

✅ `server/app/api/sse.py`

✅ `dashboard/app/(dj)/account/page.tsx`

✅ Test mocks (`account/page.test.tsx` + `lib/tests/api.test.ts`)

✅ `server/alembic/versions/a11334c031bb_…` re-parent

⚠️ Pre-existing two-head risk: `5bb46508476b_add_now_playing_and_play_history_tables.py`