Release/v3.8.8 #2930
Code Review
This pull request implements the Quota Sharing Engine (Group B, plans 16 and 22) to fairly distribute provider quotas across multiple API keys, along with a reorganized dashboard sidebar featuring a new Costs section and an Activity feed. The code reviewer identified several critical and high-severity issues: missing await calls on the asynchronous getQuotaStore function in enforce.ts that would cause runtime TypeErrors, in-place mutation of cached translation module exports in request.ts leading to cross-request state pollution, and unhandled non-2xx HTTP status codes in frontend fetch calls. Additionally, the reviewer pointed out dead database queries in sqliteQuotaStore.ts, a hardcoded empty provider name in the pool usage route that breaks catalog plan resolution, and redundant conditional checks in fairShare.ts.
|
|
||
| // 4. For each active dimension, peek consumption and saturation. | ||
| const store = getQuotaStore(); |
The getQuotaStore function is asynchronous and returns a Promise<QuotaStore>. Calling it without await here means store is a Promise rather than the actual store instance. This will cause store.peek to be undefined and throw a TypeError at runtime, breaking quota enforcement.
| // 4. For each active dimension, peek consumption and saturation. | |
| const store = getQuotaStore(); | |
| // 4. For each active dimension, peek consumption and saturation. | |
| const store = await getQuotaStore(); |
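To illustrate why the missing `await` breaks at runtime, here is a minimal sketch. `QuotaStoreSketch` and `getQuotaStoreSketch` are hypothetical stand-ins; the real `QuotaStore` interface is not shown in this review.

```typescript
// Hypothetical stand-ins; the real QuotaStore and getQuotaStore live in the PR.
interface QuotaStoreSketch {
  peek(key: string): number;
}

async function getQuotaStoreSketch(): Promise<QuotaStoreSketch> {
  return { peek: (key: string) => key.length };
}

// Without await: the value is a Promise, which has no `peek` method.
// The cast only silences the compiler, mirroring a loosely typed call site.
function peekTypeWithoutAwait(): string {
  const store = getQuotaStoreSketch() as unknown as QuotaStoreSketch;
  return typeof store.peek;
}

// With await: the resolved store exposes `peek` as expected.
async function peekTypeWithAwait(): Promise<string> {
  const store = await getQuotaStoreSketch();
  return typeof store.peek;
}
```

Calling `store.peek(...)` on the un-awaited value is what raises the `TypeError` described above.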
|
|
||
| const store = getQuotaStore(); |
Similar to the issue in enforceQuotaShare, getQuotaStore is called without await in recordConsumption. This causes store to be a Promise, making store.consume undefined and failing to record any consumption at runtime.
| const store = getQuotaStore(); | |
| const store = await getQuotaStore(); |
| if (locale !== FALLBACK_LOCALE) { | ||
| const fallbackMessages = (await import(`./messages/${FALLBACK_LOCALE}.json`)).default as Record<string, unknown>; | ||
| messages = deepMergeFallback({ ...localeMessages }, fallbackMessages); | ||
| } |
The shallow copy { ...localeMessages } only copies the top-level properties of the imported translation module. Since deepMergeFallback mutates nested objects in-place, this will directly mutate the cached module exports of the translation files. On subsequent requests, the translations will remain polluted with English fallback keys, leading to cross-request state pollution. Using a deep clone like JSON.parse(JSON.stringify(...)) prevents this.
| if (locale !== FALLBACK_LOCALE) { | |
| const fallbackMessages = (await import(`./messages/${FALLBACK_LOCALE}.json`)).default as Record<string, unknown>; | |
| messages = deepMergeFallback({ ...localeMessages }, fallbackMessages); | |
| } | |
| let messages = localeMessages as Record<string, unknown>; | |
| if (locale !== FALLBACK_LOCALE) { | |
| const fallbackMessages = (await import(`./messages/${FALLBACK_LOCALE}.json`)).default as Record<string, unknown>; | |
| messages = deepMergeFallback(JSON.parse(JSON.stringify(localeMessages)), fallbackMessages); | |
| } |
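The difference between the shallow copy and the deep clone can be reproduced in isolation. The sketch below assumes an in-place merge similar to `deepMergeFallback`; `mergeFallbackSketch` is a hypothetical re-creation, not the PR's implementation.

```typescript
type Messages = Record<string, unknown>;

function isPlainObject(v: unknown): v is Messages {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Hypothetical in-place merge: fills keys missing on `target` from `source`.
function mergeFallbackSketch(target: Messages, source: Messages): Messages {
  for (const key of Object.keys(source)) {
    const t = target[key];
    const s = source[key];
    if (isPlainObject(t) && isPlainObject(s)) {
      mergeFallbackSketch(t, s); // mutates the nested object itself
    } else if (t === undefined) {
      target[key] = s;
    }
  }
  return target;
}

const fallback: Messages = { nav: { home: "Home", costs: "Costs" } };

// Shallow copy: the nested `nav` object is still shared with the cached module.
const cachedA: Messages = { nav: { home: "Início" } };
mergeFallbackSketch({ ...cachedA }, fallback);
const shallowPolluted = "costs" in (cachedA.nav as Messages);

// Deep clone: the cached object is left untouched.
const cachedB: Messages = { nav: { home: "Início" } };
const merged = mergeFallbackSketch(JSON.parse(JSON.stringify(cachedB)) as Messages, fallback);
const deepPolluted = "costs" in (cachedB.nav as Messages);
```

`JSON.parse(JSON.stringify(...))` is sufficient here because translation files are plain JSON; `structuredClone` would be an alternative on runtimes that provide it.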
| const handleCreate = useCallback( | ||
| async (poolData: Omit<QuotaPool, "id" | "createdAt">) => { | ||
| await fetch("/api/quota/pools", { | ||
| method: "POST", | ||
| headers: { "Content-Type": "application/json" }, | ||
| body: JSON.stringify(poolData), | ||
| }); | ||
| await mutate(); | ||
| }, | ||
| [pools, savePools] | ||
| [mutate] | ||
| ); | ||
|
|
||
| // ── Derived ────────────────────────────────────────────────────────────── | ||
| const handleSaveAllocations = useCallback( | ||
| async (pool: QuotaPool, allocations: PoolAllocation[]) => { | ||
| await fetch(`/api/quota/pools/${pool.id}`, { | ||
| method: "PATCH", | ||
| headers: { "Content-Type": "application/json" }, | ||
| body: JSON.stringify({ allocations }), | ||
| }); | ||
| await mutate(); | ||
| }, | ||
| [mutate] | ||
| ); | ||
|
|
||
| const stats = useMemo(() => { | ||
| let allocations = 0; | ||
| let atCap = 0; | ||
| let uncappedPct = 0; | ||
| for (const p of pools) { | ||
| allocations += p.allocations.length; | ||
| const totalPct = p.allocations.reduce((s, a) => s + a.percent, 0); | ||
| if (totalPct < 100) uncappedPct += 100 - totalPct; | ||
| // Simulated "at cap" — without backend tracking, mark pools with >=1 | ||
| // allocation summing to 100% as "fully utilized config" (proxy metric). | ||
| if (totalPct >= 100 && p.allocations.length > 0) atCap += 0; // see disclaimer | ||
| } | ||
| return { | ||
| activePools: pools.length, | ||
| allocations, | ||
| atCap, | ||
| uncapped: pools.length > 0 ? Math.round(uncappedPct / pools.length) : 0, | ||
| }; | ||
| }, [pools]); | ||
| const handleRemovePool = useCallback( | ||
| async (id: string) => { | ||
| if (!confirm(t("removeConfirm"))) return; | ||
| await fetch(`/api/quota/pools/${id}`, { method: "DELETE" }); | ||
| await mutate(); | ||
| }, | ||
| [mutate, t] | ||
| ); |
The fetch API does not throw on non-2xx HTTP status codes (such as 400 or 500). Since these helper functions do not check res.ok, any server-side validation or processing errors will be silently ignored. This bypasses the try/catch blocks in the calling modals (e.g., CreatePoolModal and EditAllocationsModal), causing them to close and report success to the user even when the operation failed on the server.
const handleCreate = useCallback(
async (poolData: Omit<QuotaPool, "id" | "createdAt">) => {
const res = await fetch("/api/quota/pools", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(poolData),
});
if (!res.ok) {
throw new Error(`Failed to create pool: ${res.statusText}`);
}
await mutate();
},
[mutate]
);
const handleSaveAllocations = useCallback(
async (pool: QuotaPool, allocations: PoolAllocation[]) => {
const res = await fetch(`/api/quota/pools/${pool.id}`, {
method: "PATCH",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ allocations }),
});
if (!res.ok) {
throw new Error(`Failed to save allocations: ${res.statusText}`);
}
await mutate();
},
[mutate]
);
const handleRemovePool = useCallback(
async (id: string) => {
if (!confirm(t("removeConfirm"))) return;
const res = await fetch(`/api/quota/pools/${id}`, { method: "DELETE" });
if (!res.ok) {
throw new Error(`Failed to remove pool: ${res.statusText}`);
}
await mutate();
},
[mutate, t]
);
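Since the same `res.ok` check repeats in all three handlers, it could be factored into a small helper. This is only a sketch: `ensureOk` and `ResponseLike` are hypothetical names, not part of the PR.

```typescript
// Minimal shape of the fields used here; the real fetch Response has them too.
interface ResponseLike {
  ok: boolean;
  status: number;
  statusText: string;
}

// Hypothetical helper: throws when the status is outside the 200-299 range,
// so callers' try/catch blocks see server-side failures.
function ensureOk<T extends ResponseLike>(res: T, action: string): T {
  if (!res.ok) {
    throw new Error(`Failed to ${action}: ${res.status} ${res.statusText}`);
  }
  return res;
}
```

A handler would then call, for example, `ensureOk(await fetch(url, init), "create pool")` before `await mutate()`.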
| // instead we scan listAllocationsForApiKey to find which pools the key | ||
| // participates in, and derive dimensions via best-effort getBucket. | ||
| // For poolUsage we rely on the dimension keys we can discover. | ||
| // Since dimensions live in ProviderPlan (resolved separately), we peek | ||
| // via direct getBucket reads for the current bucket only. | ||
| // | ||
| // Note: This is intentionally a lightweight implementation. The full | ||
| // dimension list should come from the resolved plan; here we surface | ||
| // what's been stored in quota_consumption for this pool. | ||
|
|
||
| const { apiKeyId } = alloc; | ||
| // listAllocationsForApiKey returns pairs across all pools; filter to this one | ||
| const allAllocsForKey = listAllocationsForApiKey(apiKeyId); | ||
| for (const { poolId: pid } of allAllocsForKey) { | ||
| if (pid !== poolId) continue; | ||
| // The dimension keys for this pool are known if consumption exists | ||
| // We can't list all keys without a query, so we rely on the calling | ||
| // context having pre-populated via consume(). For dashboard use, | ||
| // the pool dimensions are read from the provider plan. | ||
| } | ||
|
|
||
| // We only read dimensions that we can discover from what was actually | ||
| // consumed. For a richer implementation, the caller should pass the | ||
| // resolved plan dimensions (done in REST routes - F8). | ||
| // Here: peek for common windows to detect what's in use. | ||
| } |
This for (const alloc of allocations) loop is completely dead code. It queries the database for each allocation's pools but does not use or store the results anywhere, nor does it populate dimMap. This results in redundant database queries and CPU overhead on every poolUsage call.
| // instead we scan listAllocationsForApiKey to find which pools the key | |
| // participates in, and derive dimensions via best-effort getBucket. | |
| // For poolUsage we rely on the dimension keys we can discover. | |
| // Since dimensions live in ProviderPlan (resolved separately), we peek | |
| // via direct getBucket reads for the current bucket only. | |
| // | |
| // Note: This is intentionally a lightweight implementation. The full | |
| // dimension list should come from the resolved plan; here we surface | |
| // what's been stored in quota_consumption for this pool. | |
| const { apiKeyId } = alloc; | |
| // listAllocationsForApiKey returns pairs across all pools; filter to this one | |
| const allAllocsForKey = listAllocationsForApiKey(apiKeyId); | |
| for (const { poolId: pid } of allAllocsForKey) { | |
| if (pid !== poolId) continue; | |
| // The dimension keys for this pool are known if consumption exists | |
| // We can't list all keys without a query, so we rely on the calling | |
| // context having pre-populated via consume(). For dashboard use, | |
| // the pool dimensions are read from the provider plan. | |
| } | |
| // We only read dimensions that we can discover from what was actually | |
| // consumed. For a richer implementation, the caller should pass the | |
| // resolved plan dimensions (done in REST routes - F8). | |
| // Here: peek for common windows to detect what's in use. | |
| } | |
| // Note: This is intentionally a lightweight implementation. The full | |
| // dimension list should come from the resolved plan; here we surface | |
| // what's been stored in quota_consumption for this pool. |
| // 2. Resolve the provider plan for this pool's connection | ||
| // Provider name is not stored on pool — use empty string to trigger catalog/empty fallback | ||
| const plan = resolvePlan(pool.connectionId, ""); |
Passing an empty string "" as the provider name to resolvePlan prevents catalog-defined plans (like codex or kimi) from being resolved unless a manual override is present in the database. Resolving the provider name from the connection ID first allows catalog plans to work out-of-the-box.
// 2. Resolve the provider plan for this pool's connection
let provider = "";
try {
const { getProviderConnectionById } = await import("@/lib/localDb");
const conn = getProviderConnectionById(pool.connectionId);
if (conn && typeof (conn as { provider?: string }).provider === "string") {
provider = (conn as { provider: string }).provider;
}
} catch {
// ignore
}
const plan = resolvePlan(pool.connectionId, provider);
| // ── Hard global cap (cannot be exceeded) ───────────────────────────────── | ||
| // If the pool's global limit is already reached AND this key's request | ||
| // would exceed it (burst mode without borrow room), block as "global-saturated". | ||
| if (dim.consumedTotal >= dim.limit) { | ||
| if (allocation.policy !== "burst") { | ||
| return { kind: "block", reason: "global-saturated" }; | ||
| } | ||
| // burst also blocked when no room at all | ||
| return { kind: "block", reason: "global-saturated" }; |
The conditional check if (allocation.policy !== "burst") is redundant because both branches return the exact same { kind: "block", reason: "global-saturated" } value. This can be simplified.
| // ── Hard global cap (cannot be exceeded) ───────────────────────────────── | |
| // If the pool's global limit is already reached AND this key's request | |
| // would exceed it (burst mode without borrow room), block as "global-saturated". | |
| if (dim.consumedTotal >= dim.limit) { | |
| if (allocation.policy !== "burst") { | |
| return { kind: "block", reason: "global-saturated" }; | |
| } | |
| // burst also blocked when no room at all | |
| return { kind: "block", reason: "global-saturated" }; | |
| if (dim.consumedTotal >= dim.limit) { | |
| return { kind: "block", reason: "global-saturated" }; | |
| } |
CI Coverage Report
Coverage artifact was not available for this run. |
| ) { | ||
| deepMergeFallback(targetValue as Record<string, unknown>, sourceValue as Record<string, unknown>); | ||
| } else if (targetValue === undefined) { | ||
| target[key] = sourceValue; |
CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
…y-engine-redesign feat(memory): memory engine redesign — sqlite-vec + hybrid RRF + Studio UI (plan 21)
…ols Studio — plans 17+18) Conflicts: migration 076_playground_presets->084; localDb/.env union; REPOSITORY_MAP dedup; package.json keep base (version 3.8.7, coverage --functions 40); .source regenerated (+3 docs); deps cli-table3/wtfnode/@types/bun/uuid (npm install). i18n pt-BR: 19 collisions — 18 playground keys -> HEAD pt-BR translations (base had untranslated EN: Send->Enviar, Cancel->Cancelar etc), costsSection -> base Custos. Rule: prefer side != en.json (translated). openapi: --theirs base + 3 playground/search paths + 2 schemas + 1 tag (js-yaml surgical insert; union-blind breaks YAML). No open-sse, typecheck:core 0 errors.
…ound-search-tools feat(playground,search-tools): Playground Studio + Search Tools Studio (plans 17+18)
Translate Claude Code web_search_YYYYMMDD server tools to the native OpenAI Responses web_search tool and preserve filters/location. Convert forced Claude tool_choice for web_search to the native Responses tool choice while leaving ordinary custom functions unchanged. Closes #2936
Only translate Claude Code web_search_YYYYMMDD server tools to native Responses web_search when the final target is OpenAI Responses. Keep the Chat Completions target on function-tool shape and cover the full translateRequest path.
Keep existing object-argument cleanup behavior, but avoid parsing and stripping arbitrary JSON-string arguments for unrelated tools where empty strings or arrays may be valid payloads. Add regression coverage for non-Read and non-object Read arguments.
Remove the AuditLogTab from the dashboard logs page now that audit logs live under the dedicated /dashboard/audit route. Update integration wiring expectations and add metadata frontmatter to studio framework docs.
…ded in-memory caches
Root cause: Bottleneck rate limiter instances in rateLimitManager accumulate without cleanup. Each instance runs an internal heartbeat setInterval every 250ms. Under heavy load with many provider:connection:model combinations, hundreds of limiters accumulate causing CPU to grow ~0.1%/min until server collapse (~2% after 5 minutes of intensive use).
Changes:
- rateLimitManager: Add idle limiter eviction in watchdogTick() using the previously defined but unused INACTIVE_LIMITER_MS threshold. Populate limiterLastUsed on every getLimiter() call. Clean up all 3 Maps (limiters, lastDispatchAt, limiterLastUsed) consistently.
- combo.ts: Add size-based FIFO eviction to rrCounters, resetAwareConnectionCache, and resetAwareQuotaCache Maps. Convert per-target log.info calls in combo execution loops to log.debug?. to reduce serialization overhead.
- chatCore.ts: Fix double-serialization in estimateTokens(JSON.stringify(x)) calls (estimateTokens already handles objects). Make trace() conditional on OMNIRROUTE_TRACE/DEBUG env vars. Make per-request usage logging conditional.
- apiKeyRotator.ts: Add eviction guards to _keyHealth and _connectionExtraKeys Maps (MAX 500 entries each). Ensure removeConnectionIndex cleans all 3 Maps.
- codexQuotaFetcher.ts: Add eviction guard to connectionRegistry and quotaCache Maps (MAX 200 entries each).
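The two eviction strategies named in this commit (size-based FIFO and idle-time eviction) can be sketched roughly as follows. `setBounded` and `evictIdle` are hypothetical helpers written for illustration; they are not the functions in rateLimitManager or combo.ts.

```typescript
// Size-based FIFO eviction: a Map iterates in insertion order,
// so the first key returned is the oldest entry.
function setBounded<K, V>(map: Map<K, V>, key: K, value: V, maxEntries: number): void {
  if (!map.has(key) && map.size >= maxEntries) {
    const oldest = map.keys().next();
    if (!oldest.done) map.delete(oldest.value);
  }
  map.set(key, value);
}

// Idle eviction: drop every entry whose last use is older than maxIdleMs.
// `onEvict` lets the caller clean up sibling Maps for the same key.
function evictIdle<K>(
  lastUsed: Map<K, number>,
  now: number,
  maxIdleMs: number,
  onEvict: (key: K) => void
): number {
  let evicted = 0;
  for (const [key, usedAt] of lastUsed) {
    if (now - usedAt > maxIdleMs) {
      lastUsed.delete(key); // deleting during Map iteration is safe
      onEvict(key);
      evicted++;
    }
  }
  return evicted;
}
```

Routing cleanup through a single `onEvict` callback is one way to keep several Maps keyed by the same id consistent, which is the concern the commit raises for the three limiter Maps.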
Code Review Summary
Status: 1 Suggestion Found | Recommendation: Address before merge
Files Reviewed (4 files)
|
…native Claude OAuth
Native Claude OAuth (claude->claude passthrough) forwards client tool
definitions verbatim. Anthropic's first-party Messages API then rejects:
- invalid tool input_schemas (deep-truncation placeholders such as
`enum: "[MaxDepth]"`, or index-keyed objects where arrays are required), and
- tool names it fingerprints as a third-party agent harness (specific
blacklisted names like `mixture_of_agents`, or a large enough set of
recognizable snake_case agent tool names),
both surfaced as a misleading `400 You're out of extra usage` placeholder
(the SSE stream is refused — not a real billing event). The same request
succeeds on translator-backed providers (OpenAI/Codex), which already sanitize
and re-shape tool payloads — so the gap is specific to the native passthrough.
Adds the missing guards on the native Claude OAuth path (executors/base.ts):
- sanitizeClaudeToolSchemas(): coerce/drop invalid draft-2020-12 constructs
(non-array enum/required/anyOf/..., placeholder schema slots -> {}).
- cloakThirdPartyToolNames(): deterministically alias non-Claude-Code tool
names (Claude Code canonical mapping where one exists, else PascalCase),
tracked in the existing per-request _toolNameMap so remapToolNamesInResponse
restores the caller's original names. Opt out via
CLAUDE_DISABLE_TOOL_NAME_CLOAK=true.
Genuine Claude Code tool names (PascalCase) and already-valid schemas are
left untouched, so existing first-party traffic is unaffected.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extract 3 high-value CPU/RAM optimizations from perf branch:
1. estimateSizeFast(): fast object-tree size estimator replacing JSON.stringify().length in isSmallEnoughForSemanticCache(). Walks object tree with a stack, zero string allocation, early exit at 256KB.
2. Consolidate settings reads: move getCachedSettings() to a single early read in handleChatCore(), eliminating a redundant second read 200 lines later. Also removes the isDetailedLoggingEnabled() wrapper call (reads settings internally) in favor of direct field check.
3. Registry Proxy→direct export: convert 8 registries from lazy Proxy+getOrCreate pattern to simple exported const objects. Eliminates Proxy trap overhead on every provider property access during routing. Affected: audio, embedding, image, moderation, music, rerank, search, video registries (-451 lines of Proxy boilerplate).
These changes are independent of the CPU leak fix (limiter eviction) and complement it by reducing per-request CPU overhead.
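A stack-based size estimator along the lines of item 1 might look like the sketch below. This is an assumption-laden illustration, not the PR's `estimateSizeFast`: the per-value byte weights are arbitrary, and `estimateSizeSketch` is a hypothetical name.

```typescript
// Rough size estimate of a JSON-like value without serializing it.
// Returns early once the running total exceeds `limit` (default 256 KB).
function estimateSizeSketch(root: unknown, limit: number = 256 * 1024): number {
  let total = 0;
  const stack: unknown[] = [root];
  while (stack.length > 0) {
    const value = stack.pop();
    if (typeof value === "string") {
      total += value.length + 2; // characters plus quotes
    } else if (typeof value === "number" || typeof value === "boolean") {
      total += 8; // coarse constant per primitive
    } else if (Array.isArray(value)) {
      total += 2; // brackets
      for (const item of value) stack.push(item);
    } else if (typeof value === "object" && value !== null) {
      total += 2; // braces
      for (const [key, child] of Object.entries(value)) {
        total += key.length + 3; // key, quotes and colon
        stack.push(child);
      }
    } else {
      total += 4; // null, undefined and anything else
    }
    if (total > limit) return total; // early exit: already too large
  }
  return total;
}
```

The sketch has no cycle detection; a cyclic input still terminates only because the early exit fires once the running total passes the limit.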
… null-guards, docs
Follow-up commit on PR #2943 review:
- Preserve boolean schemas in `sanitizeClaudeToolSchemas` (Gemini Code Assist, high severity). `additionalProperties: false` is the canonical JSON Schema lock-down for object tools; the previous coercion silently turned it into the permissive `{}`, which would invite models to hallucinate extra arguments during tool calling. Same rule now applies to per-property boolean schemas under `properties`. Placeholder strings still get the permissive `{}` slot; booleans get preserved verbatim.
- Defensive null guards in `cloakThirdPartyToolNames` for `tools[]` and `messages[]` entries that might be `null`/`undefined`. Prevents a runtime `TypeError` if a malformed payload reaches the cloak.
- Document `CLAUDE_DISABLE_TOOL_NAME_CLOAK` in `.env.example` and `docs/reference/ENVIRONMENT.md` (env/docs contract was failing in CI).
- Regression tests covering all of the above (5 boolean preservation cases, 2 null-tolerance cases).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ecutor path
The native Claude OAuth guard in executors/base.ts is bypassed when `upstream_proxy_config.mode = cliproxyapi` routes the request through the CliproxyAPI executor; it has its own execute()/transformRequest() and never reaches BaseExecutor.execute(), so the cloak/sanitizer never ran for that (common) deployment.
Wire the same guards into CliproxyapiExecutor.transformRequest (Anthropic-shape branch), composing with the existing bisected `mcp_*` reserved-namespace rewrite:
- sanitizeClaudeToolSchemas() on transformed.tools.
- cloakThirdPartyToolNames() with skip = mcp-reserved, so applyMcpToolNameRewrite keeps authority over `mcp_*` (its bisected `Mcp_X` form) and the two reverse maps stay disjoint / single-hop.
Both merge into the non-enumerable _toolNameMap the response stream already uses to restore the caller's names.
cloakThirdPartyToolNames is now non-mutating (clones changed entries) to respect transformRequest's no-input-mutation contract, and takes an optional `skip` predicate.
Verified end-to-end through the live CPA path: a real ~100-tool harness payload that returned the "out of extra usage" placeholder now returns 200 with original tool names restored on the response stream; `mcp_*` tools and genuine PascalCase Claude Code tools are unaffected.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t boundary
The /dashboard/tools/agent-bridge page (Server Component) passed ALL_TARGETS
directly to AgentBridgePageClient (a Client Component). Each MitmTarget carries
a `handler: () => Promise<...>` function, which Next.js forbids across the
Server/Client boundary, raising at SSR time:
"Functions cannot be passed directly to Client Components ..."
This broke the whole page ("erro ao carregar").
Fix: introduce MitmTargetView = Omit<MitmTarget, "handler"> and pass a
sanitized array (ALL_TARGETS.map(({ handler, ...rest }) => rest)). The UI never
invokes handler, so behavior is unchanged. Adds a regression test asserting the
sanitized targets are function-free and JSON-serializable.
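The sanitizing step and the regression check it describes can be sketched as below. The `MitmTargetSketch` fields other than `handler` are assumptions made for illustration; the real `MitmTarget` shape is not shown here.

```typescript
// Hypothetical shape; only `handler` is taken from the commit message.
interface MitmTargetSketch {
  id: string;
  label: string;
  handler: () => Promise<void>;
}

type MitmTargetViewSketch = Omit<MitmTargetSketch, "handler">;

// Drop the function-valued field; everything left is plain data.
function toViews(targets: MitmTargetSketch[]): MitmTargetViewSketch[] {
  return targets.map(({ handler, ...rest }) => rest);
}

// Recursively verify that no function survives anywhere in the value.
function isFunctionFree(value: unknown): boolean {
  if (typeof value === "function") return false;
  if (typeof value !== "object" || value === null) return true;
  return Object.values(value).every((child) => isFunctionFree(child));
}
```

Typing the client prop as the `Omit<...>` view means a later attempt to pass `handler` across the boundary fails at compile time rather than at SSR time.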
…on fixes (#3056) Co-authored-by: Ruslan Sivak <russ@ruslansivak.com>
…sts) #3054 ("remove 9 dead/unreachable free providers") removed the petals/nanobanana configs, registry entries and validators but left dangling references that broke the build and the unit suite on release/v3.8.8: - open-sse/executors/petals.ts imported the deleted ../config/petals.ts (webpack "Module not found" → `next build` failed). Removed the executor, its registration + re-export in executors/index.ts, and the leftover `providerId === "petals"` branch in providerAllowsOptionalApiKey. - Removed tests for the now-deleted providers: executor-petals.test.ts and poolside-provider.test.ts (REGISTRY.poolside was removed), and the petals / nanobanana validator assertions in provider-validation-specialty.test.ts, plus the stale petals catalog assertions in providers-page-utils.test.ts, proxy-connection-test.test.ts and providers-route-managed-catalog.test.ts. The image/video/embed registries for nanobanana/replicate/nomic are real and untouched — only the dead chat/api-key surfaces were removed. 146/146 affected tests pass; typecheck / build clean.
…ropic + collapse)
Bugs found while testing the Quota Share engine on the local VPS:
- B1 hidden/stuck pools: pools created while the page group filter was "all" were persisted with group_id="all", matched no real group, and rendered nowhere, so they could not be seen, edited or deleted. PoolWizard now resolves the group id away from the "all" sentinel before POST/PATCH (falls back to the first real group / seed group-demo), and QuotaSharePageClient renders an "Ungrouped" recovery bucket so already-orphaned pools stay editable/deletable.
- B3 one-connection-per-pool made explicit: existingPoolConnectionIds now spans every member connection (not just the primary), and the wizard shows which pool an already-used connection belongs to instead of silently disabling it.
- B4 delete group: wired the missing UI control + handler (handleDeleteGroup, 409-aware); the backend DELETE handler + deleteGroup already existed. Hidden for "all" and the protected seed group-demo.
- B5a endpoints card now surfaces the native Anthropic POST /v1/messages line when a claude*/anthropic provider is in scope (previously only /v1/chat/completions).
- B5b endpoints card gained a collapse/minimize toggle (the card was too tall).
Source-scan tests + en/pt-BR i18n parity in quota-share-bugfixes-v388.test.ts. The larger quota-key redesign (key type bound to a group, default-restricted with opt-in normal-model access, recoverable keys, api-keys page layout) is planned separately in _tasks/features-v3.8.8/quota-share-key-redesign.plan.md.
Remove the Petals executor from registration and exports. Improve type safety by replacing broad any usage in MCP tool registration with inferred types and documenting dynamic handler type limitations. Add request validation for the agent bridge cert route and expand tests to ensure switch buttons explicitly declare type="button", preventing implicit form submissions.
…on fixes (#3059)
* fix(sse): defer enqueuing of event lines to align event names with data lines and prevent stop-signal event name misattribution
* fix(sse): preserve keep-alives and prevent pending event leakage on dropped chunks
* fix(sse): preserve pending event lines before other non-data lines and fix zero-window-size bypass
* fix(sse): defer lastEventLine update until after flush check to preserve previous event context on flush
* fix(sse): flush trailing pendingEventLine when stream closes
* fix(sse): preserve consecutive event lines without intervening data
---------
Co-authored-by: Ruslan Sivak <russ@ruslansivak.com>
…3061) (#3062) No-auth / keyless providers (opencode, opencode-zen) returned synthetic "noauth" credentials BEFORE honoring excludeConnectionIds, so the chat account-fallback loop re-selected the same synthetic connection forever on a persistent upstream error (e.g. the opencode public endpoint answering 401 "Model X is not supported"). The synthetic id has no DB row, so markAccountUnavailable could not persist a cooldown to brake it — each iteration wrote key-health + request logs immediately, growing the DB until the disk filled (see @paraflu's "failure #320" trace in discussion #3038). Honor the exclusion set in both synthetic-credential paths (getProviderCredentials NOAUTH_PROVIDERS block + opencode-zen keyless fallback): once "noauth" is already excluded, return null so the handler stops after a single attempt. The happy path (nothing excluded -> synthetic noauth) is preserved, so keyless access still works. Closes #3061. Tests (TDD): tests/unit/auth-noauth-fallback-loop-3061.test.ts — the two exclusion cases failed before the fix and pass after; two happy-path guards ensure first-selection synthetic noauth still resolves.
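The effect of honoring the exclusion set can be shown with a small sketch. `selectNoAuthCredentials` and `countAttempts` are hypothetical; they only model the loop described above, not `getProviderCredentials` itself.

```typescript
interface SyntheticCredentials {
  connectionId: string;
  apiKey: null;
}

// Hypothetical selector for keyless providers: hand out the synthetic
// "noauth" connection, but report exhaustion once it has been excluded.
function selectNoAuthCredentials(
  excludeConnectionIds: ReadonlySet<string>
): SyntheticCredentials | null {
  if (excludeConnectionIds.has("noauth")) return null;
  return { connectionId: "noauth", apiKey: null };
}

// Model of the account-fallback loop: every failed connection is excluded
// before the next selection. `maxAttempts` only guards the sketch itself.
function countAttempts(alwaysFails: boolean, maxAttempts: number = 1000): number {
  const excluded = new Set<string>();
  let attempts = 0;
  while (attempts < maxAttempts) {
    const creds = selectNoAuthCredentials(excluded);
    if (creds === null) break; // nothing left to try
    attempts++;
    if (!alwaysFails) break; // request succeeded
    excluded.add(creds.connectionId);
  }
  return attempts;
}
```

With the exclusion check in place, a persistently failing upstream yields exactly one attempt instead of an unbounded loop.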
… not placeholders The "Available endpoints" card's no-key (default) view generated representative model ids from a hardcoded PREVIEW_MODELS_BY_PROVIDER map, so providers absent from that map (claude, xiaomi-mimo, kimi-coding) rendered fake "model-a/b/c" placeholders. It now fetches the REAL minted qtSd/* combos from /api/combos, parses them (parseQuotaModelName), and groups by group → provider — falling back to the placeholder map only when the fetch fails or returns nothing. The per-key view already showed real models via /api/quota/keys/[id]/models; this aligns the default view with it. Verified on the local VPS: an exclusive key (share01) returns ONLY the real qtSd models of its groups (claudao + chinas) and a non-quota key returns []. The remaining /v1/models leak (non-quota keys still see qtSd among all models) is tracked in the quota-key redesign plan.
…d, plan presets - Beta banner scoped to the Quota Share page (functional-but-bugs-expected) with a pre-filled "open an issue" link (labels quota-share,beta). Page-only. - Endpoints card now also surfaces POST /v1/responses (codex/github) and the codex-only WS /v1/responses line (the Responses-over-WebSocket proxy), each gated on the in-scope provider slug. - planRegistry: seed xiaomi-mimo (4.1B-token weekly "lite" cap) and kimi-coding so the PoolWizard "Limite" step pre-fills a fair-share limit for these no-balance-API providers (fair-share enforces from the proxy's own token count, not an upstream balance — set the real cap manually in step 2). - docs(API_REFERENCE): document the codex Responses-over-WebSocket endpoint. - i18n en/pt-BR for all new keys. Tracked in _tasks/features-v3.8.8/quota-share-key-redesign.plan.md (codex-WS config toggle + per-provider balance fetchers + %-quota attribution are planned follow-ups).
Claude Code (Pro/Max) is a percentage-of-plan quota (5h rolling + weekly cap, shared Claude+Code); exact token caps are unpublished/task-variable so percent is the practical unit. Unblocks the PoolWizard 'Limite' pre-fill for claude pools. Researched plan structures (codex/claude/glm/kimi/minimax/xiaomi) captured in the quota-share redesign plan.
…n tiers - xiaomi-mimo: token plan is MONTHLY (per platform.xiaomimimo.com/token-plan), so the seed is now tokens/monthly/4.1B (was weekly). - deepseek: prepaid in USD — its balance API is already wired (deepseekQuotaFetcher) and the fair-share engine supports the usd unit (COUNTABLE_UNITS). Seeded a usd/monthly preset so the limit is set by dollar value. - minimax: documented the real M3 tiers (Plus ~1.633B/Max ~5.053B/Ultra ~9.796B) in-comment; EPSILON keeps it manual until tier-aware presets land. - planRegistry already seeds codex/claude/glm/minimax/kimi/kimi-coding/xiaomi-mimo/ deepseek/bailian/alibaba; PoolWizard 'Limite' step stays editable. Researched plan structures + the tier-aware-preset follow-up are in the redesign plan.
…ridge-secret auth
Two bugs made `wscat ws://host/v1/responses` fail with "Transfer-Encoding can't be present with Content-Length":
1. authz/management policy 401'd the proxy's own internal authenticate/prepare loopback call to /api/internal/codex-responses-ws (MANAGEMENT-classified, the per-process bridge secret wasn't recognized one layer up). Added a tightly-scoped carve-out: isValidWsBridgeRequest() honors a timing-safe sha256 match of OMNIROUTE_WS_BRIDGE_SECRET (x-omniroute-ws-bridge-secret header) for that exact internal path; the route still re-validates the secret. → auth now succeeds → 101.
2. On auth failure the proxy spread the internal fetch's response headers onto the raw upgrade socket: a chunked Transfer-Encoding + Next CSP/route-class headers collided with writeHttpError's Content-Length framing (and duplicated Content-Type via a case-mismatched spread). writeHttpError now strips framing + pipeline/security headers (case-insensitive), and the auth-fail callsite no longer forwards them.
Regression test: tests/unit/responses-ws-proxy-headers.test.mjs (exports writeHttpError; asserts no TE+CL, single Content-Type, no CSP/route-class leak, safe headers forwarded).
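A timing-safe sha256 comparison of the kind mentioned in item 1 can be sketched with Node's `crypto` module. `secretsMatch` is a hypothetical helper, not the PR's `isValidWsBridgeRequest`.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hash both values to fixed-length SHA-256 digests, then compare the digests
// in constant time. Equal digest lengths mean timingSafeEqual cannot throw
// on a length mismatch, and the secret's length is not leaked.
function secretsMatch(provided: string | undefined, expected: string | undefined): boolean {
  if (!provided || !expected) return false; // missing header or unset secret
  const a = createHash("sha256").update(provided).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}
```

In the scenario above, `provided` would come from the `x-omniroute-ws-bridge-secret` header and `expected` from `OMNIROUTE_WS_BRIDGE_SECRET`; restricting the check to the one internal path is a separate concern not shown here.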
…2-table layout) The key list stacked many badges in one column (tall/cluttered) and didn't distinguish quota keys. Now renders two sections — "Normal keys" and "Quota keys" (purple QUOTA pill) — sharing the same compact table header via an extracted renderKeyRow(). Quota rows prepend a qtSd-only mode chip + group-name chips (resolved by fetching /api/quota/pools + /api/quota/groups → poolId→group map). Empty sections are hidden. i18n en/pt-BR for the new labels. Source-scan test + i18n parity in api-manager-quota-keys-section.test.ts.
| "normalKeysSection": "Normal keys", | ||
| "quotaKeysSection": "Quota keys", | ||
| "quotaPill": "QUOTA", | ||
| "quotaModeOnly": "qtSd-only" |
SUGGESTION: The translation "qtSd-only" is a nonsensical placeholder that doesn't convey meaning to users. Consider using a meaningful label like "Quota-only" or "Quota mode" instead.
| "normalKeysSection": "Chaves normais", | ||
| "quotaKeysSection": "Chaves de cota", | ||
| "quotaPill": "QUOTA", | ||
| "quotaModeOnly": "só-qtSd" |
SUGGESTION: The translation "só-qtSd" is a nonsensical placeholder that doesn't convey meaning to users. Consider using a meaningful label like "Cota apenas" or "Modo cota" instead.
…(Check 2.9) E2E testing on the VPS showed a normal key (empty allowedQuotas) could call a qtSd/<group>/<provider>/<model> virtual model and route through a shared quota pool — because the quota-exclusive enforcement (Check 3) only ran when allowedQuotas was non-empty, so an unallocated key fell through to the normal model checks and qtSd was served. This is the "empty allowedQuotas = all pools" gap from the redesign. Add Check 2.9 in enforceApiKeyPolicy: if the requested model is a qtSd model and the key is NOT allocated to any quota pool (allowedQuotas empty), reject 403 QUOTA_NOT_ALLOCATED. Allocated keys are unchanged (Check 3 still validates scope). This matches the owner's rule: only a key selected in a pool may use its qtSd models. Normal (non-qtSd) model access for normal keys is unchanged. Test: tests/unit/apikeypolicy-quota-only.test.ts — new case asserts a non-quota key is blocked from qtSd (QUOTA_NOT_ALLOCATED) yet still uses normal models.
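The rule added by Check 2.9 can be sketched as a small pure function. This is an illustration only: `checkQuotaAllocation`, `isQtSdModel`, and the `PolicyResult` shape are hypothetical names, and the sketch assumes a qtSd virtual model is recognizable by its `qtSd/` prefix, as in the `qtSd/<group>/<provider>/<model>` form described above.

```typescript
type PolicyResult =
  | { allowed: true }
  | { allowed: false; status: number; code: string };

// Assumption: qtSd virtual models always start with "qtSd/".
function isQtSdModel(model: string): boolean {
  return model.startsWith("qtSd/");
}

// Hypothetical stand-in for Check 2.9: a key with no pool allocation
// (empty allowedQuotas) may not call a qtSd model at all.
export function checkQuotaAllocation(
  model: string,
  allowedQuotas: string[],
): PolicyResult {
  if (isQtSdModel(model) && allowedQuotas.length === 0) {
    return { allowed: false, status: 403, code: "QUOTA_NOT_ALLOCATED" };
  }
  // Allocated keys continue to the scope validation (Check 3 in the text);
  // normal models continue to the regular model checks.
  return { allowed: true };
}
```

Placing the check before the scope validation closes the "empty allowedQuotas means all pools" gap without changing behavior for allocated keys or for normal models.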
…ota sync The quota-sync path deliberately reuses a rotating-refresh provider's (Codex/ OpenAI/Claude — see refreshSerializer ROTATION_LOCK_GROUP) access_token WITHOUT proactively refreshing it (#3019, to avoid the Auth0 family-revocation cascade). When that token is expired the codex usage fetch returns "token expired", and syncExpiredStatusIfNeeded then flagged the connection testStatus="expired" — a false-negative: the credential is still valid (expires_at in the future) and the reactive serialized 401 path refreshes the access_token on next use. Symptom: freshly-added Codex accounts showed "expired" with no quota on the quota page, while a providers-page refresh turned them green. They never lost access — only the quota sync mislabeled them. Fix: extract the decision into the pure, exported `quotaPathShouldMarkExpired()` and skip rotating providers (rotationGroupFor !== null). Their status is owned by the reactive path / connection test, never the quota sync. Adds unit coverage.
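A sketch of what a pure decision function like `quotaPathShouldMarkExpired()` might look like. The parameter list, the `ROTATION_GROUPS` table, and its group labels are assumptions made for this example; only the rule itself (skip providers whose `rotationGroupFor` result is not null) comes from the commit message.

```typescript
// Hypothetical rotation table; the real mapping lives in the project's
// refresh serializer and may use different keys and group names.
const ROTATION_GROUPS: Record<string, string> = {
  codex: "group-a",
  openai: "group-a",
  claude: "group-b",
};

function rotationGroupFor(provider: string): string | null {
  return ROTATION_GROUPS[provider] ?? null;
}

// Pure decision: the quota sync may flag a connection as expired only when
// the usage fetch reported an expired token AND the provider is not a
// rotating-refresh provider.
export function quotaPathShouldMarkExpired(
  provider: string,
  fetchReportedExpired: boolean,
): boolean {
  if (!fetchReportedExpired) return false;
  return rotationGroupFor(provider) === null;
}
```

Keeping the decision free of I/O is what makes it straightforward to cover with unit tests, as the commit notes.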
…ialized refresh) Symptom: freshly-added Codex accounts (e.g. davi/gabriel) showed "No quota data" even when healthy. Root cause: the quota path reuses the access_token without refreshing rotating providers (#3019, anti Auth0 family-revocation cascade), so a Codex account whose short-lived access_token has expired can never surface quota from the sync — the live fetch returns "Codex token expired". Fix (opt-in, cascade-safe): - refreshAndUpdateCredentials gains `allowRotatingRefresh` + a pure exported gate `shouldAttemptRotatingRefresh`. The actual token mint is wrapped in `serializeRefresh` (one refresh at a time per Auth0 rotation group) — so even N concurrent per-account requests can never refresh siblings in parallel. - The BULK scheduler (syncAllProviderLimits, concurrent) keeps the flag OFF → #3019 fully preserved (guardian test codex-quota-sync-no-proactive-refresh stays green). Only the on-demand, per-connection path (`GET /api/usage/[connectionId]`) opts in. - Frontend: the quota page auto-fetches LIVE on open for the VISIBLE connections that have no cached quota (scoped to what's on screen — not all connections — and skips entries already cached), so expired-token Codex accounts surface real quota automatically and cascade-safely. Adds unit coverage for the gate (bulk skips rotating, on-demand allows; non-rotating always eligible). typecheck / lint clean.
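The gate and the per-group serialization described above can be sketched as follows. This is a simplified illustration under stated assumptions: the signatures are guesses, the queue is a plain promise chain keyed by group name, and the settle gap between sibling refreshes mentioned elsewhere in this release is not modeled.

```typescript
// Gate (assumed signature): non-rotating providers are always eligible;
// rotating providers refresh only when the caller opts in.
export function shouldAttemptRotatingRefresh(
  isRotating: boolean,
  allowRotatingRefresh: boolean,
): boolean {
  return !isRotating || allowRotatingRefresh;
}

// One promise chain per rotation group. Each stored tail never rejects,
// so a failed refresh does not block the refreshes queued behind it.
const tails = new Map<string, Promise<void>>();

export function serializeRefresh<T>(
  group: string,
  task: () => Promise<T>,
): Promise<T> {
  const previous = tails.get(group) ?? Promise.resolve();
  // Start this task only after the previous one in the group has settled.
  const run = previous.then(() => task());
  tails.set(
    group,
    run.then(
      () => undefined,
      () => undefined,
    ),
  );
  return run;
}
```

With this shape, N concurrent callers in the same group each get their own result, but the token mints run strictly one after another, which is the property the commit relies on to avoid refreshing siblings in parallel.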
…c mitm manager stub The Docker image build (`docker compose --profile cli build`) runs `next build` with OMNIROUTE_USE_TURBOPACK=1 and failed with two Turbopack errors that the webpack-based VM build never hits — which is why the VM deploy validated but the Docker build errored (#3066). The reporter's log was truncated before the real errors; reproducing `OMNIROUTE_USE_TURBOPACK=1 npm run build` locally surfaced them: 1. node_modules/sqlite-vec-linux-x64/vec0.so — "Unknown module type". sqlite-vec ships a native vec0.so loaded at runtime via createRequire(); Turbopack tried to bundle the .so. Fixed by adding "sqlite-vec" to serverExternalPackages, exactly like better-sqlite3. 2. /api/tools/agent-bridge/state statically imports getAllAgentsStatus from @/mitm/manager, which next.config aliases to manager.stub.ts for the Turbopack build. The stub did not export getAllAgentsStatus → "Export getAllAgentsStatus doesn't exist in target module". Added the export (throws like the other heavy ops — MITM/agent-bridge is non-functional in the bundled build anyway). Tests (tests/unit/next-config.test.ts): - assert sqlite-vec is in serverExternalPackages. - new guard: manager.stub.ts must export every name statically imported from @/mitm/manager across src/app (catches stub/manager drift — would have caught this). Verified: OMNIROUTE_USE_TURBOPACK=1 npm run build → EXIT 0 (was: Build error occurred); webpack build → EXIT 0; typecheck:core / check:cycles / lint clean. Fixes #3066
… (review feedback) Follow-up to 146244b (#3066), addressing optional review suggestions: - manager.stub.ts: getAllAgentsStatus now returns [] (the truthful "no agents" state, type-faithful) instead of throwing. Unlike the dynamic-import heavy ops, this is a STATIC import baked into the Turbopack/bundled build, so it is legitimately reached at runtime there — returning an empty list degrades gracefully instead of erroring. (Functionally inert for the existing agent-bridge/state route, where getMitmStatus already rejects first.) - next-config.test.ts: the stub-drift guard no longer hard-asserts a specific symbol (getAllAgentsStatus); the generic ">=1 import found" sanity plus the missing-exports check remain, so the guard survives an agent-bridge / traffic-inspector route being renamed or removed. typecheck:core / lint / next-config suite (4/4) clean. The export still exists, so the Turbopack build resolution is unchanged.
… review) Addresses findings from the multi-agent PR review of the #3066 fix: - manager.stub.ts comments: the previous inline comment claimed the throwing ops (getMitmStatus/startMitm/stopMitm) are "dynamic-import paths that should never hit the stub at runtime" — factually wrong: those are static imports too, baked into the bundled build just like getAllAgentsStatus. Rewrote the file header to describe the real split — exports with a safe degraded value return it (getCachedPassword/ setCachedPassword/clearCachedPassword → null/no-op, getAllAgentsStatus → []) while getMitmStatus/startMitm/stopMitm throw STUB_ERROR — and trimmed the inline comment. Comment-only; no runtime/build change (the export still exists). - stub-drift guard test: now scans ALL of src/ instead of only src/app — src/lib/tailscaleTunnel.ts statically imports getCachedPassword/setCachedPassword from @/mitm/manager and is pulled into routes transitively, so the src/app-only scan had a false-negative blind spot. Also skips inline `type` imports (erased at build, need no runtime export) and detects stub exports from declaration AND `export { … }` forms (no false-positive if the stub later uses class/re-export). Verified: next-config suite 4/4, typecheck:core / lint clean.
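The stub-drift guard can be approximated with a regex-based source scan. The sketch below is an assumption about how such a check could work, not the actual test in `tests/unit/next-config.test.ts`; the function names are hypothetical, and a regex scan will miss unusual import forms that a real parser would catch.

```typescript
// Collect runtime names imported from "@/mitm/manager" in one source file.
export function importedNames(source: string): string[] {
  const names: string[] = [];
  const re = /import\s+(type\s+)?\{([^}]*)\}\s*from\s*["']@\/mitm\/manager["']/g;
  for (const match of source.matchAll(re)) {
    if (match[1]) continue; // `import type { ... }` is erased at build time
    for (const part of match[2].split(",")) {
      const spec = part.trim();
      if (!spec || spec.startsWith("type ")) continue; // inline `type` import
      names.push(spec.split(/\s+as\s+/)[0]); // keep the exported name, not the alias
    }
  }
  return names;
}

// Collect names the stub exports, from declarations and `export { ... }` lists.
export function exportedNames(stub: string): Set<string> {
  const names = new Set<string>();
  const decl = /export\s+(?:async\s+)?(?:function|const|let|class)\s+([A-Za-z_$][\w$]*)/g;
  for (const match of stub.matchAll(decl)) names.add(match[1]);
  for (const match of stub.matchAll(/export\s*\{([^}]*)\}/g)) {
    for (const part of match[1].split(",")) {
      const spec = part.trim();
      if (spec) names.add(spec.split(/\s+as\s+/).pop()!);
    }
  }
  return names;
}

// Names that some source imports at runtime but the stub does not export.
export function missingExports(sources: string[], stub: string): string[] {
  const have = exportedNames(stub);
  const wanted = [...new Set(sources.flatMap(importedNames))];
  return wanted.filter((name) => !have.has(name));
}
```

A test would read every file under `src/`, pass the contents as `sources`, and assert that `missingExports` returns an empty array.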
…emory in Docker) Completes the #3066 fix. Externalizing sqlite-vec unblocked the Turbopack build, but Next.js does not trace sqlite-vec's platform-specific native package (sqlite-vec-<os>-<arch>, which ships vec0.so) into .next/standalone — sqlite-vec resolves it at runtime via require.resolve() (Next.js issue #88844). Result: in the bundled/Docker build the wrapper loaded but getLoadablePath() threw MODULE_NOT_FOUND, so vectorStore silently degraded vector/semantic memory to FTS5 keyword search. build-next-isolated now syncs the sqlite-vec wrapper plus whichever sqlite-vec-<platform> package npm installed into the standalone output (mirroring the existing better-sqlite3 native-binary handling). Platform-agnostic, so Docker (linux) and Electron (mac/win/linux) builds all carry their matching vec0.so/.dylib/.dll. Verified: vec0.so present in .next/standalone/node_modules/sqlite-vec-linux-x64; createRequire("sqlite-vec") + require.resolve("sqlite-vec-linux-x64/vec0.so") both resolve from inside the standalone (no FTS5 fallback). build-next-isolated tests 7/7.
Code review
No issues found. Checked for bugs and CLAUDE.md compliance. Scope: the #3066 Docker/Turbopack build-fix commits added to this release branch.
🤖 Generated with Claude Code
Back-merge to resolve PR #2930 (release/v3.8.8 -> main) conflicts. Release is a superset of main's features, so all ~44 content conflicts resolved to the release ("ours") version; generated .source/* dropped. Reconciliation: - auth.ts: port #3058 (getProviderSearchPool expands custom provider_nodes prefixes to internal connection ids) — release lacked this main fix. - quota-plan-registry.test.ts: align knownProviders() 6 -> 10 (pre-existing stale assertion vs the registry).
The "recent" memory strategy maps to the internal "exact" retrieval path, whose post-query relevance filter (score > 0) silently dropped recent memories whose text didn't overlap the current prompt. Since the user-facing strategy enum is only recent|semantic|hybrid (no "exact"), forwarding the prompt as `query` for "recent" always engaged that filter, so recency-based injection returned nothing when the prompt was unrelated to the stored memory. Skip query forwarding for the "recent" strategy so retrieveMemories returns the most recent memories (ORDER BY created_at DESC) regardless of prompt overlap. Semantic/hybrid still forward the query for vector search. Fixes the chat-pipeline + memory-pipeline integration memory-injection tests.
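The fix amounts to deciding, per strategy, whether the prompt is forwarded as a search query. A minimal sketch, assuming a hypothetical helper named `queryForStrategy` (the real change lives inside the injection pipeline and may be structured differently):

```typescript
type MemoryStrategy = "recent" | "semantic" | "hybrid";

// Hypothetical helper: "recent" sends no query, so the retrieval layer can
// return the newest memories without applying a relevance filter.
// "semantic" and "hybrid" still need the prompt for vector search.
export function queryForStrategy(
  strategy: MemoryStrategy,
  prompt: string,
): string | undefined {
  return strategy === "recent" ? undefined : prompt;
}
```

Returning `undefined` rather than an empty string makes the "no query" case explicit, so a downstream `score > 0` filter has nothing to score against.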
…-provider credential lookup
Two pre-existing failures (release CI never ran): 1. Auth: the /api/compliance/audit-log route now requires management auth (requireManagementAuth). The test issued bare requests; in CI INITIAL_PASSWORD makes auth required, so it got 401. Now attaches a signed dashboard-session cookie via the shared managementSession helper (like sibling management-route tests). 2. Taxonomy: the test seeded stale action names (provider.added, combo.created) and treated provider.validation.ssrf_blocked as non-high. Aligned the seed to real HIGH_LEVEL_ACTIONS (provider.credentials.created, quota.pool.created) so the level=high filter assertion validates the actual filter.
…3058 follow-up) In checkModelAvailable and handleSingleModelChat, when the combo target's providerId is merely the prefix already encoded in the model string (e.g. "p2" from "p2/test-model"), prefer the fully-resolved provider (e.g. the generated custom node id openai-compatible-chat-e2e-p2) so the executor resolves the custom baseUrl from the connection instead of falling back to the base provider (real openai). Intentional providerId overrides (providerId not encoded in the model string) are preserved. Also fixes the resilience-http-e2e combo tests (cooldown window + DB-write visibility for the cooled-down-primary skip).
- quota-equal-split / quota-summed-budget: drop top-level `await` from test() registrations. Under --test-force-exit --test-concurrency=4 the awaited registrations were cancelled mid-module-eval when a sibling's slow SQLite migration briefly emptied the event loop. No assertions changed. - proxy-registry-flow: the legacy /api/settings/proxy GET is now a unified bridge over the new proxy registry; after an atomic create-with-assignment it resolves to the newly assigned proxy (atomic-flow) and supersedes the legacy config — assert that instead of expecting null. - e2e: agent-skills redirect regex now matches the bare /login auth redirect; memory-qdrant uses the unique heading locator (strict-mode fix); group-b specs navigate to the real pages / tolerate the auth redirect like sibling specs; playground-compare checks the toolbar control (Run all|Cancel all) per state.
- quota-combo-balancing / quota-multiprovider: eliminate the per-test full SQLite migration (migrate once at module load; resetStorage now DELETEs rows instead of rmSync+re-migrate) so it no longer races --test-force-exit under concurrency; drain the fire-and-forget syncQuotaCombosGuarded dispatched by createPool/ updatePool (flushPendingSyncs via setImmediate) so assertions see deterministic combo state; assert the GROUP-slug combo name (combos are named by group, like quota-combo-groups) and seed the group. Validated on CI shards 5/8 + 8/8 (7 runs). - playground-compare: wait for CompareTab (dynamic import) to mount, and use expect().toBeVisible() instead of locator.isVisible() (which no longer waits in Playwright 1.50+). - group-b-quota-plans-config: drop the unreliable raw-HTML "500" substring check (Next.js chunk hashes contain "500"); keep the real error-boundary text check.
Release v3.8.8
Merges release/v3.8.8 → main. Full changelog for this release:
[3.8.8] - 2026-06-01
Added
src/lib/plugins/,/api/plugins/*,/dashboard/plugins) — hooks + registry unification, plugin SDK (definePlugin), worker-thread sandbox, per-plugin hook rate limiting, SHA-256 integrity verification, semver-gated upgrade, and execution analytics. Plugin routes are loopback-only (isLocalOnlyPath) andchild_processexec is opt-in viaOMNIROUTE_PLUGINS_ALLOW_EXEC. (feat(plugins): custom hooks + registry unification + SDK + marketplace + dashboard [deferred to v4.0.0-rc1] #2913 / feat: plugins framework (#2913) + disable-non-public-models (#3017) — integrated + security-hardened #3041 — thanks @oyi77)onResponsehook into the chat success path, loads active plugins on server startup so they survive restarts (pluginManager.loadAll()inserver-init), and ships awelcome-bannerexample plugin (examples/plugins/) plus a comprehensive plugin test suite. (feat(plugins): comprehensive test suite + welcome banner PoC #3045 — thanks @oyi77)auto/*/qtSd/*routing still allowed). (feat: add a switch to disable use of non-published models #3017 — thanks @androw)open-sse/services/sessionPool/) — pooledcookie/session manager with round-robin fingerprint rotation (distinct fingerprint per pooled
session), per-session cooldown/backoff, and a provider-agnostic
webExecutorWrapper. Adds poolsupport for DuckDuckGo Web and LLM7 providers and an MCP
poolToolstoolset. (refactor: Make SessionPool modular & provider-agnostic #2954 / refactor: Make SessionPool modular & provider-agnostic #2978 — thanks @oyi77)/dashboard/tools/agent-bridge) — MITM proxy consolidating 9 IDE agents(Antigravity, Kiro, GitHub Copilot, OpenAI Codex, Cursor IDE, Zed Industries, Claude Code,
Open Code, Trae stub) with server card, per-agent setup wizard, model mapping table,
bypass list, upstream CA cert support, and redirect from legacy
/dashboard/system/mitm-proxy.See
docs/frameworks/AGENTBRIDGE.md. (feat(mitm,inspector): AgentBridge + Traffic Inspector (planos 11+12 / Group A) #2858 — thanks @diegosouzapw)/dashboard/tools/traffic-inspector) — LLM-aware HTTPS debugger with4 capture modes (AgentBridge hook, Custom Hosts DNS, HTTP_PROXY :8080, System-wide proxy),
DevTools split UI, 7 detail tabs (Conversation, Headers, Request, Response, Timing, LLM Details,
Stats), resizable panels, session recording (.har/.jsonl export), SSE stream merger,
conversation normalizer (multi-provider), system-prompt fingerprint colorization, and annotations.
See
docs/frameworks/TRAFFIC_INSPECTOR.md.src/mitm/handlers/) —MitmHandlerBaseabstractclass with
hookBufferStart/hookBufferUpdatefor Traffic Inspector integration; concretehandlers for all 9 agents.
src/mitm/targets/) — declarativeMitmTargetshape per agent;emits
DATA_DIR/mitm/targets.jsonfor dynamicserver.cjsresolution.src/mitm/inspector/) —TrafficBufferin-memory ring,kindDetector,sseMerger(MIT port from chouzz/llm-interceptor),conversationNormalizer(MIT port),
contextKeyfingerprinting,httpProxyServer,systemProxyConfig.src/mitm/passthrough.ts) — TCP tunnel fornon-mapped hosts; bypass list with default sensitive-host patterns + user-defined patterns.
src/mitm/upstreamTrust.ts) —AGENTBRIDGE_UPSTREAM_CA_CERTforcorporate TLS environments.
src/mitm/maskSecrets.ts) — sk-/Bearer/generic token masking beforeany log or Traffic Inspector broadcast.
agent_bridge_state,agent_bridge_mappings,agent_bridge_bypass,inspector_custom_hosts,inspector_sessions,inspector_session_requests./api/tools/agent-bridge/(12 routes) and/api/tools/traffic-inspector/(16+ routes). All LOCAL_ONLY + SPAWN_CAPABLE.agentBridge.*andtrafficInspector.*namespaces;all other locales fall back to EN automatically.
tests/e2e/agent-bridge.spec.ts,tests/e2e/traffic-inspector.spec.ts,tests/e2e/agent-bridge-traffic-cross.spec.ts(skip-gated on CI by
RUN_AGENT_BRIDGE_E2E/RUN_TRAFFIC_INSPECTOR_E2E/RUN_CROSS_E2E).docs/frameworks/AGENTBRIDGE.mdanddocs/frameworks/TRAFFIC_INSPECTOR.md;docs/architecture/REPOSITORY_MAP.mdupdated;docs/reference/openapi.yamlupdated with~28 new routes and 20+ new schemas.
allowedQuotas),quotaShared-*routing models via combos, a 3-step pool wizard (legacy Plans page retired), endpoint + key preview, and full pool editing. Adds quota-pool DB migrations. (feat(monitoring,costs,quota): Monitoring reorg + Costs section + Quota Share Engine (planos 16+22) #2859 / feat(quota): Pool Groups + correções de enforcement → v3.8.8 (+ authz peer-IP, screen fixes) #3022 / feat(quota): Quota Share v2 — nav move, 3-col grouped layout, endpoints+key preview, full pool edit #3032 — thanks @diegosouzapw)/batch+/batch/filesredesign (feat(batch): functional & explanatory redesign for /batch + /batch/files #2849); Playground Studio + Search Tools Studio (feat(playground,search-tools): Playground Studio + Search Tools Studio (planos 17+18) #2869); memory engine redesign — sqlite-vec + hybrid RRF + Studio UI (feat(memory): memory engine redesign — sqlite-vec + hybrid RRF + Studio UI (plan 21) #2873). (thanks @diegosouzapw)notion_search,notion_list_databases,notion_get_database,notion_query_database,notion_read,notion_append_blocks) scoped underread:notion/write:notion, with dashboard "Context Sources" tab, settings API, and token persistence inkey_valuetable (feat(notion): add Notion MCP context source with 6 tools, dashboard tab, and 20 tests #2959 — thanks @branben)077_api_key_stream_default_mode), so integrations that expect non-streaming JSON work without client changes. (thanks @JxnLexn)Changed
agent-bridge and traffic-inspector items after cloud-agents.
/api/tools/agent-bridge/ and /api/tools/traffic-inspector/ added to LOCAL_ONLY_API_PREFIXES and SPAWN_CAPABLE_PREFIXES in src/server/authz/routeGuard.ts.
.env.example: documented 9 new env vars (AGENTBRIDGE_UPSTREAM_CA_CERT, INSPECTOR_BUFFER_SIZE, INSPECTOR_HTTP_PROXY_PORT, INSPECTOR_HTTP_PROXY_AUTOSTART, INSPECTOR_TLS_INTERCEPT, INSPECTOR_SYSTEM_PROXY_GUARD_MINUTES, INSPECTOR_MAX_BODY_KB, INSPECTOR_MASK_SECRETS, INSPECTOR_LLM_HOSTS_EXTRA, INSPECTOR_INTERNAL_INGEST_TOKEN).
Fixed
POST /api/providers/[id]/refresh (the manual/auto "refresh token" endpoint) no longer rotates rotating-refresh providers (Codex/OpenAI share one Auth0 client_id). This was the last unguarded proactive-refresh entry point: when the dashboard auto-refreshed every expiring connection on a page load (or an old cached frontend bulk-called it), each Codex account's single-use refresh_token was rotated, and Auth0 revoked the whole token family (openai/codex#9648), so every account but the last died with [403] <!DOCTYPE. The endpoint now skips proactive rotation for rotating providers and defers to the reactive, serialized 401 path (same guard as refreshAndUpdateCredentials and the connection-test route).
Codex multi-account setups. The quota-sync path (refreshAndUpdateCredentials) proactively refreshed every connection. For rotating-refresh providers (Codex/OpenAI share one Auth0 client_id) it refreshed siblings concurrently, so Auth0 revoked the whole token family (openai/codex#9648) and every account but the last died with [403] <!DOCTYPE html>. The quota path now skips proactive refresh for rotating providers (rotationGroupFor) and reuses the current access_token, deferring genuine expiry to the reactive, serialized 401 path. Defense in depth: serializeRefresh now leaves a settle gap between two queued sibling refreshes (default 2000 ms, tunable via CODEX_REFRESH_SPACING_MS, "0" to opt out) while releasing a lone refresh immediately, so the reactive path adds no latency.
in-memory override is set (fresh process before the boot hook ran, or a separate module instance in the standalone build), getPayloadRulesConfig now reads the DB-persisted rules (the source of truth) before the file config, instead of silently returning the empty file default. ([BUG] Payload Rules not persisting across server restart #2986)
targetFormatoverride (e.g. an opencode-go custom model that must use the Anthropic Messages
shape). Previously custom models always routed as OpenAI-compatible because
targetFormatwas neither persisted nor consulted at routing time. Threadedthrough
addCustomModel/replaceCustomModels/updateCustomModel, the APIschema/route,
getModelInfo, and chatCore's targetFormat resolution. ([BUG] opencode-go: custom models added via UI use wrong targetFormat (oa-compat) — custom model save appears to succeed but model still routes as OpenAI-compatible #2905)gen.pollinations.ai/v1instead of theretired
text.pollinations.aihost, which now returns404 "legacy API"forall models. The gen gateway is the current OpenAI-compatible endpoint. ([BUG] Pollinations provider returns 404 'legacy API' for all models #2987)
image_generationhosted tool forfree-plan Codex accounts (
workspacePlanType === "free"), which can't run itserver-side and would otherwise get an upstream 400. Paid plans keep it.
(mirrors CLIProxyAPI's free-plan guard; spun off from the [Feature] Add Codex OAuth Login Compatibility Mode for ChatGPT Account + API Routing #2980 analysis)
openai-compatible-*/anthropic-compatible-*)now show their user-given node name instead of the raw UUID id across the
active-requests panel, proxy logger, and home-page provider topology. The
display-label resolver was extracted into a shared util reused by all surfaces
(previously only the request-log viewer resolved it). ([BUG] Provider name not using given name, still using ID #2968)
CMD) now honorsOMNIROUTE_MEMORY_MB(default 512, clamped [64, 16384]) and overrides theimage
NODE_OPTIONSfallback, fixing random OOM crashes under load / withlarge SQLite DBs. Previously only
omniroute servehonored the knob. ([BUG] OOM error randomly with docker images #2939)webcompose profile (omniroute-web, targetrunner-web,image
omniroute:web) so web-cookie providers (gemini-web, claude-web,claude-turnstile) work out of the box — the default
baseimage ships withoutChromium/Playwright, which made those providers fail with
"Executable doesn't exist at .../ms-playwright/chromium...". ([BUG] Web cookies provider does not work in docker #2832)
account, a bare
gpt-5.5Responses request was rerouted to codex with themodel hardcoded to
gpt-5.5-medium(chatHelpers.ts); the executor read that-mediumsuffix as an explicitmodelEffortthat (per [BUG] GPT-5.5 Codex suffix aliases are overridden by client default reasoning.effort #2331) overrode aclient
reasoning.effort=xhigh, silently demoting it — now it keeps the baregpt-5.5id so the client effort wins. (B)gpt-5.5-xhigh/-high/-lowmisrouted to
openai(→ "No credentials" for codex-only users); the suffixedvariants are now in
CODEX_PREFERRED_UNPREFIXED_MODELSso they infer codex.const settingsdeclaration inhandleChatCore(introduced alongside the per-key stream-default-modefeature). The same-scope redeclaration made esbuild/tsx fail with
"The symbol 'settings' has already been declared", which turned every unit
test that imports chatCore red and broke the production build. The earlier
consolidated
settingsconst is now reused.077migration version collision(
077_api_key_stream_default_mode.sqlvs077_quota_pools.sql) that madegetMigrationFiles()throw and blockedgetDbInstance()at startup (app wouldnot boot; every DB-touching test was red). Renumbered the dependency-free,
idempotent
quota_poolsmigration to085, kept the non-idempotentapi_key_stream_default_modeALTERat077, added a retroactiveisSchemaAlreadyAppliedguard (case085), and a regression test enforcingunique migration prefixes.
big-pickle(provideropencode/ocand
opencode-zen) now declares the interleavedreasoning_contentcontractvia a new
RegistryModel.interleavedFieldfield, so follow-up/tool-use turnsreplay reasoning_content. Previously
big-picklematched no replay pattern andfailed with
[400] The reasoning_content in the thinking mode must be passed back to the API(its DeepSeek-thinking upstream is not detectable from themodel id, and
requiresReasoningReplaydoes not consumesupportsReasoning).getResolvedModelCapabilitiesnow surfaces the registryinterleavedField. ([BUG] OpenCode big-pickle fails DeepSeek thinking-mode reasoning_content replay #2900)models (
claude-opus-4.7,claude-opus-4-5-20251101,gemini-3.1-pro-preview,gemini-3-flash-preview) no longer carrytargetFormat: "openai-responses", sothey route through
chat/completions(the provider default, like the workingclaude-opus-4.6) instead of the Responses API, which Copilot does not serve fornon-OpenAI models (returned
[400]). Native OpenAIgpt-*models keep theResponses API. ([BUG] Built-in GitHub Copilot listed Claude Opus/Gemini models fail due to API format/routing #2911)
image_generationhostedtool into every Responses API request (even text-only ones), which OmniRoute
rejected with
[400] image_generation tool type is not supported. It is nowtreated like
tool_search: allowed past the tool-type validator and droppedsilently from the tools array before forwarding to Chat Completions. ([BUG] Responses API: Codex Desktop automatically sends image_generation tool for text-only requests #2950)
oc/routingalias instead of the
opencode/prefix.parseModel("opencode/<model>")resolves to the
opencode-zenapi-key tier (via a manualALIAS_TO_PROVIDER_IDoverride), so combos built with the bare provider id misrouted away from the
no-auth
opencodeprovider;oc/<model>resolves correctly. ([BUG] OpenCode Free combo entries use opencode/ prefix instead of oc/ #2901)403(e.g. Fireworks Fire Passfpk_*keys returning "…not authorized for this route." on/models, whilechat still works) no longer marks the connection unavailable. Provider
validation falls through to the chat probe for such 403s instead of returning
"Invalid API key", and
checkFallbackErrorshort-circuits them to no cooldown.Genuine auth failures (401 / generic 403) still fail fast. (Fire Pass (fpk_*) API keys incorrectly marked as unavailable when model listing returns 403 #2929)
and combos without an API key.
opencode-zenserves the public, signup-freeendpoint (
https://opencode.ai/zen/v1); when no api-key connection isconfigured, credential resolution now falls back to anonymous (no-auth) access
instead of failing with "No credentials for provider: opencode-zen". A
configured, active key is still used when present. ([BUG] Playground: The opencode free model cannot be used to test the conversation (Error: No credentials for the provider: opencode-zen) #2962)
[400] Messages with role 'tool' must be a response to a preceding message with 'tool_calls'when a Codexclient sent a
function_callwith an empty/missingcall_id. The orphanedfunction_call_outputpreviously slipped past the orphan filter. Nowempty-
call_idfunction calls are skipped (no dangling assistant tool_call)and any tool result without a matching tool_call id is dropped. ([BUG] Messages with role 'tool' must be a response to a preceding message with 'tool_calls' #2893)
proxiflynpm dependency (fix(deps): remove theproxiflynpm dependency #3000 — thanks @terence71-glitch)anthropic-betaflags from live captures #3010 — thanks @Tentoxa)/app/datapermission check instead ofexit 1, so a non-writable bind mount no longer kills the container at boot (fix(docker): warn-only on /app/data permission check, remove exit 1 #3036 — thanks @wussh)enforceScopesguard beforeMCP_TOOL_MAPlookup, add inlinescopesparameter towithScopeEnforcement(), and declare scopes on all 24 dynamic tool definitions (memory, skills, plugins, gamification, compression) to fix scope enforcement for dynamic MCP tool groups (fix(mcp): reorder enforceScopes guard before MCP_TOOL_MAP lookup, add scopes to all dynamic tool definitions #2958 — thanks @branben)of the live V8 heap ceiling (floor 400 MB) instead of a fixed 200 MB that sat below
the app's ~260 MB baseline and returned
503 Service temporarily unavailable due to resource pressurefor every request once the heap warmed up. It now tracks--max-old-space-sizeacross 1 GB / 2 GB / large VPS;HEAP_PRESSURE_THRESHOLD_MBstill overrides. (fix(sse): auto-calibrate heap-pressure threshold to the V8 heap ceiling #3052)
cf_clearance cookie support and session-pool fingerprint rotation for Pollinations / DuckDuckGo (fix(claude-web): cf_clearance + fix(pollinations/duckduckgo): wire session pool #3046 - thanks @oyi77)
🏆 Contributors
A special thanks to everyone who contributed code, reviews, and tests for this release (687 commits since v3.8.7):
@androw, @bobbyunknown, @branben, @charithharshana, @Chewji9875, @CitrusIce, @dangeReis, @dhaern, @diegosouzapw, @guanbear, @herjarsa, @JxnLexn, @Lion-killer, @makcimbx, @NekoMonci12, @NomenAK, @oyi77, @ReqX, @S0yora, @soyelmismo, @Tentoxa, @terence71-glitch, @wussh, @xz-dev