Merge release/v3.8.8 into main#3076

Closed
diegosouzapw wants to merge 711 commits into main from chore/merge-v3.8.8

Conversation

@diegosouzapw
Owner

Summary

Promotes the active release line v3.8.8 to main (709 commits ahead). Release is a superset of main's features (Trae, Notion MCP, SiliconFlow, OOM fixes, MaintenanceBanner), so the ~44 content conflicts were resolved to the release version; generated .source/* files (gitignored) were dropped.

Conflict resolution

  • Conflicts → release version (release is the superset / promotion source).
  • main-only content preserved (auto-merged, not in conflict): ProviderQuotaWidget (fix(home): pass providerId to quota widget icons #3064), combos/page.tsx, and 2 main-only tests. Verified the merge differs from release/v3.8.8 in only these items.

Post-merge reconciliation (commit fix: post-merge reconciliation)

  • CHANGELOG.md: restored release's authoritative changelog (auto-merge had re-introduced main's stale Unreleased section).
  • src/sse/services/auth.ts: ported fix(combo): align custom provider ids across creation and auth lookup #3058. getProviderSearchPool expands custom provider_nodes prefixes (e.g. 78code/gpt-5.4) to internal connection ids during credential lookup; release lacked this fix. nvidia alias special-casing preserved.
  • tests/unit/quota-plan-registry.test.ts: aligned knownProviders() count 6 → 10 (registry gained claude/deepseek/kimi-coding/xiaomi-mimo). Pre-existing stale assertion on release.

Validation (local, clean npm ci)

  • ✅ lint (0 errors), check:route-validation:t06, check:any-budget:t11, check:docs-sync, check:cycles, check:node-runtime
  • typecheck:core, npm run build
  • ✅ unit suite green (real failures fixed; remaining flakiness is the pre-existing quota/chatCore concurrency artifact under --test-force-exit, which passes in isolation)

🤖 Generated with Claude Code

diegosouzapw and others added 30 commits May 30, 2026 03:29
Conflicts: CLAUDE.md base; openapi union + i18n deep-merge (costsSection=Custos); .source regenerated (fumadocs-mdx, +5 docs); openapi.generated regenerated.

open-sse/mcp-server/server.ts: union — registers BOTH agentSkillTools (#2827) AND pluginTools (base plugin system) via two separate forEach loops; tool count sums both; skills handler keeps @ts-expect-error, plugins keeps @ts-ignore. server.ts type-safe (0 TS errors).
…s-pages-redesign

feat(skills): redesign agent-skills + omni-skills with dynamic 42-skill catalog + MCP/A2A discovery
…l redesign)

Conflicts: CLAUDE.md base; i18n en/pt-BR deep-merge — 3 apiManager keys resolved to base pt-BR translations (HEAD had stale EN), costsSection=Custos; .source --theirs+regenerated. 40 other locales auto-merged. No migrations/open-sse. Batch redesign confirmed complete in prior code review.
…-files-functional-redesign

feat(batch): functional & explanatory redesign for /batch + /batch/files
…n — sqlite-vec + RRF + Studio)

Conflicts: migration 073_memory_vec->083; localDb/.env/REPOSITORY_MAP union; request.ts->base; i18n auto-merged; .source regenerated (+3 docs).

openapi: --theirs base + surgically inserted 14 memory paths + 4 schemas + Memory tag via js-yaml extract (union-blind broke YAML structure). +938 lines, base formatting preserved, gen-openapi validates.

deps: @huggingface/transformers + sqlite-vec added (package.json); npm install ran, lock regenerated. chatCore auto-merged (memory + quota hooks coexist, transform OK). typecheck:core 0 errors.
…y-engine-redesign

feat(memory): memory engine redesign — sqlite-vec + hybrid RRF + Studio UI (plan 21)
…ols Studio — plans 17+18)

Conflicts: migration 076_playground_presets->084; localDb/.env union; REPOSITORY_MAP dedup; package.json keep base (version 3.8.7, coverage --functions 40); .source regenerated (+3 docs); deps cli-table3/wtfnode/@types/bun/uuid (npm install).

i18n pt-BR: 19 collisions — 18 playground keys -> HEAD pt-BR translations (base had untranslated EN: Send->Enviar, Cancel->Cancelar etc), costsSection -> base Custos. Rule: prefer side != en.json (translated).

openapi: --theirs base + 3 playground/search paths + 2 schemas + 1 tag (js-yaml surgical insert; union-blind breaks YAML). No open-sse, typecheck:core 0 errors.
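
The pt-BR rule stated above ("prefer side != en.json") can be sketched as a small pure function. This is only an illustration: the function name, its parameters, and the tie-breaking choice are assumptions, not the tooling actually used for the merge.

```typescript
// Hypothetical helper: pick the side of an i18n conflict that is actually translated.
// `enValue` is the string for the same key in en.json.
function resolveTranslationConflict(enValue: string, head: string, base: string): string {
  const headTranslated = head !== enValue;
  const baseTranslated = base !== enValue;
  if (headTranslated && !baseTranslated) return head;
  if (baseTranslated && !headTranslated) return base;
  // Both translated or both untranslated: falling back to base is an assumption.
  return base;
}
```

For the `Send` key mentioned above, `resolveTranslationConflict("Send", "Enviar", "Send")` picks the translated HEAD value.
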
…ound-search-tools

feat(playground,search-tools): Playground Studio + Search Tools Studio (plans 17+18)
Buffer Claude Code Read tool calls through the existing shim layer so empty pages placeholders are removed before streaming input_json_delta to the client. Also clean JSON-string Responses tool arguments, not only object arguments.

Closes #2935
Addresses #2889
Translate Claude Code web_search_YYYYMMDD server tools to the native OpenAI Responses web_search tool and preserve filters/location. Convert forced Claude tool_choice for web_search to the native Responses tool choice while leaving ordinary custom functions unchanged.

Closes #2936
Only translate Claude Code web_search_YYYYMMDD server tools to native Responses web_search when the final target is OpenAI Responses. Keep the Chat Completions target on function-tool shape and cover the full translateRequest path.
Keep existing object-argument cleanup behavior, but avoid parsing and stripping arbitrary JSON-string arguments for unrelated tools where empty strings or arrays may be valid payloads. Add regression coverage for non-Read and non-object Read arguments.
Remove the AuditLogTab from the dashboard logs page now that audit logs
live under the dedicated /dashboard/audit route. Update integration wiring
expectations and add metadata frontmatter to studio framework docs.
…ded in-memory caches

Root cause: Bottleneck rate limiter instances in rateLimitManager accumulate
without cleanup. Each instance runs an internal heartbeat setInterval every
250ms. Under heavy load with many provider:connection:model combinations,
hundreds of limiters accumulate causing CPU to grow ~0.1%/min until server
collapse (~2% after 5 minutes of intensive use).

Changes:
- rateLimitManager: Add idle limiter eviction in watchdogTick() using the
  previously defined but unused INACTIVE_LIMITER_MS threshold. Populate
  limiterLastUsed on every getLimiter() call. Clean up all 3 Maps
  (limiters, lastDispatchAt, limiterLastUsed) consistently.
- combo.ts: Add size-based FIFO eviction to rrCounters, resetAwareConnectionCache,
  and resetAwareQuotaCache Maps. Convert per-target log.info calls in combo
  execution loops to log.debug?.() to reduce serialization overhead.
- chatCore.ts: Fix double-serialization in estimateTokens(JSON.stringify(x))
  calls (estimateTokens already handles objects). Make trace() conditional
  on OMNIRROUTE_TRACE/DEBUG env vars. Make per-request usage logging conditional.
- apiKeyRotator.ts: Add eviction guards to _keyHealth and _connectionExtraKeys
  Maps (MAX 500 entries each). Ensure removeConnectionIndex cleans all 3 Maps.
- codexQuotaFetcher.ts: Add eviction guard to connectionRegistry and quotaCache
  Maps (MAX 200 entries each).
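
A minimal sketch of the two eviction strategies listed above: idle eviction driven by a last-used timestamp, and size-based FIFO eviction on a Map. The `Limiter` interface, the threshold value, and the `setBounded` helper are hypothetical stand-ins, not the project's rateLimitManager code.

```typescript
// Hypothetical stand-in for a rate limiter instance; only what the sketch needs.
interface Limiter {
  stop(): void;
}

const INACTIVE_LIMITER_MS = 10 * 60 * 1000; // assumed value, not taken from the source

const limiters = new Map<string, Limiter>();
const limiterLastUsed = new Map<string, number>();

function getLimiter(key: string, create: () => Limiter, now: number): Limiter {
  let limiter = limiters.get(key);
  if (!limiter) {
    limiter = create();
    limiters.set(key, limiter);
  }
  limiterLastUsed.set(key, now); // refreshed on every access
  return limiter;
}

// Meant to be called from a periodic watchdog tick.
function evictIdleLimiters(now: number): number {
  let evicted = 0;
  for (const [key, lastUsed] of limiterLastUsed) {
    if (now - lastUsed > INACTIVE_LIMITER_MS) {
      limiters.get(key)?.stop();
      limiters.delete(key); // keep both Maps consistent
      limiterLastUsed.delete(key);
      evicted++;
    }
  }
  return evicted;
}

// Size-based FIFO eviction: a Map iterates in insertion order,
// so the first key is the oldest entry.
function setBounded<K, V>(map: Map<K, V>, key: K, value: V, maxEntries: number): void {
  if (!map.has(key) && map.size >= maxEntries) {
    const oldest = map.keys().next().value;
    if (oldest !== undefined) map.delete(oldest);
  }
  map.set(key, value);
}
```

Passing `now` explicitly keeps the sketch deterministic; real code would more likely read the clock inside the watchdog.
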
…native Claude OAuth

Native Claude OAuth (claude->claude passthrough) forwards client tool
definitions verbatim. Anthropic's first-party Messages API then rejects:
  - invalid tool input_schemas (deep-truncation placeholders such as
    `enum: "[MaxDepth]"`, or index-keyed objects where arrays are required), and
  - tool names it fingerprints as a third-party agent harness (specific
    blacklisted names like `mixture_of_agents`, or a large enough set of
    recognizable snake_case agent tool names),
both surfaced as a misleading `400 You're out of extra usage` placeholder
(the SSE stream is refused — not a real billing event). The same request
succeeds on translator-backed providers (OpenAI/Codex), which already sanitize
and re-shape tool payloads — so the gap is specific to the native passthrough.

Adds the missing guards on the native Claude OAuth path (executors/base.ts):
  - sanitizeClaudeToolSchemas(): coerce/drop invalid draft-2020-12 constructs
    (non-array enum/required/anyOf/..., placeholder schema slots -> {}).
  - cloakThirdPartyToolNames(): deterministically alias non-Claude-Code tool
    names (Claude Code canonical mapping where one exists, else PascalCase),
    tracked in the existing per-request _toolNameMap so remapToolNamesInResponse
    restores the caller's original names. Opt out via
    CLAUDE_DISABLE_TOOL_NAME_CLOAK=true.

Genuine Claude Code tool names (PascalCase) and already-valid schemas are
left untouched, so existing first-party traffic is unaffected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
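
To illustrate the aliasing idea in this commit, here is a hedged sketch of deterministic tool-name cloaking with a reverse map. The function names, the `Tool` shape, and the single canonical alias are assumptions for illustration; collisions between aliases are not handled.

```typescript
interface Tool {
  name: string;
}

// Hypothetical canonical aliases; the real mapping lives in the project.
const CANONICAL_ALIASES = new Map<string, string>([["webfetch", "WebFetch"]]);

function toPascalCase(name: string): string {
  return name
    .split(/[_\-\s]+/)
    .filter((part) => part.length > 0)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join("");
}

// Returns cloaked copies plus a reverse map (alias -> original name).
function cloakToolNames(tools: Tool[]): { tools: Tool[]; reverse: Map<string, string> } {
  const reverse = new Map<string, string>();
  const cloaked = tools.map((tool) => {
    const alias = CANONICAL_ALIASES.get(tool.name) ?? toPascalCase(tool.name);
    if (alias === tool.name) return tool; // already PascalCase: left untouched
    reverse.set(alias, tool.name);
    return { ...tool, name: alias }; // clone instead of mutating the input
  });
  return { tools: cloaked, reverse };
}

// Applied to names seen in the response to restore the caller's originals.
function restoreToolName(alias: string, reverse: Map<string, string>): string {
  return reverse.get(alias) ?? alias;
}
```

Because the alias is derived only from the name, the same request always produces the same mapping, which is what lets the response side restore names from a per-request map.
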
Extract 3 high-value CPU/RAM optimizations from perf branch:

1. estimateSizeFast() — fast object-tree size estimator replacing
   JSON.stringify().length in isSmallEnoughForSemanticCache(). Walks
   object tree with a stack, zero string allocation, early exit at 256KB.

2. Consolidate settings reads — move getCachedSettings() to a single
   early read in handleChatCore(), eliminating a redundant second read
   200 lines later. Also removes the isDetailedLoggingEnabled() wrapper
   call (reads settings internally) in favor of direct field check.

3. Registry Proxy→direct export — convert 8 registries from lazy
   Proxy+getOrCreate pattern to simple exported const objects. Eliminates
   Proxy trap overhead on every provider property access during routing.
   Affected: audio, embedding, image, moderation, music, rerank, search,
   video registries (-451 lines of Proxy boilerplate).

These changes are independent of the CPU leak fix (limiter eviction)
and complement it by reducing per-request CPU overhead.
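
Item 1 above can be sketched as a stack-based walk with an early exit. The per-type byte costs below are rough assumptions, and the signature is a guess for illustration, not the project's actual `estimateSizeFast`.

```typescript
const MAX_BYTES = 256 * 1024; // early-exit threshold from the description above

// Approximate serialized size by walking the tree with an explicit stack.
// Returns as soon as the running total exceeds `limit`.
function estimateSizeFast(root: unknown, limit: number = MAX_BYTES): number {
  let total = 0;
  const stack: unknown[] = [root];
  while (stack.length > 0) {
    const value = stack.pop();
    if (typeof value === "string") {
      total += value.length + 2; // surrounding quotes
    } else if (typeof value === "number" || typeof value === "boolean") {
      total += 8; // rough constant, an assumption
    } else if (Array.isArray(value)) {
      total += 2; // brackets
      for (const item of value) stack.push(item);
    } else if (value !== null && typeof value === "object") {
      total += 2; // braces
      for (const [key, child] of Object.entries(value)) {
        total += key.length + 3; // quoted key plus colon
        stack.push(child);
      }
    } else {
      total += 4; // null, undefined and other leaves
    }
    if (total > limit) return total; // early exit: no need to finish the walk
  }
  return total;
}

function isSmallEnough(payload: unknown): boolean {
  return estimateSizeFast(payload) <= MAX_BYTES;
}
```

The result is an estimate, not the exact `JSON.stringify` length; that is acceptable when the only question is whether a payload is under a threshold.
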
… null-guards, docs

Follow-up commit on PR #2943 review:

- Preserve boolean schemas in `sanitizeClaudeToolSchemas` (Gemini Code Assist,
  high severity). `additionalProperties: false` is the canonical JSON Schema
  lock-down for object tools; the previous coercion silently turned it into the
  permissive `{}`, which would invite models to hallucinate extra arguments
  during tool calling. Same rule now applies to per-property boolean schemas
  under `properties`. Placeholder strings still get the permissive `{}` slot —
  booleans get preserved verbatim.

- Defensive null guards in `cloakThirdPartyToolNames` for `tools[]` and
  `messages[]` entries that might be `null`/`undefined`. Prevents a runtime
  `TypeError` if a malformed payload reaches the cloak.

- Document `CLAUDE_DISABLE_TOOL_NAME_CLOAK` in `.env.example` and
  `docs/reference/ENVIRONMENT.md` (env/docs contract was failing in CI).

- Regression tests covering all of the above (5 boolean preservation cases,
  2 null-tolerance cases).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
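
The boolean-preservation rule can be sketched as follows. This is a simplified illustration under stated assumptions: the placeholder pattern, the set of keywords visited, and the decision to drop non-array `enum`/`required` are guesses, and the real `sanitizeClaudeToolSchemas` covers more constructs.

```typescript
// Assumption: truncation placeholders look like "[MaxDepth]".
function isPlaceholder(value: unknown): boolean {
  return typeof value === "string" && /^\[[A-Za-z]+\]$/.test(value);
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Sanitize a position that expects a subschema.
function sanitizeSubschema(slot: unknown): unknown {
  if (typeof slot === "boolean") return slot; // `false` must stay a lock-down
  if (isPlaceholder(slot)) return {}; // placeholder -> permissive schema
  if (!isPlainObject(slot)) return {}; // other non-schema values: an assumption
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(slot)) {
    if (key === "properties" && isPlainObject(value)) {
      const props: Record<string, unknown> = {};
      for (const [name, child] of Object.entries(value)) {
        props[name] = sanitizeSubschema(child);
      }
      out[key] = props;
    } else if (key === "additionalProperties" || key === "items") {
      out[key] = sanitizeSubschema(value);
    } else if ((key === "enum" || key === "required") && !Array.isArray(value)) {
      // Non-array enum/required is invalid; this sketch simply drops it.
    } else {
      out[key] = value; // other keywords pass through unchanged
    }
  }
  return out;
}
```

With this shape, `additionalProperties: false` survives the pass while a `"[MaxDepth]"` property schema becomes `{}`.
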
…ecutor path

The native Claude OAuth guard in executors/base.ts is bypassed when
`upstream_proxy_config.mode = cliproxyapi` routes the request through the
CliproxyAPI executor — it has its own execute()/transformRequest() and never
reaches BaseExecutor.execute(), so the cloak/sanitizer never ran for that
(common) deployment. Wire the same guards into
CliproxyapiExecutor.transformRequest (Anthropic-shape branch), composing with
the existing bisected `mcp_*` reserved-namespace rewrite:

- sanitizeClaudeToolSchemas() on transformed.tools.
- cloakThirdPartyToolNames() with skip = mcp-reserved, so applyMcpToolNameRewrite
  keeps authority over `mcp_*` (its bisected `Mcp_X` form) and the two reverse
  maps stay disjoint / single-hop. Both merge into the non-enumerable
  _toolNameMap the response stream already uses to restore the caller's names.

cloakThirdPartyToolNames is now non-mutating (clones changed entries) to respect
transformRequest's no-input-mutation contract, and takes an optional `skip`
predicate.

Verified end-to-end through the live CPA path: a real ~100-tool harness payload
that returned the "out of extra usage" placeholder now returns 200 with original
tool names restored on the response stream; `mcp_*` tools and genuine PascalCase
Claude Code tools are unaffected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t boundary

The /dashboard/tools/agent-bridge page (Server Component) passed ALL_TARGETS
directly to AgentBridgePageClient (a Client Component). Each MitmTarget carries
a `handler: () => Promise<...>` function, which Next.js forbids across the
Server/Client boundary, raising at SSR time:
  "Functions cannot be passed directly to Client Components ..."
This broke the whole page ("erro ao carregar").

Fix: introduce MitmTargetView = Omit<MitmTarget, "handler"> and pass a
sanitized array (ALL_TARGETS.map(({ handler, ...rest }) => rest)). The UI never
invokes handler, so behavior is unchanged. Adds a regression test asserting the
sanitized targets are function-free and JSON-serializable.
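
A compact sketch of the fix described above. Only `MitmTarget`, `handler`, `MitmTargetView` and `ALL_TARGETS` come from the commit text; the other fields and the sample entry are hypothetical.

```typescript
interface MitmTarget {
  id: string; // hypothetical field
  label: string; // hypothetical field
  handler: () => Promise<void>; // functions cannot cross the Server/Client boundary
}

type MitmTargetView = Omit<MitmTarget, "handler">;

// Sample data for the sketch only.
const ALL_TARGETS: MitmTarget[] = [
  { id: "example", label: "Example target", handler: async () => {} },
];

// Strip the function before handing the data to a Client Component.
function toTargetViews(targets: MitmTarget[]): MitmTargetView[] {
  return targets.map(({ handler, ...rest }) => rest);
}

const targetViews = toTargetViews(ALL_TARGETS);
```

Typing the prop as `MitmTargetView[]` also makes the compiler reject any later attempt to pass the full targets to the client.
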
… round)

Addresses confirmed findings from an adversarial review of the prior commits:

- schema sanitizer: a truncation placeholder in a SCALAR annotation keyword
  (description/title/pattern/format) was coerced to {}, which is itself invalid
  draft-2020-12 and re-triggered the exact "input_schema is invalid" 400 the
  sanitizer exists to prevent. Placeholders are now only coerced to {} in
  subschema-expecting positions; scalar keywords are left untouched.
- schema sanitizer: numeric-string coercion is folded into
  stripInvalidSchemaConstructs so it also covers contains / propertyNames /
  additionalItems (which coerceSchemaNumericFields never visited).
- schema sanitizer: stop stripping the valid `default` keyword on the Claude
  native/passthrough surface (the #1782 default-strip is a translator concern;
  tool schemas here were previously forwarded verbatim). sanitizeClaudeToolSchema
  is now a single stripInvalidSchemaConstructs pass.
- tool-name cloak: consult TOOL_RENAME_MAP / EXTRA_TOOL_RENAME_MAP before the
  generic PascalCase fallback, so the CLIProxyAPI path uses the established
  fingerprint-evasion aliases (subagents->SubDispatch, session_status->CheckStatus,
  webfetch->WebFetch, ...) identically to the native path instead of weaker
  first-letter casing.
- kill-switch: CLAUDE_DISABLE_TOOL_NAME_CLOAK is now honoured inside
  cloakThirdPartyToolNames, so BOTH the native and CLIProxyAPI executor paths
  respect it (previously only base.ts did); .env.example + ENVIRONMENT.md updated.

Regression tests added for each. Verified end-to-end through the live CPA path:
mixture_of_agents, subagents, and a tool carrying placeholder descriptions and
`default` values all return 200 with original names restored on the response.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
oyi77 and others added 23 commits June 1, 2026 22:29
* chore: remove 9 dead/unreachable free providers

Verified via HTTP probe — API endpoints return 000/404/empty:
- freetheai, enally, replicate, lepton, poolside, nomic
- astraflow, petals, nanobanana (phantom: catalog but no registry)

Also removed from: providerRegistry.ts, validation.ts,
staticModels.ts, imageValidation.ts, open-sse/config/petals.ts

* chore: remove dead astraflow providers

Remove astraflow and astraflow-cn (UCloud) — API endpoints unreachable.
Remaining dead providers (enally, freetheai, nanobanana, replicate,
lepton, petals, poolside, nomic) have working main sites but dead API
endpoints — need API keys. Will remove in follow-up.

* chore: remove 9 dead/unreachable free providers

Removed: freetheai, enally, replicate, lepton, poolside, nomic,
astraflow, petals, nanobanana

All verified as dead via live API probes (000/404/empty responses).
Cleaned from providers.ts, providerRegistry.ts, validation.ts,
staticModels.ts, and imageValidation.ts.

---------

Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
…on fixes (#3056)

Co-authored-by: Ruslan Sivak <russ@ruslansivak.com>
…sts)

#3054 ("remove 9 dead/unreachable free providers") removed the petals/nanobanana
configs, registry entries and validators but left dangling references that broke
the build and the unit suite on release/v3.8.8:

- open-sse/executors/petals.ts imported the deleted ../config/petals.ts
  (webpack "Module not found" → `next build` failed). Removed the executor, its
  registration + re-export in executors/index.ts, and the leftover
  `providerId === "petals"` branch in providerAllowsOptionalApiKey.
- Removed tests for the now-deleted providers: executor-petals.test.ts and
  poolside-provider.test.ts (REGISTRY.poolside was removed), and the petals /
  nanobanana validator assertions in provider-validation-specialty.test.ts,
  plus the stale petals catalog assertions in providers-page-utils.test.ts,
  proxy-connection-test.test.ts and providers-route-managed-catalog.test.ts.

The image/video/embed registries for nanobanana/replicate/nomic are real and
untouched — only the dead chat/api-key surfaces were removed. 146/146 affected
tests pass; typecheck / build clean.
…ropic + collapse)

Bugs found while testing the Quota Share engine on the local VPS:

- B1 hidden/stuck pools: pools created while the page group filter was "all"
  were persisted with group_id="all", matched no real group, and rendered
  nowhere — so they could not be seen, edited or deleted. PoolWizard now resolves
  the group id away from the "all" sentinel before POST/PATCH (falls back to the
  first real group / seed group-demo), and QuotaSharePageClient renders an
  "Ungrouped" recovery bucket so already-orphaned pools stay editable/deletable.
- B3 one-connection-per-pool made explicit: existingPoolConnectionIds now spans
  every member connection (not just the primary), and the wizard shows which pool
  an already-used connection belongs to instead of silently disabling it.
- B4 delete group: wired the missing UI control + handler (handleDeleteGroup,
  409-aware) — the backend DELETE handler + deleteGroup already existed. Hidden for
  "all" and the protected seed group-demo.
- B5a endpoints card now surfaces the native Anthropic POST /v1/messages line when
  a claude*/anthropic provider is in scope (previously only /v1/chat/completions).
- B5b endpoints card gained a collapse/minimize toggle (the card was too tall).

Source-scan tests + en/pt-BR i18n parity in quota-share-bugfixes-v388.test.ts.
The larger quota-key redesign (key type bound to a group, default-restricted with
opt-in normal-model access, recoverable keys, api-keys page layout) is planned
separately in _tasks/features-v3.8.8/quota-share-key-redesign.plan.md.
Remove the Petals executor from registration and exports.

Improve type safety by replacing broad any usage in MCP tool registration
with inferred types and documenting dynamic handler type limitations.

Add request validation for the agent bridge cert route and expand tests to
ensure switch buttons explicitly declare type="button", preventing implicit
form submissions.
…on fixes (#3059)

* fix(sse): defer enqueuing of event lines to align event names with data lines and prevent stop-signal event name misattribution

* fix(sse): preserve keep-alives and prevent pending event leakage on dropped chunks

* fix(sse): preserve pending event lines before other non-data lines and fix zero-window-size bypass

* fix(sse): defer lastEventLine update until after flush check to preserve previous event context on flush

* fix(sse): flush trailing pendingEventLine when stream closes

* fix(sse): preserve consecutive event lines without intervening data

---------

Co-authored-by: Ruslan Sivak <russ@ruslansivak.com>
…3061) (#3062)

No-auth / keyless providers (opencode, opencode-zen) returned synthetic
"noauth" credentials BEFORE honoring excludeConnectionIds, so the chat
account-fallback loop re-selected the same synthetic connection forever on
a persistent upstream error (e.g. the opencode public endpoint answering
401 "Model X is not supported"). The synthetic id has no DB row, so
markAccountUnavailable could not persist a cooldown to brake it — each
iteration wrote key-health + request logs immediately, growing the DB until
the disk filled (see @paraflu's "failure #320" trace in discussion #3038).

Honor the exclusion set in both synthetic-credential paths
(getProviderCredentials NOAUTH_PROVIDERS block + opencode-zen keyless
fallback): once "noauth" is already excluded, return null so the handler
stops after a single attempt. The happy path (nothing excluded -> synthetic
noauth) is preserved, so keyless access still works.

Closes #3061.

Tests (TDD): tests/unit/auth-noauth-fallback-loop-3061.test.ts — the two
exclusion cases failed before the fix and pass after; two happy-path guards
ensure first-selection synthetic noauth still resolves.
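
The loop and its brake can be sketched roughly as below. All names, the credential shape, and the simplified fallback loop are assumptions made for illustration; the real code paths are `getProviderCredentials` and the opencode-zen keyless fallback.

```typescript
interface Credentials {
  connectionId: string;
  apiKey: string | null;
}

const NOAUTH_ID = "noauth";
// Hypothetical set; the source names opencode and opencode-zen as keyless providers.
const KEYLESS_PROVIDERS = new Set(["opencode", "opencode-zen"]);

function getSyntheticCredentials(
  provider: string,
  excludeConnectionIds: ReadonlySet<string>
): Credentials | null {
  if (!KEYLESS_PROVIDERS.has(provider)) return null;
  // The fix: honor the exclusion set before returning the synthetic connection.
  if (excludeConnectionIds.has(NOAUTH_ID)) return null;
  return { connectionId: NOAUTH_ID, apiKey: null };
}

// Simplified account-fallback loop: retries with the failed connection excluded.
function runWithFallback(
  provider: string,
  attempt: (creds: Credentials) => boolean,
  maxAttempts = 10
): number {
  const excluded = new Set<string>();
  let attempts = 0;
  while (attempts < maxAttempts) {
    const creds = getSyntheticCredentials(provider, excluded);
    if (!creds) break; // nothing left to try
    attempts++;
    if (attempt(creds)) break;
    excluded.add(creds.connectionId);
  }
  return attempts;
}
```

With the exclusion check in place, a persistently failing upstream yields a single attempt instead of cycling on the same synthetic id.
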
… not placeholders

The "Available endpoints" card's no-key (default) view generated representative
model ids from a hardcoded PREVIEW_MODELS_BY_PROVIDER map, so providers absent
from that map (claude, xiaomi-mimo, kimi-coding) rendered fake "model-a/b/c"
placeholders. It now fetches the REAL minted qtSd/* combos from /api/combos,
parses them (parseQuotaModelName), and groups by group → provider — falling back
to the placeholder map only when the fetch fails or returns nothing. The per-key
view already showed real models via /api/quota/keys/[id]/models; this aligns the
default view with it.

Verified on the local VPS: an exclusive key (share01) returns ONLY the real qtSd
models of its groups (claudao + chinas) and a non-quota key returns []. The
remaining /v1/models leak (non-quota keys still see qtSd among all models) is
tracked in the quota-key redesign plan.
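
A hedged sketch of the parse-and-group step, assuming the `qtSd/<group>/<provider>/<model>` shape mentioned elsewhere in this PR. The function names and return types are illustrative and not the project's `parseQuotaModelName`.

```typescript
interface QuotaModel {
  group: string;
  provider: string;
  model: string;
}

// Assumed format: qtSd/<group>/<provider>/<model>, where <model> may contain slashes.
function parseQuotaModel(name: string): QuotaModel | null {
  const parts = name.split("/");
  if (parts.length < 4 || parts[0] !== "qtSd") return null;
  const [, group, provider, ...rest] = parts;
  const model = rest.join("/");
  if (!group || !provider || model === "") return null;
  return { group, provider, model };
}

// Group model ids by group, then by provider; non-quota names are skipped.
function groupQuotaModels(names: string[]): Map<string, Map<string, string[]>> {
  const byGroup = new Map<string, Map<string, string[]>>();
  for (const name of names) {
    const parsed = parseQuotaModel(name);
    if (!parsed) continue;
    const byProvider = byGroup.get(parsed.group) ?? new Map<string, string[]>();
    const models = byProvider.get(parsed.provider) ?? [];
    models.push(parsed.model);
    byProvider.set(parsed.provider, models);
    byGroup.set(parsed.group, byProvider);
  }
  return byGroup;
}
```

An empty result from this grouping is the natural trigger for the placeholder fallback described above.
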
…d, plan presets

- Beta banner scoped to the Quota Share page (functional-but-bugs-expected) with a
  pre-filled "open an issue" link (labels quota-share,beta). Page-only.
- Endpoints card now also surfaces POST /v1/responses (codex/github) and the
  codex-only WS /v1/responses line (the Responses-over-WebSocket proxy), each gated
  on the in-scope provider slug.
- planRegistry: seed xiaomi-mimo (4.1B-token weekly "lite" cap) and kimi-coding so the
  PoolWizard "Limite" step pre-fills a fair-share limit for these no-balance-API
  providers (fair-share enforces from the proxy's own token count, not an upstream
  balance — set the real cap manually in step 2).
- docs(API_REFERENCE): document the codex Responses-over-WebSocket endpoint.
- i18n en/pt-BR for all new keys.

Tracked in _tasks/features-v3.8.8/quota-share-key-redesign.plan.md (codex-WS config
toggle + per-provider balance fetchers + %-quota attribution are planned follow-ups).
Claude Code (Pro/Max) is a percentage-of-plan quota (5h rolling + weekly cap,
shared Claude+Code); exact token caps are unpublished/task-variable so percent is
the practical unit. Unblocks the PoolWizard 'Limite' pre-fill for claude pools.
Researched plan structures (codex/claude/glm/kimi/minimax/xiaomi) captured in the
quota-share redesign plan.
…n tiers

- xiaomi-mimo: token plan is MONTHLY (per platform.xiaomimimo.com/token-plan), so
  the seed is now tokens/monthly/4.1B (was weekly).
- deepseek: prepaid in USD — its balance API is already wired (deepseekQuotaFetcher)
  and the fair-share engine supports the usd unit (COUNTABLE_UNITS). Seeded a
  usd/monthly preset so the limit is set by dollar value.
- minimax: documented the real M3 tiers (Plus ~1.633B/Max ~5.053B/Ultra ~9.796B)
  in-comment; EPSILON keeps it manual until tier-aware presets land.
- planRegistry already seeds codex/claude/glm/minimax/kimi/kimi-coding/xiaomi-mimo/
  deepseek/bailian/alibaba; PoolWizard 'Limite' step stays editable.

Researched plan structures + the tier-aware-preset follow-up are in the redesign plan.
…ridge-secret auth

Two bugs made `wscat ws://host/v1/responses` fail with
"Transfer-Encoding can't be present with Content-Length":

1. authz/management policy 401'd the proxy's own internal authenticate/prepare
   loopback call to /api/internal/codex-responses-ws (MANAGEMENT-classified, the
   per-process bridge secret wasn't recognized one layer up). Added a tightly-scoped
   carve-out: isValidWsBridgeRequest() honors a timing-safe sha256 match of
   OMNIROUTE_WS_BRIDGE_SECRET (x-omniroute-ws-bridge-secret header) for that exact
   internal path; the route still re-validates the secret. → auth now succeeds → 101.

2. On auth failure the proxy spread the internal fetch's response headers onto the
   raw upgrade socket — a chunked Transfer-Encoding + Next CSP/route-class headers
   collided with writeHttpError's Content-Length framing (and duplicated Content-Type
   via a case-mismatched spread). writeHttpError now strips framing + pipeline/security
   headers (case-insensitive), and the auth-fail callsite no longer forwards them.

Regression test: tests/unit/responses-ws-proxy-headers.test.mjs (exports writeHttpError;
asserts no TE+CL, single Content-Type, no CSP/route-class leak, safe headers forwarded).
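
The header-stripping behaviour in point 2 can be sketched as a pure function that builds the raw response text. The stripped-header list, the function name, and the ASCII-only body are assumptions; the real `writeHttpError` writes to a socket and may differ.

```typescript
// Headers that must not be copied onto a raw upgrade-socket error response.
// The exact list is an assumption based on the commit text.
const STRIPPED_HEADERS = new Set([
  "transfer-encoding",
  "content-length",
  "content-type",
  "connection",
  "content-security-policy",
]);

// Builds the raw HTTP/1.1 error text. Assumes an ASCII body so that
// string length equals byte length.
function buildHttpError(
  status: number,
  reason: string,
  body: string,
  extraHeaders: Record<string, string> = {}
): string {
  const lines = [`HTTP/1.1 ${status} ${reason}`];
  for (const [name, value] of Object.entries(extraHeaders)) {
    if (STRIPPED_HEADERS.has(name.toLowerCase())) continue; // case-insensitive match
    lines.push(`${name}: ${value}`);
  }
  lines.push("Content-Type: text/plain; charset=utf-8");
  lines.push(`Content-Length: ${body.length}`);
  lines.push("Connection: close");
  return lines.join("\r\n") + "\r\n\r\n" + body;
}
```

Lower-casing before the lookup is what prevents a `content-type` / `Content-Type` pair from both reaching the wire.
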
…2-table layout)

The key list stacked many badges in one column (tall/cluttered) and didn't
distinguish quota keys. Now renders two sections — "Normal keys" and "Quota keys"
(purple QUOTA pill) — sharing the same compact table header via an extracted
renderKeyRow(). Quota rows prepend a qtSd-only mode chip + group-name chips
(resolved by fetching /api/quota/pools + /api/quota/groups → poolId→group map).
Empty sections are hidden. i18n en/pt-BR for the new labels.

Source-scan test + i18n parity in api-manager-quota-keys-section.test.ts.
…(Check 2.9)

E2E testing on the VPS showed a normal key (empty allowedQuotas) could call a
qtSd/<group>/<provider>/<model> virtual model and route through a shared quota
pool — because the quota-exclusive enforcement (Check 3) only ran when
allowedQuotas was non-empty, so an unallocated key fell through to the normal
model checks and qtSd was served. This is the "empty allowedQuotas = all pools"
gap from the redesign.

Add Check 2.9 in enforceApiKeyPolicy: if the requested model is a qtSd model and
the key is NOT allocated to any quota pool (allowedQuotas empty), reject 403
QUOTA_NOT_ALLOCATED. Allocated keys are unchanged (Check 3 still validates scope).
This matches the owner's rule: only a key selected in a pool may use its qtSd
models. Normal (non-qtSd) model access for normal keys is unchanged.

Test: tests/unit/apikeypolicy-quota-only.test.ts — new case asserts a non-quota
key is blocked from qtSd (QUOTA_NOT_ALLOCATED) yet still uses normal models.
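
The new check reduces to a small predicate, sketched here under assumptions: the policy and result shapes are hypothetical, and only the `qtSd` prefix, the empty `allowedQuotas` condition, the 403 status, and the `QUOTA_NOT_ALLOCATED` code come from the commit text.

```typescript
interface ApiKeyPolicy {
  allowedQuotas: string[];
}

type PolicyResult = { ok: true } | { ok: false; status: number; code: string };

const QUOTA_MODEL_PREFIX = "qtSd/";

// Sketch of "Check 2.9": a key with no quota allocation may not call qtSd models.
function checkQuotaAllocation(key: ApiKeyPolicy, model: string): PolicyResult {
  if (model.startsWith(QUOTA_MODEL_PREFIX) && key.allowedQuotas.length === 0) {
    return { ok: false, status: 403, code: "QUOTA_NOT_ALLOCATED" };
  }
  return { ok: true }; // later checks still validate scope for allocated keys
}
```

Running this before the scope check is what closes the gap where an empty `allowedQuotas` was treated as "all pools".
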
…ota sync

The quota-sync path deliberately reuses a rotating-refresh provider's (Codex/
OpenAI/Claude — see refreshSerializer ROTATION_LOCK_GROUP) access_token WITHOUT
proactively refreshing it (#3019, to avoid the Auth0 family-revocation cascade).
When that token is expired the codex usage fetch returns "token expired", and
syncExpiredStatusIfNeeded then flagged the connection testStatus="expired" — a
false-negative: the credential is still valid (expires_at in the future) and the
reactive serialized 401 path refreshes the access_token on next use.

Symptom: freshly-added Codex accounts showed "expired" with no quota on the
quota page, while a providers-page refresh turned them green. They never lost
access — only the quota sync mislabeled them.

Fix: extract the decision into the pure, exported `quotaPathShouldMarkExpired()`
and skip rotating providers (rotationGroupFor !== null). Their status is owned by
the reactive path / connection test, never the quota sync. Adds unit coverage.
…ialized refresh)

Symptom: freshly-added Codex accounts (e.g. davi/gabriel) showed "No quota data"
even when healthy. Root cause: the quota path reuses the access_token without
refreshing rotating providers (#3019, anti Auth0 family-revocation cascade), so a
Codex account whose short-lived access_token has expired can never surface quota
from the sync — the live fetch returns "Codex token expired".

Fix (opt-in, cascade-safe):
- refreshAndUpdateCredentials gains `allowRotatingRefresh` + a pure exported gate
  `shouldAttemptRotatingRefresh`. The actual token mint is wrapped in
  `serializeRefresh` (one refresh at a time per Auth0 rotation group) — so even N
  concurrent per-account requests can never refresh siblings in parallel.
- The BULK scheduler (syncAllProviderLimits, concurrent) keeps the flag OFF →
  #3019 fully preserved (guardian test codex-quota-sync-no-proactive-refresh stays
  green). Only the on-demand, per-connection path (`GET /api/usage/[connectionId]`)
  opts in.
- Frontend: the quota page auto-fetches LIVE on open for the VISIBLE connections
  that have no cached quota (scoped to what's on screen — not all connections —
  and skips entries already cached), so expired-token Codex accounts surface real
  quota automatically and cascade-safely.

Adds unit coverage for the gate (bulk skips rotating, on-demand allows; non-rotating
always eligible). typecheck / lint clean.
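
A minimal sketch of the two pieces described above: the opt-in gate and per-group serialization via a promise chain. The signatures are assumptions for illustration; the project's `shouldAttemptRotatingRefresh` and `serializeRefresh` may take different arguments.

```typescript
// Gate: non-rotating providers are always eligible; rotating providers
// are refreshed only when the caller opts in.
function shouldAttemptRotatingRefresh(isRotating: boolean, allowRotatingRefresh: boolean): boolean {
  return !isRotating || allowRotatingRefresh;
}

// Tail of the pending chain per rotation group. Stored tails never reject.
const refreshTails = new Map<string, Promise<unknown>>();

// Runs `task` only after every earlier task for the same group has settled.
function serializeRefresh<T>(group: string, task: () => Promise<T>): Promise<T> {
  const previous = refreshTails.get(group) ?? Promise.resolve();
  const run = previous.then(() => task());
  refreshTails.set(group, run.catch(() => undefined)); // keep the chain alive on failure
  return run;
}
```

Chaining on the previous tail means concurrent callers in the same group queue up instead of minting tokens in parallel, while different groups proceed independently.
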
…c mitm manager stub

The Docker image build (`docker compose --profile cli build`) runs `next build`
with OMNIROUTE_USE_TURBOPACK=1 and failed with two Turbopack errors that the
webpack-based VM build never hits — which is why the VM deploy validated but the
Docker build errored (#3066). The reporter's log was truncated before the real
errors; reproducing `OMNIROUTE_USE_TURBOPACK=1 npm run build` locally surfaced them:

1. node_modules/sqlite-vec-linux-x64/vec0.so — "Unknown module type". sqlite-vec
   ships a native vec0.so loaded at runtime via createRequire(); Turbopack tried to
   bundle the .so. Fixed by adding "sqlite-vec" to serverExternalPackages, exactly
   like better-sqlite3.

2. /api/tools/agent-bridge/state statically imports getAllAgentsStatus from
   @/mitm/manager, which next.config aliases to manager.stub.ts for the Turbopack
   build. The stub did not export getAllAgentsStatus → "Export getAllAgentsStatus
   doesn't exist in target module". Added the export (throws like the other heavy
   ops — MITM/agent-bridge is non-functional in the bundled build anyway).

Tests (tests/unit/next-config.test.ts):
- assert sqlite-vec is in serverExternalPackages.
- new guard: manager.stub.ts must export every name statically imported from
  @/mitm/manager across src/app (catches stub/manager drift — would have caught this).

Verified: OMNIROUTE_USE_TURBOPACK=1 npm run build → EXIT 0 (was: Build error
occurred); webpack build → EXIT 0; typecheck:core / check:cycles / lint clean.

Fixes #3066
… (review feedback)

Follow-up to 146244b (#3066), addressing optional review suggestions:

- manager.stub.ts: getAllAgentsStatus now returns [] (the truthful "no agents"
  state, type-faithful) instead of throwing. Unlike the dynamic-import heavy ops,
  this is a STATIC import baked into the Turbopack/bundled build, so it is
  legitimately reached at runtime there — returning an empty list degrades
  gracefully instead of erroring. (Functionally inert for the existing
  agent-bridge/state route, where getMitmStatus already rejects first.)

- next-config.test.ts: the stub-drift guard no longer hard-asserts a specific
  symbol (getAllAgentsStatus); the generic ">=1 import found" sanity plus the
  missing-exports check remain, so the guard survives an agent-bridge /
  traffic-inspector route being renamed or removed.

typecheck:core / lint / next-config suite (4/4) clean. The export still exists,
so the Turbopack build resolution is unchanged.
… review)

Addresses findings from the multi-agent PR review of the #3066 fix:

- manager.stub.ts comments: the previous inline comment claimed the throwing ops
  (getMitmStatus/startMitm/stopMitm) are "dynamic-import paths that should never hit
  the stub at runtime" — factually wrong: those are static imports too, baked into the
  bundled build just like getAllAgentsStatus. Rewrote the file header to describe the
  real split — exports with a safe degraded value return it (getCachedPassword/
  setCachedPassword/clearCachedPassword → null/no-op, getAllAgentsStatus → []) while
  getMitmStatus/startMitm/stopMitm throw STUB_ERROR — and trimmed the inline comment.
  Comment-only; no runtime/build change (the export still exists).

- stub-drift guard test: now scans ALL of src/ instead of only src/app —
  src/lib/tailscaleTunnel.ts statically imports getCachedPassword/setCachedPassword
  from @/mitm/manager and is pulled into routes transitively, so the src/app-only scan
  had a false-negative blind spot. Also skips inline `type` imports (erased at build,
  need no runtime export) and detects stub exports from declaration AND `export { … }`
  forms (no false-positive if the stub later uses class/re-export).

Verified: next-config suite 4/4, typecheck:core / lint clean.
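
A drift guard of the kind described above might look like the sketch below. It is an assumption-laden illustration rather than the test in the PR: the function names are hypothetical, and the regexes only handle named imports and the two export forms mentioned in the commit message.

```typescript
// Collect runtime (non-type) names imported from `specifier` in one source file.
function runtimeImports(source: string, specifier: string): string[] {
  const names: string[] = [];
  const importRe = /import\s+(type\s+)?\{([^}]*)\}\s+from\s+["']([^"']+)["']/g;
  let m: RegExpExecArray | null;
  while ((m = importRe.exec(source)) !== null) {
    // `import type { … }` is erased at build time and needs no runtime export.
    if (m[1] || m[3] !== specifier) continue;
    for (const raw of (m[2] ?? "").split(",")) {
      const item = raw.trim();
      // Skip empty entries and inline `type` imports.
      if (item === "" || item.startsWith("type ")) continue;
      // For `a as b`, the module must export `a`.
      names.push(item.split(/\s+as\s+/)[0] ?? item);
    }
  }
  return names;
}

// Collect names exported via declarations and via `export { … }` lists.
function exportedNames(stub: string): Set<string> {
  const names = new Set<string>();
  const declRe = /export\s+(?:async\s+)?(?:function|const|let|class)\s+([A-Za-z_$][\w$]*)/g;
  const listRe = /export\s*\{([^}]*)\}/g;
  let m: RegExpExecArray | null;
  while ((m = declRe.exec(stub)) !== null) names.add(m[1] ?? "");
  while ((m = listRe.exec(stub)) !== null) {
    for (const raw of (m[1] ?? "").split(",")) {
      const item = raw.trim();
      if (item === "") continue;
      // For `a as b`, the exported name is `b`.
      const parts = item.split(/\s+as\s+/);
      names.add(parts[parts.length - 1] ?? item);
    }
  }
  return names;
}

// Names that some file imports at runtime but the stub does not export.
function missingExports(source: string, specifier: string, stub: string): string[] {
  const exported = exportedNames(stub);
  return runtimeImports(source, specifier).filter((name) => !exported.has(name));
}
```

A real guard would run missingExports over every file under src/ and assert that the combined result is empty.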
…emory in Docker)

Completes the #3066 fix. Externalizing sqlite-vec unblocked the Turbopack build, but
Next.js does not trace sqlite-vec's platform-specific native package
(sqlite-vec-<os>-<arch>, which ships vec0.so) into .next/standalone — sqlite-vec
resolves it at runtime via require.resolve() (Next.js issue #88844). Result: in the
bundled/Docker build the wrapper loaded but getLoadablePath() threw MODULE_NOT_FOUND,
so vectorStore silently degraded vector/semantic memory to FTS5 keyword search.

build-next-isolated now syncs the sqlite-vec wrapper plus whichever sqlite-vec-<platform>
package npm installed into the standalone output (mirroring the existing better-sqlite3
native-binary handling). Platform-agnostic, so Docker (linux) and Electron (mac/win/linux)
builds all carry their matching vec0.so/.dylib/.dll.

Verified: vec0.so present in .next/standalone/node_modules/sqlite-vec-linux-x64;
createRequire("sqlite-vec") + require.resolve("sqlite-vec-linux-x64/vec0.so") both
resolve from inside the standalone (no FTS5 fallback). build-next-isolated tests 7/7.
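
For orientation, the location the sync step has to populate can be derived as in the sketch below. The helper names are hypothetical and this is not the build-next-isolated code. It only encodes the sqlite-vec-<os>-<arch> naming and the vec0.so/.dylib/.dll file names quoted in the commit message, and the caller is assumed to supply the os and arch strings exactly as they appear in the installed package name.

```typescript
type NodePlatform = "linux" | "darwin" | "win32";

// File name of the loadable extension on each platform.
function loadableFileName(platform: NodePlatform): string {
  switch (platform) {
    case "linux":
      return "vec0.so";
    case "darwin":
      return "vec0.dylib";
    case "win32":
      return "vec0.dll";
  }
}

// Package name following the sqlite-vec-<os>-<arch> pattern.
function nativePackageName(os: string, arch: string): string {
  return `sqlite-vec-${os}-${arch}`;
}

// Relative path of the native binary inside the standalone output.
function standaloneTarget(os: string, arch: string, platform: NodePlatform): string {
  return [
    ".next",
    "standalone",
    "node_modules",
    nativePackageName(os, arch),
    loadableFileName(platform),
  ].join("/");
}
```

With os "linux", arch "x64", and platform "linux", this yields the vec0.so path named in the verification note above.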
Promote the active release line (v3.8.8) to main. Release is a superset of
main's features (Trae, Notion MCP, SiliconFlow, OOM fixes, MaintenanceBanner),
so all ~44 content conflicts were resolved to the release version; generated
.source/* files (gitignored) were dropped.
- CHANGELOG.md: restore release's authoritative changelog (auto-merge had
  re-introduced main's stale Unreleased section).
- auth.ts: port #3058 — getProviderSearchPool expands custom provider_nodes
  prefixes (e.g. "78code/gpt-5.4") to internal connection ids during credential
  lookup; release lacked this fix. Preserves nvidia alias special-casing.
- quota-plan-registry.test.ts: align knownProviders() count (6 -> 10) with the
  registry, which gained claude/deepseek/kimi-coding/xiaomi-mimo; cover the full
  set. (Pre-existing stale assertion on release.)

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2b0bd9d7d6


  if (authError) return authError;
  const url = new URL(request.url);
- const status = url.searchParams.get("status") as any;
+ const statusResult = StatusSchema.safeParse(url.searchParams.get("status"));

P2 Badge Handle absent plugin status filters

When GET /api/plugins is called without a status query parameter, URLSearchParams.get() returns null, but StatusSchema is only optional for undefined. This makes the default unfiltered plugin list return 400, which breaks the main plugins page/list API unless callers always add a status filter.
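
The null-versus-undefined distinction can be illustrated without any schema library. The sketch below is hypothetical: the status values and the parseStatusFilter helper are assumptions, not code from the PR. It treats a missing query parameter as "no filter" instead of as a validation failure.

```typescript
// Hypothetical status values; the real StatusSchema is not shown in this thread.
const STATUSES = ["active", "inactive", "error"] as const;
type PluginStatus = (typeof STATUSES)[number];

type StatusFilterResult =
  | { ok: true; status: PluginStatus | undefined }
  | { ok: false; error: string };

function parseStatusFilter(raw: string | null): StatusFilterResult {
  // URLSearchParams.get() returns null for an absent key: treat it as "unfiltered".
  if (raw === null) return { ok: true, status: undefined };
  const match = STATUSES.find((s) => s === raw);
  if (match === undefined) return { ok: false, error: `Unknown status: ${raw}` };
  return { ok: true, status: match };
}
```

If the route keeps the schema-based check, the equivalent change would presumably be to pass `url.searchParams.get("status") ?? undefined` into safeParse, so that an optional schema sees undefined and not null.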


Comment on lines +5850 to +5853
runOnResponse(
{ requestId: traceId, body, model, provider, apiKeyInfo, metadata: {} },
{ status: 200 }
).catch(() => {});

P2 Badge Use the actual response in onResponse hooks

For successful chat requests with response-modifying plugins enabled, this new hook call passes only { status: 200 } and discards the runOnResponse return value. Since the plugin API supports onResponse(ctx, response) transformations, plugins that inspect or replace the completion payload never see the actual response and any returned modification is ignored in the chat success path.
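
A hook runner that honors response transformations could be shaped like the sketch below. It is illustrative only: the types and the runner are hypothetical stand-ins, and it is synchronous for brevity, whereas the runOnResponse in the PR appears to be promise-based. The point is that each hook receives the real response and that the caller uses the value the runner returns.

```typescript
interface HookResponse {
  status: number;
  body: unknown;
}

interface HookContext {
  requestId: string;
}

// A hook may return a replacement response, or undefined to leave it unchanged.
type OnResponseHook = (ctx: HookContext, response: HookResponse) => HookResponse | undefined;

function applyResponseHooks(
  hooks: OnResponseHook[],
  ctx: HookContext,
  response: HookResponse,
): HookResponse {
  let current = response;
  for (const hook of hooks) {
    // Each hook sees the output of the previous one, starting with the real response.
    const next = hook(ctx, current);
    if (next !== undefined) current = next;
  }
  // The caller must send this value, not the original response.
  return current;
}
```

Passing a placeholder such as `{ status: 200 }` and ignoring the result, as flagged above, defeats both the inspect and the replace use cases.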


Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request bumps the version to 3.8.8 and introduces a robust plugins framework, including structured error handling, rate limiting, and dev-mode hot-reloading. It also implements a Quota Share Engine, session pool fingerprint rotation for DuckDuckGo Web, and auto-calibration for the V8 heap-pressure guard. Feedback on these changes highlights several critical issues: request-scoped metadata is lost across plugin hooks due to block scoping in chatCore.ts; the absolute path validation regex breaks Windows compatibility; unprefixed model IDs block public model checks; redundant status checks create dead code in the DuckDuckGo executor; directory traversal checks are bypassed by path resolution; clearing number inputs in the plugin config page incorrectly defaults to zero; fetch error handling in the plugins page fails to catch non-2xx responses; and rate-limit logging can flood application logs.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +1645 to 1647
if (pluginResult?.metadata) {
Object.assign(pluginCtx.metadata, pluginResult.metadata);
}

high

The pluginCtx object is block-scoped to the try block of the onRequest hook. Any metadata merged into pluginCtx.metadata via Object.assign is lost once the block exits. Consequently, subsequent hooks like runOnError (line 3423) and runOnResponse (line 5847) receive a fresh, empty metadata: {} object, preventing plugins from sharing state across different phases of the request lifecycle.

To fix this, declare a request-scoped pluginMetadata object at the top of handleChatCore and pass it to all hook invocations.
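
The suggested fix can be sketched as below. The handler and hook types are hypothetical simplifications (synchronous, two phases only) and are not the code in chatCore.ts. What matters is that a single metadata object is created once per request and handed to every phase.

```typescript
type PluginMetadata = Record<string, unknown>;

interface PluginCtx {
  requestId: string;
  metadata: PluginMetadata;
}

type PluginHook = (ctx: PluginCtx) => { metadata?: PluginMetadata } | undefined;

function handleRequestSketch(
  requestId: string,
  onRequest: PluginHook,
  onResponse: PluginHook,
): PluginMetadata {
  // Request-scoped: declared before any hook runs, outside every try block.
  const pluginMetadata: PluginMetadata = {};

  try {
    const result = onRequest({ requestId, metadata: pluginMetadata });
    if (result?.metadata) Object.assign(pluginMetadata, result.metadata);
  } catch {
    // Hook failures are swallowed in this sketch.
  }

  // Later phases receive the same object, so earlier writes stay visible.
  onResponse({ requestId, metadata: pluginMetadata });
  return pluginMetadata;
}
```

Because the object outlives the onRequest block, an error or response hook no longer starts from an empty `metadata: {}`.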

Comment on lines +50 to +53
path: z.string().min(1).regex(/^\/[^]*$/, "Path must be absolute").refine(
(p) => !p.includes("\0") && !p.includes(".."),
"Path must not contain traversal patterns or null bytes"
),

high

The regex ^\/ strictly requires the path to start with /, which will reject valid Windows absolute paths (e.g., C:\path\to\plugin). Since the project supports Windows, this Unix-only check will break plugin installation on Windows.

Suggested change
- path: z.string().min(1).regex(/^\/[^]*$/, "Path must be absolute").refine(
-   (p) => !p.includes("\0") && !p.includes(".."),
-   "Path must not contain traversal patterns or null bytes"
- ),
+ path: z.string().min(1).refine((p) => p.startsWith("/") || /^[A-Za-z]:[\\/]/.test(p), "Path must be absolute").refine(
+   (p) => !p.includes("\0") && !p.includes(".."),
+   "Path must not contain traversal patterns or null bytes"
+ ),
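
The predicate in the suggestion can be pulled out and checked on its own. The helper name below is hypothetical; the logic is the same two-form test (leading slash, or a drive letter followed by a separator) and deliberately covers nothing else, such as UNC paths.

```typescript
// True for POSIX absolute paths ("/opt/x") and Windows drive paths ("C:\x" or "C:/x").
function isAbsolutePluginPath(p: string): boolean {
  return p.startsWith("/") || /^[A-Za-z]:[\\/]/.test(p);
}
```

A drive-relative path such as `C:plugin` is rejected because the drive letter must be followed by a separator.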

Comment thread src/lib/db/apiKeys.ts
Comment on lines +1373 to +1389
if (disableNonPublicModels) {
  const resolvedModelId = resolveModelAlias(modelId);
  const effectiveModelId = resolvedModelId || modelId;

  const providerId = effectiveModelId.split("/")[0];
  const shortModelId = effectiveModelId.split("/").slice(1).join("/");
  const syncedModelsByConnection = await getSyncedAvailableModelsByConnection(providerId);
  const customModels = await getCustomModels(providerId);

  // Combine synced and custom models
  const allDiscoveredModels = Object.values(syncedModelsByConnection).flat().concat(customModels);
  const discovered = allDiscoveredModels.some((m) => m.id === shortModelId);
  if (!discovered) return false;

  const isPublic = !getModelIsHidden(providerId, shortModelId);
  if (!isPublic) return false;
}

high

If a request is made with an unprefixed model ID (e.g., "gpt-4"), splitting by / results in providerId = "gpt-4" and shortModelId = "". This will fail to find the model in the provider catalog and return false, blocking valid requests. Unprefixed model IDs must be resolved to their canonical prefixed form before performing this check.
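
The failure mode is easy to reproduce in isolation. The sketch below uses a hypothetical splitModelId helper, not a function from the PR. It makes the "no provider prefix" case explicit so the caller can resolve the canonical prefixed id before consulting the catalog.

```typescript
interface ModelRef {
  providerId: string | null; // null when the id carries no provider prefix
  modelId: string;
}

function splitModelId(id: string): ModelRef {
  const slash = id.indexOf("/");
  // No slash: the id is unprefixed, so there is no provider to look up yet.
  if (slash === -1) return { providerId: null, modelId: id };
  // Split on the first slash only; the rest may itself contain slashes.
  return { providerId: id.slice(0, slash), modelId: id.slice(slash + 1) };
}
```

With the naive split, `"gpt-4".split("/").slice(1).join("/")` evaluates to an empty string, which is why the catalog lookup in the reviewed code can never match an unprefixed id.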

Comment on lines +172 to +181
// Report pool status based on response
if (pool && session) {
  if (chatResponse.status === 429) {
    pool.reportCooldown(session);
  } else if (chatResponse.status >= 500) {
    pool.reportDead(session);
  } else {
    pool.reportSuccess(session);
  }
}

medium

The statuses 429 and >= 500 are already handled by early returns at lines 128 and 162. Therefore, the code inside if (pool && session) at lines 173-181 will only ever execute the else branch (pool.reportSuccess(session)). The checks for 429 and >= 500 are dead code.

      // Report pool status based on response
      if (pool && session) {
        pool.reportSuccess(session);
      }

Comment on lines +17 to +33
function validatePluginPath(path: string): string {
  // Reject null bytes
  if (path.includes("\0")) {
    throw new Error("Invalid path: contains null bytes");
  }
  // Must be absolute
  if (!isAbsolute(path)) {
    throw new Error("Path must be absolute");
  }
  // Normalize and resolve to prevent traversal
  const normalized = normalize(resolve(path));
  // Reject paths with traversal patterns
  if (normalized.includes("..") || normalized.includes("~")) {
    throw new Error("Invalid path: directory traversal detected");
  }
  return normalized;
}

medium

resolve collapses relative segments such as .. and . before normalized.includes("..") runs. For example, /foo/bar/../baz resolves to /foo/baz, which no longer contains "..". Checking normalized.includes("..") after resolution therefore cannot detect traversal attempts. The check should be performed on the raw input path before resolution.

function validatePluginPath(path: string): string {
  // Reject null bytes
  if (path.includes("\0")) {
    throw new Error("Invalid path: contains null bytes");
  }
  // Reject directory traversal attempts in the input path
  if (path.includes("..") || path.includes("~")) {
    throw new Error("Invalid path: directory traversal detected");
  }
  // Must be absolute
  if (!isAbsolute(path)) {
    throw new Error("Path must be absolute");
  }
  return normalize(resolve(path));
}
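
One possible refinement of the raw-input check, offered as a variation and not something proposed in the thread, is to compare whole path segments. A plain substring test also rejects harmless names that merely contain two dots, such as /plugins/my..plugin. The helper name is hypothetical.

```typescript
// True when any segment of the raw, unresolved path is exactly "..".
function hasTraversalSegment(rawPath: string): boolean {
  // Split on both separators so Windows-style input is covered as well.
  return rawPath.split(/[\\/]+/).some((segment) => segment === "..");
}
```

This must run before resolve or normalize, for the reason given above: after resolution the ".." segments are already gone.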

Comment on lines +127 to +135
) : field.type === "number" ? (
  <input
    type="number"
    value={Number(value)}
    min={field.min}
    max={field.max}
    onChange={(e) => handleChange(key, Number(e.target.value))}
    className="w-full rounded border p-2"
  />

medium

When the user clears a number input field, e.target.value becomes "". Coercing this with Number(e.target.value) results in 0, which immediately populates the field with 0 instead of leaving it empty. This can be annoying for users trying to type a new number or clear an optional field.

Suggested change
- ) : field.type === "number" ? (
-   <input
-     type="number"
-     value={Number(value)}
-     min={field.min}
-     max={field.max}
-     onChange={(e) => handleChange(key, Number(e.target.value))}
-     className="w-full rounded border p-2"
-   />
+ ) : field.type === "number" ? (
+   <input
+     type="number"
+     value={value === "" ? "" : Number(value)}
+     min={field.min}
+     max={field.max}
+     onChange={(e) => handleChange(key, e.target.value === "" ? "" : Number(e.target.value))}
+     className="w-full rounded border p-2"
+   />
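
The coercion at the heart of this comment can be isolated in a small helper. The name parseNumberInput is hypothetical. It keeps an emptied field empty and, as an extra assumption beyond the suggestion, also maps unparseable text to the empty state so that NaN is never stored.

```typescript
// Convert the raw string from a number input into a number, or "" for an empty field.
function parseNumberInput(raw: string): number | "" {
  // Number("") is 0, so the empty case has to be handled before coercion.
  if (raw.trim() === "") return "";
  const parsed = Number(raw);
  return Number.isNaN(parsed) ? "" : parsed;
}
```

In the component this would be used as `onChange={(e) => handleChange(key, parseNumberInput(e.target.value))}`.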

Comment on lines +58 to +69
const handleToggle = async (name: string, enable: boolean) => {
  const endpoint = enable ? "activate" : "deactivate";
  try {
    const res = await fetch(`/api/plugins/${name}/${endpoint}`, { method: "POST" });
    if (res.ok) {
      addNotification({ type: "success", message: enable ? t("activated", { name }) : t("deactivated", { name }) });
      await fetchPlugins();
    }
  } catch {
    addNotification({ type: "error", message: enable ? t("activateFailed", { name }) : t("deactivateFailed", { name }) });
  }
};

medium

fetch does not throw an error on non-2xx HTTP status codes (like 400 or 500). If the server fails to activate/deactivate a plugin, res.ok will be false, but the catch block won't be triggered. As a result, no error notification will be shown to the user.

  const handleToggle = async (name: string, enable: boolean) => {
    const endpoint = enable ? "activate" : "deactivate";
    try {
      const res = await fetch(`/api/plugins/${name}/${endpoint}`, { method: "POST" });
      if (!res.ok) throw new Error();
      addNotification({ type: "success", message: enable ? t("activated", { name }) : t("deactivated", { name }) });
      await fetchPlugins();
    } catch {
      addNotification({ type: "error", message: enable ? t("activateFailed", { name }) : t("deactivateFailed", { name }) });
    }
  };
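
The same idea can be factored into a reusable guard so that every call site turns a non-2xx status into an exception. The sketch below is a hypothetical helper, not code from the PR, and it accepts any object with `ok` and `status` so that it can be exercised without a network.

```typescript
interface ResponseLike {
  ok: boolean;
  status: number;
}

// Return the response unchanged when it is 2xx; otherwise throw so `catch` runs.
function ensureOk<T extends ResponseLike>(res: T): T {
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  return res;
}
```

Inside handleToggle this would read `ensureOk(await fetch(url, { method: "POST" }))`, which makes the existing catch block cover both network failures and HTTP error statuses.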

Comment thread src/lib/plugins/hooks.ts
Comment on lines +150 to +153
if (isRateLimited(reg.pluginName)) {
log.warn("hook.rate_limited", { event, pluginName: reg.pluginName });
continue;
}

medium

Logging a warning for every single rate-limited hook call can flood the application logs and degrade performance if a plugin is misbehaving and making thousands of calls. Consider changing the log level to debug or debouncing the warning.

Suggested change
- if (isRateLimited(reg.pluginName)) {
-   log.warn("hook.rate_limited", { event, pluginName: reg.pluginName });
-   continue;
- }
+ if (isRateLimited(reg.pluginName)) {
+   log.debug("hook.rate_limited", { event, pluginName: reg.pluginName });
+   continue;
+ }
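
The other option the comment mentions, limiting how often the warning fires, could look like the sketch below. Strictly speaking this is throttling and not debouncing: at most one warning per plugin per time window, with a count of the calls suppressed in between. The factory name, the window parameter, and the injected clock are assumptions for illustration.

```typescript
type WarnFn = (message: string, fields: Record<string, unknown>) => void;

// Build a warn function that emits at most once per plugin per `windowMs`.
function createThrottledWarn(warn: WarnFn, windowMs: number, now: () => number = Date.now) {
  const lastLogged = new Map<string, number>();
  const suppressed = new Map<string, number>();

  return (pluginName: string, event: string): void => {
    const t = now();
    const last = lastLogged.get(pluginName);
    if (last !== undefined && t - last < windowMs) {
      // Still inside the window: count the call but stay quiet.
      suppressed.set(pluginName, (suppressed.get(pluginName) ?? 0) + 1);
      return;
    }
    warn("hook.rate_limited", {
      event,
      pluginName,
      suppressed: suppressed.get(pluginName) ?? 0,
    });
    lastLogged.set(pluginName, t);
    suppressed.set(pluginName, 0);
  };
}
```

This keeps the signal at warn level for operators while bounding log volume for a misbehaving plugin, whereas lowering the level to debug hides the event entirely in most production configurations.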

@kilo-code-bot

kilo-code-bot Bot commented Jun 2, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

This PR promotes release/v3.8.8 to main (709 commits). The changes include:

  • Deleted .source/browser.ts and .source/server.ts (generated files, correctly removed)
  • .env.example updates: new env var documentation for PII window size, Trae settings, heap pressure, plugin exec, and trace flags
  • Test fixes for sidebar costs section (costs-quota-plans retired, costs-quota-share moved to HIDEABLE list)
  • New streaming PII transform tests for event name preservation and IPv6 handling
  • Thinking budget tests for model suffix handling and passthrough mode
  • Token limit counter tests, token refresh tests
  • Usage utils tests including ReDoS guard for subscription tier mapping
  • T23/T24 fallback resilience tests with improved type annotations

All changes are consistent with the release changelog and follow existing code patterns. No inline code changes require comments. The merge uses release version for conflict resolution as documented.

Files Reviewed (22 files)
  • .env.example — env var documentation
  • tests/unit/sidebar-*.test.ts — sidebar item tests (6 changes)
  • tests/unit/streamingPiiTransform.test.ts — new PII event handling tests
  • tests/unit/service-thinking-budget.test.ts — new thinking budget tests
  • tests/unit/service-token-*.test.ts — new token service tests
  • tests/unit/usage-utils.test.ts — new ReDoS guard test
  • tests/unit/t23-t24-fallback-resilience.test.ts — test improvements
  • docs/i18n/*/CHANGELOG.md — i18n changelog updates (40+ files)
  • docs/architecture/ARCHITECTURE.md — provider/executor counts updated
  • .agents/skills/deploy-*.md — deployment script updates

Reviewed by laguna-m.1-20260312:free · 3,242,445 tokens

@github-actions

github-actions Bot commented Jun 2, 2026

CI Coverage Report

  • Coverage job: skipped
  • PR test policy: success

Coverage artifact was not available for this run.

@diegosouzapw
Owner Author

Redundant: the release/v3.8.8 → main promotion is already PR #2930. The fixes (conflict resolution, the #3058 port, test alignment, and the CI repair) go directly into release/v3.8.8 (the head of #2930).

@diegosouzapw diegosouzapw deleted the chore/merge-v3.8.8 branch June 2, 2026 21:17