
Release/v3.8.8 #2930

Open
diegosouzapw wants to merge 716 commits into main from release/v3.8.8

Conversation

@diegosouzapw
Owner

@diegosouzapw diegosouzapw commented May 30, 2026

Release v3.8.8

Merges release/v3.8.8 into main. Full changelog for this release:


[3.8.8] — 2026-06-01

Added

Changed

  • Sidebar Tools group: added agent-bridge and traffic-inspector items after cloud-agents.
  • /api/tools/agent-bridge/ and /api/tools/traffic-inspector/ added to LOCAL_ONLY_API_PREFIXES
    and SPAWN_CAPABLE_PREFIXES in src/server/authz/routeGuard.ts.
  • .env.example: documented 10 new env vars (AGENTBRIDGE_UPSTREAM_CA_CERT,
    INSPECTOR_BUFFER_SIZE, INSPECTOR_HTTP_PROXY_PORT, INSPECTOR_HTTP_PROXY_AUTOSTART,
    INSPECTOR_TLS_INTERCEPT, INSPECTOR_SYSTEM_PROXY_GUARD_MINUTES, INSPECTOR_MAX_BODY_KB,
    INSPECTOR_MASK_SECRETS, INSPECTOR_LLM_HOSTS_EXTRA, INSPECTOR_INTERNAL_INGEST_TOKEN).

Fixed

🏆 Contributors

A special thanks to everyone who contributed to this release — 687 commits since v3.8.7:

| Contributor | PRs / Contribution |
| --- | --- |
| @diegosouzapw | maintainer: AgentBridge, Traffic Inspector, Quota Share Engine, Nav Restructure, Plugins integration, releases & upstream ports |
| @oyi77 | #2913, #2947, #2954, #2978, #3015, #3018, #3039, #3041, #3045, #3046, #3049 |
| @terence71-glitch | #2956, #2960, #2963, #2984, #3000, #3006, #3012, #3048, #3051 |
| @soyelmismo | #2951, #2965, #2973 |
| @branben | #2958, #2959 |
| @makcimbx | #2937, #2938 |
| @guanbear | #2931, #3031 |
| @Lion-killer | #2981, #2988 |
| @JxnLexn | per-API-key stream default mode |
| @androw | #3017 |
| @xz-dev | #2975 |
| @S0yora | #2964 |
| @NekoMonci12 | #3008 |
| @Tentoxa | #3010 |
| @ReqX | #2957 |
| @NomenAK | #2943 |
| @charithharshana | #2940 |
| @dhaern | #2927 |
| @dangeReis | #3021 |
| @bobbyunknown | #3029 |
| @CitrusIce | #3035 |
| @wussh | #3036 |
| @Chewji9875 | #3037 |
| @herjarsa | #3043 |

A special thanks to everyone who contributed code, reviews, and tests for this release:
@androw, @bobbyunknown, @branben, @charithharshana, @Chewji9875, @CitrusIce, @dangeReis, @dhaern, @diegosouzapw, @guanbear, @herjarsa, @JxnLexn, @Lion-killer, @makcimbx, @NekoMonci12, @NomenAK, @oyi77, @ReqX, @S0yora, @soyelmismo, @Tentoxa, @terence71-glitch, @wussh, @xz-dev


@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements the Quota Sharing Engine (Group B, plans 16 and 22) to fairly distribute provider quotas across multiple API keys, along with a reorganized dashboard sidebar featuring a new Costs section and an Activity feed. The code reviewer identified several critical and high-severity issues: missing await calls on the asynchronous getQuotaStore function in enforce.ts that would cause runtime TypeErrors, in-place mutation of cached translation module exports in request.ts leading to cross-request state pollution, and unhandled non-2xx HTTP status codes in frontend fetch calls. Additionally, the reviewer pointed out dead database queries in sqliteQuotaStore.ts, a hardcoded empty provider name in the pool usage route that breaks catalog plan resolution, and redundant conditional checks in fairShare.ts.

Comment thread src/lib/quota/enforce.ts Outdated
Comment on lines +88 to +90

// 4. For each active dimension, peek consumption and saturation.
const store = getQuotaStore();

critical

The getQuotaStore function is asynchronous and returns a Promise<QuotaStore>. Calling it without await here means store is a Promise rather than the actual store instance. This will cause store.peek to be undefined and throw a TypeError at runtime, breaking quota enforcement.

Suggested change
- // 4. For each active dimension, peek consumption and saturation.
- const store = getQuotaStore();
+ // 4. For each active dimension, peek consumption and saturation.
+ const store = await getQuotaStore();
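
To illustrate the failure mode described above, here is a minimal sketch. The `QuotaStore` shape and the in-memory `getQuotaStore` are hypothetical stand-ins, not the project's actual implementation; only the presence or absence of `await` matters.

```typescript
// Hypothetical stand-in for the real store; only the shape matters here.
interface QuotaStore {
  peek(key: string): number;
}

// Assumed to be async, as the review comment states.
async function getQuotaStore(): Promise<QuotaStore> {
  const data = new Map<string, number>([["tokens:daily", 42]]);
  return { peek: (key) => data.get(key) ?? 0 };
}

// Without `await`, the value is a Promise, and a Promise has no `peek` method.
const pending = getQuotaStore() as unknown as Partial<QuotaStore>;
const peekIsMissing = typeof pending.peek === "undefined";

// With `await`, the resolved store exposes `peek` as expected.
async function peekConsumption(key: string): Promise<number> {
  const store = await getQuotaStore();
  return store.peek(key);
}
```

Calling `pending.peek(...)` in the first case is what would raise the runtime `TypeError` the comment warns about.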

Comment thread src/lib/quota/enforce.ts Outdated
Comment on lines +184 to +185

const store = getQuotaStore();

critical

Similar to the issue in enforceQuotaShare, getQuotaStore is called without await in recordConsumption. This causes store to be a Promise, making store.consume undefined and failing to record any consumption at runtime.

Suggested change
- const store = getQuotaStore();
+ const store = await getQuotaStore();

Comment thread src/i18n/request.ts
Comment on lines +53 to +56
if (locale !== FALLBACK_LOCALE) {
const fallbackMessages = (await import(`./messages/${FALLBACK_LOCALE}.json`)).default as Record<string, unknown>;
messages = deepMergeFallback({ ...localeMessages }, fallbackMessages);
}

high

The shallow copy { ...localeMessages } only copies the top-level properties of the imported translation module. Since deepMergeFallback mutates nested objects in-place, this will directly mutate the cached module exports of the translation files. On subsequent requests, the translations will remain polluted with English fallback keys, leading to cross-request state pollution. Using a deep clone like JSON.parse(JSON.stringify(...)) prevents this.

Suggested change
- if (locale !== FALLBACK_LOCALE) {
-   const fallbackMessages = (await import(`./messages/${FALLBACK_LOCALE}.json`)).default as Record<string, unknown>;
-   messages = deepMergeFallback({ ...localeMessages }, fallbackMessages);
- }
+ let messages = localeMessages as Record<string, unknown>;
+ if (locale !== FALLBACK_LOCALE) {
+   const fallbackMessages = (await import(`./messages/${FALLBACK_LOCALE}.json`)).default as Record<string, unknown>;
+   messages = deepMergeFallback(JSON.parse(JSON.stringify(localeMessages)), fallbackMessages);
+ }
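
The difference between the shallow copy and the deep clone can be reproduced in isolation. The sketch below assumes `deepMergeFallback` fills missing keys in its first argument in place (consistent with the fragment quoted later in this thread); the message objects are made-up examples.

```typescript
type Messages = Record<string, unknown>;

const isPlainObject = (v: unknown): v is Messages =>
  typeof v === "object" && v !== null && !Array.isArray(v);

// Assumed behaviour: recursively fills keys missing from `target`, mutating it.
function deepMergeFallback(target: Messages, source: Messages): Messages {
  for (const key of Object.keys(source)) {
    const targetValue = target[key];
    const sourceValue = source[key];
    if (isPlainObject(targetValue) && isPlainObject(sourceValue)) {
      deepMergeFallback(targetValue, sourceValue);
    } else if (targetValue === undefined) {
      target[key] = sourceValue;
    }
  }
  return target;
}

const fallback: Messages = { nav: { home: "Home", costs: "Costs" } };

// Shallow copy: `nav` is still the same nested object, so the "cached" value is polluted.
const cachedShallow: Messages = { nav: { home: "Início" } };
deepMergeFallback({ ...cachedShallow }, fallback);
const shallowPolluted = "costs" in (cachedShallow.nav as Messages);

// Deep clone: the cached object is left untouched, only the merged copy gains the key.
const cachedDeep: Messages = { nav: { home: "Início" } };
const merged = deepMergeFallback(JSON.parse(JSON.stringify(cachedDeep)), fallback);
const deepPolluted = "costs" in (cachedDeep.nav as Messages);
```

Here `shallowPolluted` ends up `true` and `deepPolluted` ends up `false`. A JSON round trip is sufficient for translation files because they contain only JSON-serializable values.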

Comment on lines +207 to +238
+ const handleCreate = useCallback(
+   async (poolData: Omit<QuotaPool, "id" | "createdAt">) => {
+     await fetch("/api/quota/pools", {
+       method: "POST",
+       headers: { "Content-Type": "application/json" },
+       body: JSON.stringify(poolData),
+     });
+     await mutate();
+   },
-   [pools, savePools]
+   [mutate]
+ );
+
- // ── Derived ──────────────────────────────────────────────────────────────
+ const handleSaveAllocations = useCallback(
+   async (pool: QuotaPool, allocations: PoolAllocation[]) => {
+     await fetch(`/api/quota/pools/${pool.id}`, {
+       method: "PATCH",
+       headers: { "Content-Type": "application/json" },
+       body: JSON.stringify({ allocations }),
+     });
+     await mutate();
+   },
+   [mutate]
+ );
+
- const stats = useMemo(() => {
-   let allocations = 0;
-   let atCap = 0;
-   let uncappedPct = 0;
-   for (const p of pools) {
-     allocations += p.allocations.length;
-     const totalPct = p.allocations.reduce((s, a) => s + a.percent, 0);
-     if (totalPct < 100) uncappedPct += 100 - totalPct;
-     // Simulated "at cap" — without backend tracking, mark pools with >=1
-     // allocation summing to 100% as "fully utilized config" (proxy metric).
-     if (totalPct >= 100 && p.allocations.length > 0) atCap += 0; // see disclaimer
-   }
-   return {
-     activePools: pools.length,
-     allocations,
-     atCap,
-     uncapped: pools.length > 0 ? Math.round(uncappedPct / pools.length) : 0,
-   };
- }, [pools]);
+ const handleRemovePool = useCallback(
+   async (id: string) => {
+     if (!confirm(t("removeConfirm"))) return;
+     await fetch(`/api/quota/pools/${id}`, { method: "DELETE" });
+     await mutate();
+   },
+   [mutate, t]
+ );

high

The fetch API does not throw on non-2xx HTTP status codes (such as 400 or 500). Since these helper functions do not check res.ok, any server-side validation or processing errors will be silently ignored. This bypasses the try/catch blocks in the calling modals (e.g., CreatePoolModal and EditAllocationsModal), causing them to close and report success to the user even when the operation failed on the server.

  const handleCreate = useCallback(
    async (poolData: Omit<QuotaPool, "id" | "createdAt">) => {
      const res = await fetch("/api/quota/pools", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(poolData),
      });
      if (!res.ok) {
        throw new Error(`Failed to create pool: ${res.statusText}`);
      }
      await mutate();
    },
    [mutate]
  );

  const handleSaveAllocations = useCallback(
    async (pool: QuotaPool, allocations: PoolAllocation[]) => {
      const res = await fetch(`/api/quota/pools/${pool.id}`, {
        method: "PATCH",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ allocations }),
      });
      if (!res.ok) {
        throw new Error(`Failed to save allocations: ${res.statusText}`);
      }
      await mutate();
    },
    [mutate]
  );

  const handleRemovePool = useCallback(
    async (id: string) => {
      if (!confirm(t("removeConfirm"))) return;
      const res = await fetch(`/api/quota/pools/${id}`, { method: "DELETE" });
      if (!res.ok) {
        throw new Error(`Failed to remove pool: ${res.statusText}`);
      }
      await mutate();
    },
    [mutate, t]
  );
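
The three handlers above repeat the same `res.ok` check, so it could be factored into one helper. The sketch below is one possible shape, not part of the PR: `fetchOrThrow`, `Fetcher`, `MinimalResponse` and `MinimalInit` are hypothetical names, and the fetch function is injected so the helper can be exercised without a network.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface MinimalResponse {
  ok: boolean;
  status: number;
  statusText: string;
}
interface MinimalInit {
  method?: string;
  headers?: Record<string, string>;
  body?: string;
}
type Fetcher = (url: string, init?: MinimalInit) => Promise<MinimalResponse>;

// Rejects on non-2xx responses so the caller's try/catch actually fires.
async function fetchOrThrow(
  fetcher: Fetcher,
  url: string,
  init?: MinimalInit
): Promise<MinimalResponse> {
  const res = await fetcher(url, init);
  if (!res.ok) {
    // statusText can be empty on some connections, so report the numeric status.
    throw new Error(`Request to ${url} failed with status ${res.status}`);
  }
  return res;
}
```

In a component, the global `fetch` would be passed as the first argument, for example `await fetchOrThrow(fetch, "/api/quota/pools", { method: "POST", ... })`, followed by `await mutate()`.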

Comment on lines +156 to +181
// instead we scan listAllocationsForApiKey to find which pools the key
// participates in, and derive dimensions via best-effort getBucket.
// For poolUsage we rely on the dimension keys we can discover.
// Since dimensions live in ProviderPlan (resolved separately), we peek
// via direct getBucket reads for the current bucket only.
//
// Note: This is intentionally a lightweight implementation. The full
// dimension list should come from the resolved plan; here we surface
// what's been stored in quota_consumption for this pool.

const { apiKeyId } = alloc;
// listAllocationsForApiKey returns pairs across all pools; filter to this one
const allAllocsForKey = listAllocationsForApiKey(apiKeyId);
for (const { poolId: pid } of allAllocsForKey) {
if (pid !== poolId) continue;
// The dimension keys for this pool are known if consumption exists
// We can't list all keys without a query, so we rely on the calling
// context having pre-populated via consume(). For dashboard use,
// the pool dimensions are read from the provider plan.
}

// We only read dimensions that we can discover from what was actually
// consumed. For a richer implementation, the caller should pass the
// resolved plan dimensions (done in REST routes - F8).
// Here: peek for common windows to detect what's in use.
}

medium

This for (const alloc of allocations) loop is completely dead code. It queries the database for each allocation's pools but does not use or store the results anywhere, nor does it populate dimMap. This results in redundant database queries and CPU overhead on every poolUsage call.

Suggested change
- // instead we scan listAllocationsForApiKey to find which pools the key
- // participates in, and derive dimensions via best-effort getBucket.
- // For poolUsage we rely on the dimension keys we can discover.
- // Since dimensions live in ProviderPlan (resolved separately), we peek
- // via direct getBucket reads for the current bucket only.
- //
- // Note: This is intentionally a lightweight implementation. The full
- // dimension list should come from the resolved plan; here we surface
- // what's been stored in quota_consumption for this pool.
- const { apiKeyId } = alloc;
- // listAllocationsForApiKey returns pairs across all pools; filter to this one
- const allAllocsForKey = listAllocationsForApiKey(apiKeyId);
- for (const { poolId: pid } of allAllocsForKey) {
-   if (pid !== poolId) continue;
-   // The dimension keys for this pool are known if consumption exists
-   // We can't list all keys without a query, so we rely on the calling
-   // context having pre-populated via consume(). For dashboard use,
-   // the pool dimensions are read from the provider plan.
- }
- // We only read dimensions that we can discover from what was actually
- // consumed. For a richer implementation, the caller should pass the
- // resolved plan dimensions (done in REST routes - F8).
- // Here: peek for common windows to detect what's in use.
- }
+ // Note: This is intentionally a lightweight implementation. The full
+ // dimension list should come from the resolved plan; here we surface
+ // what's been stored in quota_consumption for this pool.

Comment on lines +50 to +52
// 2. Resolve the provider plan for this pool's connection
// Provider name is not stored on pool — use empty string to trigger catalog/empty fallback
const plan = resolvePlan(pool.connectionId, "");

medium

Passing an empty string "" as the provider name to resolvePlan prevents catalog-defined plans (like codex or kimi) from being resolved unless a manual override is present in the database. Resolving the provider name from the connection ID first allows catalog plans to work out-of-the-box.

    // 2. Resolve the provider plan for this pool's connection
    let provider = "";
    try {
      const { getProviderConnectionById } = await import("@/lib/localDb");
      const conn = getProviderConnectionById(pool.connectionId);
      if (conn && typeof (conn as { provider?: string }).provider === "string") {
        provider = (conn as { provider: string }).provider;
      }
    } catch {
      // ignore
    }
    const plan = resolvePlan(pool.connectionId, provider);

Comment on lines +96 to +104
// ── Teto global intransponível ─────────────────────────────────────────
// If the pool's global limit is already reached AND this key's request
// would exceed it (burst mode without borrow room), block as "global-saturated".
if (dim.consumedTotal >= dim.limit) {
if (allocation.policy !== "burst") {
return { kind: "block", reason: "global-saturated" };
}
// burst also blocked when no room at all
return { kind: "block", reason: "global-saturated" };

medium

The conditional check if (allocation.policy !== "burst") is redundant because both branches return the exact same { kind: "block", reason: "global-saturated" } value. This can be simplified.

Suggested change
- // ── Teto global intransponível ─────────────────────────────────────────
- // If the pool's global limit is already reached AND this key's request
- // would exceed it (burst mode without borrow room), block as "global-saturated".
- if (dim.consumedTotal >= dim.limit) {
-   if (allocation.policy !== "burst") {
-     return { kind: "block", reason: "global-saturated" };
-   }
-   // burst also blocked when no room at all
-   return { kind: "block", reason: "global-saturated" };
+ if (dim.consumedTotal >= dim.limit) {
+   return { kind: "block", reason: "global-saturated" };
+ }

@github-actions

github-actions Bot commented May 30, 2026

CI Coverage Report

  • Coverage job: failure
  • PR test policy: success

Coverage artifact was not available for this run.

Comment thread src/i18n/request.ts
) {
deepMergeFallback(targetValue as Record<string, unknown>, sourceValue as Record<string, unknown>);
} else if (targetValue === undefined) {
target[key] = sourceValue;

@github-advanced-security github-advanced-security AI left a comment

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

diegosouzapw and others added 14 commits May 30, 2026 04:13
…y-engine-redesign

feat(memory): memory engine redesign — sqlite-vec + hybrid RRF + Studio UI (plan 21)
…ols Studio — plans 17+18)

Conflicts: migration 076_playground_presets->084; localDb/.env union; REPOSITORY_MAP dedup; package.json keep base (version 3.8.7, coverage --functions 40); .source regenerated (+3 docs); deps cli-table3/wtfnode/@types/bun/uuid (npm install).

i18n pt-BR: 19 collisions — 18 playground keys -> HEAD pt-BR translations (base had untranslated EN: Send->Enviar, Cancel->Cancelar etc), costsSection -> base Custos. Rule: prefer side != en.json (translated).

openapi: --theirs base + 3 playground/search paths + 2 schemas + 1 tag (js-yaml surgical insert; union-blind breaks YAML). No open-sse, typecheck:core 0 errors.
…ound-search-tools

feat(playground,search-tools): Playground Studio + Search Tools Studio (plans 17+18)

Buffer Claude Code Read tool calls through the existing shim layer so empty pages placeholders are removed before streaming input_json_delta to the client. Also clean JSON-string Responses tool arguments, not only object arguments.

Closes #2935
Addresses #2889

Translate Claude Code web_search_YYYYMMDD server tools to the native OpenAI Responses web_search tool and preserve filters/location. Convert forced Claude tool_choice for web_search to the native Responses tool choice while leaving ordinary custom functions unchanged.

Closes #2936

Only translate Claude Code web_search_YYYYMMDD server tools to native Responses web_search when the final target is OpenAI Responses. Keep the Chat Completions target on function-tool shape and cover the full translateRequest path.

Keep existing object-argument cleanup behavior, but avoid parsing and stripping arbitrary JSON-string arguments for unrelated tools where empty strings or arrays may be valid payloads. Add regression coverage for non-Read and non-object Read arguments.

Remove the AuditLogTab from the dashboard logs page now that audit logs
live under the dedicated /dashboard/audit route. Update integration wiring
expectations and add metadata frontmatter to studio framework docs.

…ded in-memory caches

Root cause: Bottleneck rate limiter instances in rateLimitManager accumulate
without cleanup. Each instance runs an internal heartbeat setInterval every
250ms. Under heavy load with many provider:connection:model combinations,
hundreds of limiters accumulate causing CPU to grow ~0.1%/min until server
collapse (~2% after 5 minutes of intensive use).

Changes:
- rateLimitManager: Add idle limiter eviction in watchdogTick() using the
  previously defined but unused INACTIVE_LIMITER_MS threshold. Populate
  limiterLastUsed on every getLimiter() call. Clean up all 3 Maps
  (limiters, lastDispatchAt, limiterLastUsed) consistently.
- combo.ts: Add size-based FIFO eviction to rrCounters, resetAwareConnectionCache,
  and resetAwareQuotaCache Maps. Convert per-target log.info calls in combo
  execution loops to log.debug?.() to reduce serialization overhead.
- chatCore.ts: Fix double-serialization in estimateTokens(JSON.stringify(x))
  calls (estimateTokens already handles objects). Make trace() conditional
  on OMNIRROUTE_TRACE/DEBUG env vars. Make per-request usage logging conditional.
- apiKeyRotator.ts: Add eviction guards to _keyHealth and _connectionExtraKeys
  Maps (MAX 500 entries each). Ensure removeConnectionIndex cleans all 3 Maps.
- codexQuotaFetcher.ts: Add eviction guard to connectionRegistry and quotaCache
  Maps (MAX 200 entries each).
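
The two eviction strategies named in this commit message (size-based FIFO eviction and idle eviction) can be sketched generically. The helper names `setBounded` and `evictIdle` are hypothetical, and the thresholds in the usage note are illustrative; the actual code in rateLimitManager and combo.ts may differ.

```typescript
// Size-based FIFO eviction: a Map iterates in insertion order,
// so the first key returned is the oldest entry.
function setBounded<K, V>(map: Map<K, V>, key: K, value: V, maxEntries: number): void {
  if (!map.has(key) && map.size >= maxEntries) {
    const oldest = map.keys().next();
    if (!oldest.done) map.delete(oldest.value);
  }
  map.set(key, value);
}

// Idle eviction: drop every entry whose last-used timestamp is older than maxIdleMs.
// `onEvict` lets the caller clean up companion Maps (or stop a limiter) for the same key.
function evictIdle<K>(
  lastUsed: Map<K, number>,
  now: number,
  maxIdleMs: number,
  onEvict: (key: K) => void
): K[] {
  const evicted: K[] = [];
  lastUsed.forEach((timestamp, key) => {
    if (now - timestamp > maxIdleMs) {
      lastUsed.delete(key); // deleting during Map iteration is safe in JavaScript
      onEvict(key);
      evicted.push(key);
    }
  });
  return evicted;
}
```

A watchdog tick could call `evictIdle(limiterLastUsed, Date.now(), INACTIVE_LIMITER_MS, cleanup)`, where `cleanup` removes the key from every related Map so they stay consistent. Note that `setBounded` is FIFO rather than LRU: re-setting an existing key does not refresh its position.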
@kilo-code-bot

kilo-code-bot Bot commented May 30, 2026

Code Review Summary

Status: 1 Suggestion Found | Recommendation: Address before merge

Overview

| Severity | Count |
| --- | --- |
| CRITICAL | 0 |
| WARNING | 0 |
| SUGGESTION | 1 |

Issue Details

SUGGESTION

| File | Line | Issue |
| --- | --- | --- |
| tests/e2e/playground-compare.spec.ts | 121 | Inconsistent comment and code: comment says 'exactly one' but code checks for at least one. |

Other Observations (not in diff)

Issues found in unchanged code that cannot receive inline comments:

| File | Line | Issue |
| --- | --- | --- |
| src/i18n/messages/en.json | 1771 | The translation "qtSd-only" is a nonsensical placeholder that does not convey meaning to users. |
| src/i18n/messages/pt-BR.json | 444 | The translation "só-qtSd" is a nonsensical placeholder that does not convey meaning to users. |

Files Reviewed (4 files)
  • tests/e2e/group-b-quota-plans-config.spec.ts
  • tests/e2e/playground-compare.spec.ts
  • tests/unit/quota-combo-balancing.test.ts
  • tests/unit/quota-multiprovider.test.ts

NomenAK and others added 10 commits May 30, 2026 12:56
…native Claude OAuth

Native Claude OAuth (claude->claude passthrough) forwards client tool
definitions verbatim. Anthropic's first-party Messages API then rejects:
  - invalid tool input_schemas (deep-truncation placeholders such as
    `enum: "[MaxDepth]"`, or index-keyed objects where arrays are required), and
  - tool names it fingerprints as a third-party agent harness (specific
    blacklisted names like `mixture_of_agents`, or a large enough set of
    recognizable snake_case agent tool names),
both surfaced as a misleading `400 You're out of extra usage` placeholder
(the SSE stream is refused — not a real billing event). The same request
succeeds on translator-backed providers (OpenAI/Codex), which already sanitize
and re-shape tool payloads — so the gap is specific to the native passthrough.

Adds the missing guards on the native Claude OAuth path (executors/base.ts):
  - sanitizeClaudeToolSchemas(): coerce/drop invalid draft-2020-12 constructs
    (non-array enum/required/anyOf/..., placeholder schema slots -> {}).
  - cloakThirdPartyToolNames(): deterministically alias non-Claude-Code tool
    names (Claude Code canonical mapping where one exists, else PascalCase),
    tracked in the existing per-request _toolNameMap so remapToolNamesInResponse
    restores the caller's original names. Opt out via
    CLAUDE_DISABLE_TOOL_NAME_CLOAK=true.

Genuine Claude Code tool names (PascalCase) and already-valid schemas are
left untouched, so existing first-party traffic is unaffected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Extract 3 high-value CPU/RAM optimizations from perf branch:

1. estimateSizeFast() — fast object-tree size estimator replacing
   JSON.stringify().length in isSmallEnoughForSemanticCache(). Walks
   object tree with a stack, zero string allocation, early exit at 256KB.

2. Consolidate settings reads — move getCachedSettings() to a single
   early read in handleChatCore(), eliminating a redundant second read
   200 lines later. Also removes the isDetailedLoggingEnabled() wrapper
   call (reads settings internally) in favor of direct field check.

3. Registry Proxy→direct export — convert 8 registries from lazy
   Proxy+getOrCreate pattern to simple exported const objects. Eliminates
   Proxy trap overhead on every provider property access during routing.
   Affected: audio, embedding, image, moderation, music, rerank, search,
   video registries (-451 lines of Proxy boilerplate).

These changes are independent of the CPU leak fix (limiter eviction)
and complement it by reducing per-request CPU overhead.
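
The first optimization (a stack-based size walk with early exit) could look roughly like the sketch below. It reuses the name `estimateSizeFast` from the commit message, but the per-value byte costs and the signature are assumptions chosen for illustration, not the project's actual numbers.

```typescript
// Rough size estimate of a JSON-like value without building a string.
// Stops as soon as the running total exceeds `limitBytes` (256 KB by default).
function estimateSizeFast(root: unknown, limitBytes: number = 256 * 1024): number {
  let total = 0;
  const stack: unknown[] = [root];
  while (stack.length > 0) {
    const value = stack.pop();
    if (typeof value === "string") {
      total += value.length + 2; // characters plus surrounding quotes
    } else if (typeof value === "number" || typeof value === "boolean") {
      total += 8; // assumed flat cost
    } else if (value === null || value === undefined) {
      total += 4;
    } else if (Array.isArray(value)) {
      total += 2; // brackets
      for (const item of value) stack.push(item);
    } else if (typeof value === "object") {
      total += 2; // braces
      for (const key of Object.keys(value)) {
        total += key.length + 3; // quoted key plus colon
        stack.push((value as Record<string, unknown>)[key]);
      }
    }
    if (total > limitBytes) return total; // early exit: caller only needs "too big"
  }
  return total;
}
```

A caller such as `isSmallEnoughForSemanticCache()` would then compare the result against the same limit. Because every visited object adds to the total, the early exit also bounds the walk on cyclic input.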
… null-guards, docs

Follow-up commit on PR #2943 review:

- Preserve boolean schemas in `sanitizeClaudeToolSchemas` (Gemini Code Assist,
  high severity). `additionalProperties: false` is the canonical JSON Schema
  lock-down for object tools; the previous coercion silently turned it into the
  permissive `{}`, which would invite models to hallucinate extra arguments
  during tool calling. Same rule now applies to per-property boolean schemas
  under `properties`. Placeholder strings still get the permissive `{}` slot —
  booleans get preserved verbatim.

- Defensive null guards in `cloakThirdPartyToolNames` for `tools[]` and
  `messages[]` entries that might be `null`/`undefined`. Prevents a runtime
  `TypeError` if a malformed payload reaches the cloak.

- Document `CLAUDE_DISABLE_TOOL_NAME_CLOAK` in `.env.example` and
  `docs/reference/ENVIRONMENT.md` (env/docs contract was failing in CI).

- Regression tests covering all of the above (5 boolean preservation cases,
  2 null-tolerance cases).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
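
The boolean-preservation rule from this follow-up can be shown on a single schema slot. `sanitizeSchemaSlot` is a hypothetical helper written for illustration; the real `sanitizeClaudeToolSchemas` handles many more constructs.

```typescript
type SchemaSlot = boolean | Record<string, unknown>;

// Normalises one schema position (e.g. `additionalProperties` or a property schema).
function sanitizeSchemaSlot(slot: unknown): SchemaSlot {
  // Boolean schemas are valid JSON Schema: `false` keeps the object locked down.
  if (typeof slot === "boolean") return slot;
  // Real object schemas pass through unchanged in this sketch.
  if (typeof slot === "object" && slot !== null && !Array.isArray(slot)) {
    return slot as Record<string, unknown>;
  }
  // Anything else (e.g. the "[MaxDepth]" placeholder string) becomes the permissive {}.
  return {};
}
```

With this rule, `additionalProperties: false` survives sanitization, while a truncation placeholder is replaced by an empty schema.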
…ecutor path

The native Claude OAuth guard in executors/base.ts is bypassed when
`upstream_proxy_config.mode = cliproxyapi` routes the request through the
CliproxyAPI executor — it has its own execute()/transformRequest() and never
reaches BaseExecutor.execute(), so the cloak/sanitizer never ran for that
(common) deployment. Wire the same guards into
CliproxyapiExecutor.transformRequest (Anthropic-shape branch), composing with
the existing bisected `mcp_*` reserved-namespace rewrite:

- sanitizeClaudeToolSchemas() on transformed.tools.
- cloakThirdPartyToolNames() with skip = mcp-reserved, so applyMcpToolNameRewrite
  keeps authority over `mcp_*` (its bisected `Mcp_X` form) and the two reverse
  maps stay disjoint / single-hop. Both merge into the non-enumerable
  _toolNameMap the response stream already uses to restore the caller's names.

cloakThirdPartyToolNames is now non-mutating (clones changed entries) to respect
transformRequest's no-input-mutation contract, and takes an optional `skip`
predicate.

Verified end-to-end through the live CPA path: a real ~100-tool harness payload
that returned the "out of extra usage" placeholder now returns 200 with original
tool names restored on the response stream; `mcp_*` tools and genuine PascalCase
Claude Code tools are unaffected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
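
A non-mutating cloak with a `skip` predicate and a reverse map might be sketched as follows. `cloakToolNames` and `toPascalCase` are hypothetical simplifications: the sketch drops null entries, does not handle alias collisions, and omits the Claude Code canonical mapping mentioned earlier.

```typescript
interface Tool {
  name: string;
  [key: string]: unknown;
}

// "web_fetch" -> "WebFetch"
const toPascalCase = (name: string): string =>
  name
    .split(/[_\-\s]+/)
    .filter(Boolean)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join("");

function cloakToolNames(
  tools: ReadonlyArray<Tool | null | undefined>,
  skip: (name: string) => boolean = () => false
): { tools: Tool[]; reverseMap: Map<string, string> } {
  const reverseMap = new Map<string, string>(); // alias -> original name
  const out: Tool[] = [];
  for (const tool of tools) {
    if (!tool || typeof tool.name !== "string") continue; // tolerate malformed entries
    const alias = skip(tool.name) ? tool.name : toPascalCase(tool.name);
    if (alias === tool.name) {
      out.push(tool); // unchanged entries are passed through as-is
      continue;
    }
    reverseMap.set(alias, tool.name);
    out.push({ ...tool, name: alias }); // clone instead of mutating the input
  }
  return { tools: out, reverseMap };
}
```

Passing `(name) => name.startsWith("mcp_")` as `skip` leaves the `mcp_*` names to a separate rewrite, and the reverse map is what a response handler would consult to restore the caller's original names.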
…t boundary

The /dashboard/tools/agent-bridge page (Server Component) passed ALL_TARGETS
directly to AgentBridgePageClient (a Client Component). Each MitmTarget carries
a `handler: () => Promise<...>` function, which Next.js forbids across the
Server/Client boundary, raising at SSR time:
  "Functions cannot be passed directly to Client Components ..."
This broke the whole page ("erro ao carregar", i.e. "error loading").

Fix: introduce MitmTargetView = Omit<MitmTarget, "handler"> and pass a
sanitized array (ALL_TARGETS.map(({ handler, ...rest }) => rest)). The UI never
invokes handler, so behavior is unchanged. Adds a regression test asserting the
sanitized targets are function-free and JSON-serializable.
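
The fix described above can be reproduced in a few lines. The `MitmTarget` fields other than `handler` are made up for this sketch; the `Omit` type and the destructuring map are taken from the commit message.

```typescript
// Hypothetical minimal shape; the real MitmTarget has more fields.
interface MitmTarget {
  id: string;
  label: string;
  handler: () => Promise<unknown>;
}

// View type that is safe to pass to a Client Component.
type MitmTargetView = Omit<MitmTarget, "handler">;

const ALL_TARGETS: MitmTarget[] = [
  { id: "example", label: "Example target", handler: async () => undefined },
];

// Strip the function-valued field before crossing the Server/Client boundary.
const targetViews: MitmTargetView[] = ALL_TARGETS.map(({ handler, ...rest }) => rest);
```

Because `targetViews` contains only plain data, it survives a `JSON.stringify` round trip, which is the property the regression test asserts.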
dangeReis and others added 12 commits June 1, 2026 23:13
…on fixes (#3056)

Co-authored-by: Ruslan Sivak <russ@ruslansivak.com>

…sts)

#3054 ("remove 9 dead/unreachable free providers") removed the petals/nanobanana
configs, registry entries and validators but left dangling references that broke
the build and the unit suite on release/v3.8.8:

- open-sse/executors/petals.ts imported the deleted ../config/petals.ts
  (webpack "Module not found" → `next build` failed). Removed the executor, its
  registration + re-export in executors/index.ts, and the leftover
  `providerId === "petals"` branch in providerAllowsOptionalApiKey.
- Removed tests for the now-deleted providers: executor-petals.test.ts and
  poolside-provider.test.ts (REGISTRY.poolside was removed), and the petals /
  nanobanana validator assertions in provider-validation-specialty.test.ts,
  plus the stale petals catalog assertions in providers-page-utils.test.ts,
  proxy-connection-test.test.ts and providers-route-managed-catalog.test.ts.

The image/video/embed registries for nanobanana/replicate/nomic are real and
untouched — only the dead chat/api-key surfaces were removed. 146/146 affected
tests pass; typecheck / build clean.

…ropic + collapse)

Bugs found while testing the Quota Share engine on the local VPS:

- B1 hidden/stuck pools: pools created while the page group filter was "all"
  were persisted with group_id="all", matched no real group, and rendered
  nowhere — so they could not be seen, edited or deleted. PoolWizard now resolves
  the group id away from the "all" sentinel before POST/PATCH (falls back to the
  first real group / seed group-demo), and QuotaSharePageClient renders an
  "Ungrouped" recovery bucket so already-orphaned pools stay editable/deletable.
- B3 one-connection-per-pool made explicit: existingPoolConnectionIds now spans
  every member connection (not just the primary), and the wizard shows which pool
  an already-used connection belongs to instead of silently disabling it.
- B4 delete group: wired the missing UI control + handler (handleDeleteGroup,
  409-aware) — the backend DELETE handler + deleteGroup already existed. Hidden for
  "all" and the protected seed group-demo.
- B5a endpoints card now surfaces the native Anthropic POST /v1/messages line when
  a claude*/anthropic provider is in scope (previously only /v1/chat/completions).
- B5b endpoints card gained a collapse/minimize toggle (the card was too tall).

Source-scan tests + en/pt-BR i18n parity in quota-share-bugfixes-v388.test.ts.
The larger quota-key redesign (key type bound to a group, default-restricted with
opt-in normal-model access, recoverable keys, api-keys page layout) is planned
separately in _tasks/features-v3.8.8/quota-share-key-redesign.plan.md.
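
The B1 fix (never persisting the "all" filter sentinel as a group id) amounts to a small resolution step before POST/PATCH. The function name `resolveGroupId` is hypothetical; the sentinel and seed id come from the description above.

```typescript
const ALL_GROUPS_SENTINEL = "all";
const SEED_GROUP_ID = "group-demo";

// Returns a real group id to persist: the selection itself unless it is the
// "all" filter sentinel, in which case the first real group or the seed group.
function resolveGroupId(selected: string, realGroupIds: readonly string[]): string {
  if (selected !== ALL_GROUPS_SENTINEL) return selected;
  return realGroupIds.length > 0 ? realGroupIds[0] : SEED_GROUP_ID;
}
```

Pools that were already saved with `group_id="all"` are not repaired by this step, which is why the page also renders the "Ungrouped" recovery bucket.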
Remove the Petals executor from registration and exports.

Improve type safety by replacing broad any usage in MCP tool registration
with inferred types and documenting dynamic handler type limitations.

Add request validation for the agent bridge cert route and expand tests to
ensure switch buttons explicitly declare type="button", preventing implicit
form submissions.

…on fixes (#3059)

* fix(sse): defer enqueuing of event lines to align event names with data lines and prevent stop-signal event name misattribution

* fix(sse): preserve keep-alives and prevent pending event leakage on dropped chunks

* fix(sse): preserve pending event lines before other non-data lines and fix zero-window-size bypass

* fix(sse): defer lastEventLine update until after flush check to preserve previous event context on flush

* fix(sse): flush trailing pendingEventLine when stream closes

* fix(sse): preserve consecutive event lines without intervening data

---------

Co-authored-by: Ruslan Sivak <russ@ruslansivak.com>

…3061) (#3062)

No-auth / keyless providers (opencode, opencode-zen) returned synthetic
"noauth" credentials BEFORE honoring excludeConnectionIds, so the chat
account-fallback loop re-selected the same synthetic connection forever on
a persistent upstream error (e.g. the opencode public endpoint answering
401 "Model X is not supported"). The synthetic id has no DB row, so
markAccountUnavailable could not persist a cooldown to brake it — each
iteration wrote key-health + request logs immediately, growing the DB until
the disk filled (see @paraflu's "failure #320" trace in discussion #3038).

Honor the exclusion set in both synthetic-credential paths
(getProviderCredentials NOAUTH_PROVIDERS block + opencode-zen keyless
fallback): once "noauth" is already excluded, return null so the handler
stops after a single attempt. The happy path (nothing excluded -> synthetic
noauth) is preserved, so keyless access still works.

Closes #3061.

Tests (TDD): tests/unit/auth-noauth-fallback-loop-3061.test.ts — the two
exclusion cases failed before the fix and pass after; two happy-path guards
ensure first-selection synthetic noauth still resolves.
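
The loop and its fix can be modelled in isolation. Everything below is a simplified sketch: `getNoAuthCredentials` and `attemptsUntilStop` are hypothetical names, and the real `getProviderCredentials` takes more inputs.

```typescript
interface Credentials {
  connectionId: string;
  apiKey: string | null;
}

const SYNTHETIC_NOAUTH_ID = "noauth";

// Honors the exclusion set: once "noauth" has been tried and excluded, return null.
function getNoAuthCredentials(excludeConnectionIds: ReadonlySet<string>): Credentials | null {
  if (excludeConnectionIds.has(SYNTHETIC_NOAUTH_ID)) return null;
  return { connectionId: SYNTHETIC_NOAUTH_ID, apiKey: null };
}

// Simplified account-fallback loop: a failed connection is excluded before retrying.
// `maxIterations` only guards the sketch itself against running forever.
function attemptsUntilStop(upstreamAlwaysFails: boolean, maxIterations: number = 1000): number {
  const excluded = new Set<string>();
  let attempts = 0;
  while (attempts < maxIterations) {
    const creds = getNoAuthCredentials(excluded);
    if (!creds) break; // nothing left to try
    attempts++;
    if (!upstreamAlwaysFails) break; // request succeeded
    excluded.add(creds.connectionId);
  }
  return attempts;
}
```

With the exclusion check in place, a persistently failing upstream is attempted once and the loop stops. Without it, the same synthetic connection would be returned on every iteration.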
… not placeholders

The "Available endpoints" card's no-key (default) view generated representative
model ids from a hardcoded PREVIEW_MODELS_BY_PROVIDER map, so providers absent
from that map (claude, xiaomi-mimo, kimi-coding) rendered fake "model-a/b/c"
placeholders. It now fetches the REAL minted qtSd/* combos from /api/combos,
parses them (parseQuotaModelName), and groups by group → provider — falling back
to the placeholder map only when the fetch fails or returns nothing. The per-key
view already showed real models via /api/quota/keys/[id]/models; this aligns the
default view with it.
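
The grouping step could look roughly like the sketch below. It assumes the `qtSd/<group>/<provider>/<model>` naming used elsewhere in this PR; `parseQuotaModelNameSketch`, `groupQuotaModels`, and `modelsForCard` are hypothetical names, and the real `parseQuotaModelName` may behave differently.

```typescript
type ParsedQuotaModel = { group: string; provider: string; model: string };
type Grouped = Map<string, Map<string, string[]>>; // group -> provider -> models

// Assumed format: qtSd/<group>/<provider>/<model>; the model part may itself contain "/".
function parseQuotaModelNameSketch(name: string): ParsedQuotaModel | null {
  const parts = name.split("/");
  if (parts.length < 4 || parts[0] !== "qtSd") return null;
  const [, group, provider, ...rest] = parts;
  if (!group || !provider) return null;
  return { group, provider, model: rest.join("/") };
}

function groupQuotaModels(names: readonly string[]): Grouped {
  const out: Grouped = new Map();
  for (const name of names) {
    const parsed = parseQuotaModelNameSketch(name);
    if (!parsed) continue; // ignore anything that is not a qtSd combo
    const providers = out.get(parsed.group) ?? new Map<string, string[]>();
    const models = providers.get(parsed.provider) ?? [];
    models.push(parsed.model);
    providers.set(parsed.provider, models);
    out.set(parsed.group, providers);
  }
  return out;
}

// Fall back to placeholders only when the fetch failed (null) or yielded no qtSd combos.
function modelsForCard(fetched: readonly string[] | null, placeholders: Grouped): Grouped {
  const real: Grouped = fetched ? groupQuotaModels(fetched) : new Map();
  return real.size > 0 ? real : placeholders;
}
```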

Verified on the local VPS: an exclusive key (share01) returns ONLY the real qtSd
models of its groups (claudao + chinas) and a non-quota key returns []. The
remaining /v1/models leak (non-quota keys still see qtSd among all models) is
tracked in the quota-key redesign plan.
…d, plan presets

- Beta banner scoped to the Quota Share page only (functional, but bugs are expected),
  with a pre-filled "open an issue" link (labels quota-share,beta).
- Endpoints card now also surfaces POST /v1/responses (codex/github) and the
  codex-only WS /v1/responses line (the Responses-over-WebSocket proxy), each gated
  on the in-scope provider slug.
- planRegistry: seed xiaomi-mimo (4.1B-token weekly "lite" cap) and kimi-coding so the
  PoolWizard "Limite" step pre-fills a fair-share limit for these no-balance-API
  providers (fair-share enforces from the proxy's own token count, not an upstream
  balance — set the real cap manually in step 2).
- docs(API_REFERENCE): document the codex Responses-over-WebSocket endpoint.
- i18n en/pt-BR for all new keys.

Tracked in _tasks/features-v3.8.8/quota-share-key-redesign.plan.md (codex-WS config
toggle + per-provider balance fetchers + %-quota attribution are planned follow-ups).
Claude Code (Pro/Max) is a percentage-of-plan quota (5h rolling + weekly cap,
shared Claude+Code); exact token caps are unpublished/task-variable so percent is
the practical unit. Unblocks the PoolWizard 'Limite' pre-fill for claude pools.
Researched plan structures (codex/claude/glm/kimi/minimax/xiaomi) captured in the
quota-share redesign plan.
…n tiers

- xiaomi-mimo: token plan is MONTHLY (per platform.xiaomimimo.com/token-plan), so
  the seed is now tokens/monthly/4.1B (was weekly).
- deepseek: prepaid in USD — its balance API is already wired (deepseekQuotaFetcher)
  and the fair-share engine supports the usd unit (COUNTABLE_UNITS). Seeded a
  usd/monthly preset so the limit is set by dollar value.
- minimax: documented the real M3 tiers (Plus ~1.633B/Max ~5.053B/Ultra ~9.796B)
  in-comment; EPSILON keeps it manual until tier-aware presets land.
- planRegistry already seeds codex/claude/glm/minimax/kimi/kimi-coding/xiaomi-mimo/
  deepseek/bailian/alibaba; PoolWizard 'Limite' step stays editable.

Researched plan structures + the tier-aware-preset follow-up are in the redesign plan.
…ridge-secret auth

Two bugs made `wscat ws://host/v1/responses` fail with
"Transfer-Encoding can't be present with Content-Length":

1. authz/management policy 401'd the proxy's own internal authenticate/prepare
   loopback call to /api/internal/codex-responses-ws (MANAGEMENT-classified, the
   per-process bridge secret wasn't recognized one layer up). Added a tightly-scoped
   carve-out: isValidWsBridgeRequest() honors a timing-safe sha256 match of
   OMNIROUTE_WS_BRIDGE_SECRET (x-omniroute-ws-bridge-secret header) for that exact
   internal path; the route still re-validates the secret. → auth now succeeds → 101.

2. On auth failure the proxy spread the internal fetch's response headers onto the
   raw upgrade socket — a chunked Transfer-Encoding + Next CSP/route-class headers
   collided with writeHttpError's Content-Length framing (and duplicated Content-Type
   via a case-mismatched spread). writeHttpError now strips framing + pipeline/security
   headers (case-insensitive), and the auth-fail callsite no longer forwards them.
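
For the carve-out in item 1, a timing-safe comparison of sha256 digests can be sketched as follows. The function name `isValidWsBridgeRequestSketch` and its parameter shape are assumptions (headers are assumed to be already lowercased); only the path, the header name, and the use of a timing-safe sha256 match come from the commit.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

const WS_BRIDGE_PATH = "/api/internal/codex-responses-ws";
const WS_BRIDGE_HEADER = "x-omniroute-ws-bridge-secret";

function sha256(value: string): Buffer {
  return createHash("sha256").update(value, "utf8").digest();
}

// Hypothetical signature: the real isValidWsBridgeRequest() may take a Request object.
function isValidWsBridgeRequestSketch(
  pathname: string,
  headers: Record<string, string | undefined>,
  expectedSecret: string | undefined, // e.g. process.env.OMNIROUTE_WS_BRIDGE_SECRET
): boolean {
  // The carve-out applies to exactly one internal path.
  if (pathname !== WS_BRIDGE_PATH) return false;
  if (!expectedSecret) return false;
  const presented = headers[WS_BRIDGE_HEADER];
  if (!presented) return false;
  // Hashing first yields equal-length buffers, which timingSafeEqual requires.
  return timingSafeEqual(sha256(presented), sha256(expectedSecret));
}
```

Comparing digests rather than raw strings keeps the comparison constant-time and avoids the length-mismatch exception that `timingSafeEqual` throws on unequal buffers.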

Regression test: tests/unit/responses-ws-proxy-headers.test.mjs (exports writeHttpError;
asserts no TE+CL, single Content-Type, no CSP/route-class leak, safe headers forwarded).
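
The properties that test asserts can be illustrated with a small sketch. It returns the response as a string instead of writing to a socket, `writeHttpErrorSketch` is a hypothetical name, and `x-route-class` is a made-up stand-in for the route-class header, whose real name is not given here.

```typescript
// Headers the error writer owns itself or must never leak. "x-route-class" is a
// hypothetical stand-in for the route-class header mentioned in the commit.
const STRIPPED_HEADERS = new Set([
  "transfer-encoding",
  "content-length",
  "content-type",
  "connection",
  "content-security-policy",
  "x-route-class",
]);

function writeHttpErrorSketch(
  status: number,
  reason: string,
  body: string,
  extraHeaders: Record<string, string> = {},
): string {
  const lines: string[] = [`HTTP/1.1 ${status} ${reason}`];
  for (const [name, value] of Object.entries(extraHeaders)) {
    // Case-insensitive match, so "Transfer-Encoding" and "content-type" are both dropped.
    if (STRIPPED_HEADERS.has(name.toLowerCase())) continue;
    lines.push(`${name}: ${value}`);
  }
  // Framing headers are emitted exactly once, by this function only.
  const length = new TextEncoder().encode(body).length;
  lines.push("Content-Type: application/json", `Content-Length: ${length}`, "Connection: close");
  return lines.join("\r\n") + "\r\n\r\n" + body;
}
```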
…2-table layout)

The key list stacked many badges in one column (tall/cluttered) and didn't
distinguish quota keys. Now renders two sections — "Normal keys" and "Quota keys"
(purple QUOTA pill) — sharing the same compact table header via an extracted
renderKeyRow(). Quota rows prepend a qtSd-only mode chip + group-name chips
(resolved by fetching /api/quota/pools + /api/quota/groups → poolId→group map).
Empty sections are hidden. i18n en/pt-BR for the new labels.

Source-scan test + i18n parity in api-manager-quota-keys-section.test.ts.
Comment thread src/i18n/messages/en.json
"normalKeysSection": "Normal keys",
"quotaKeysSection": "Quota keys",
"quotaPill": "QUOTA",
"quotaModeOnly": "qtSd-only"

SUGGESTION: The translation "qtSd-only" is a nonsensical placeholder that doesn't convey meaning to users. Consider using a meaningful label like "Quota-only" or "Quota mode" instead.

"normalKeysSection": "Chaves normais",
"quotaKeysSection": "Chaves de cota",
"quotaPill": "QUOTA",
"quotaModeOnly": "só-qtSd"

SUGGESTION: The translation "só-qtSd" is a nonsensical placeholder that doesn't convey meaning to users. Consider using a meaningful label like "Cota apenas" or "Modo cota" instead.

…(Check 2.9)

E2E testing on the VPS showed a normal key (empty allowedQuotas) could call a
qtSd/<group>/<provider>/<model> virtual model and route through a shared quota
pool — because the quota-exclusive enforcement (Check 3) only ran when
allowedQuotas was non-empty, so an unallocated key fell through to the normal
model checks and qtSd was served. This is the "empty allowedQuotas = all pools"
gap from the redesign.

Add Check 2.9 in enforceApiKeyPolicy: if the requested model is a qtSd model and
the key is NOT allocated to any quota pool (allowedQuotas empty), reject 403
QUOTA_NOT_ALLOCATED. Allocated keys are unchanged (Check 3 still validates scope).
This matches the owner's rule: only a key selected in a pool may use its qtSd
models. Normal (non-qtSd) model access for normal keys is unchanged.
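
A minimal sketch of the new check, assuming a simplified policy shape: `checkQuotaAllocation` and `PolicyResult` are hypothetical, and the real `enforceApiKeyPolicy` runs this alongside its other checks.

```typescript
type PolicyResult =
  | { allowed: true }
  | { allowed: false; status: number; code: string };

// Assumption: qtSd virtual models are recognised by their "qtSd/" prefix.
function isQuotaModel(model: string): boolean {
  return model.startsWith("qtSd/");
}

// Check 2.9 in isolation: a key with no pool allocation may not call qtSd models.
// Allocated keys pass through here and are scope-validated later (Check 3).
function checkQuotaAllocation(model: string, allowedQuotas: readonly string[]): PolicyResult {
  if (isQuotaModel(model) && allowedQuotas.length === 0) {
    return { allowed: false, status: 403, code: "QUOTA_NOT_ALLOCATED" };
  }
  return { allowed: true };
}
```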

Test: tests/unit/apikeypolicy-quota-only.test.ts — new case asserts a non-quota
key is blocked from qtSd (QUOTA_NOT_ALLOCATED) yet still uses normal models.
…ota sync

The quota-sync path deliberately reuses a rotating-refresh provider's (Codex/
OpenAI/Claude — see refreshSerializer ROTATION_LOCK_GROUP) access_token WITHOUT
proactively refreshing it (#3019, to avoid the Auth0 family-revocation cascade).
When that token is expired the codex usage fetch returns "token expired", and
syncExpiredStatusIfNeeded then flagged the connection testStatus="expired". That
label was spurious: the credential is still valid (expires_at in the future) and the
reactive serialized 401 path refreshes the access_token on next use.

Symptom: freshly-added Codex accounts showed "expired" with no quota on the
quota page, while a providers-page refresh turned them green. They never lost
access — only the quota sync mislabeled them.

Fix: extract the decision into the pure, exported `quotaPathShouldMarkExpired()`
and skip rotating providers (rotationGroupFor !== null). Their status is owned by
the reactive path / connection test, never the quota sync. Adds unit coverage.
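
The extracted decision might look like the sketch below. The signature, the error-message match, and the rotation-group table are assumptions made for illustration; only the rule "skip providers whose rotation group is non-null" comes from the commit.

```typescript
// Hypothetical rotation-group table; the group labels are invented for this sketch.
const ROTATION_GROUPS = new Map<string, string>([
  ["codex", "group-a"],
  ["openai", "group-a"],
  ["claude", "group-b"],
]);

function rotationGroupForSketch(provider: string): string | null {
  return ROTATION_GROUPS.get(provider) ?? null;
}

// Pure decision: should the quota sync flag this connection as expired?
function quotaPathShouldMarkExpiredSketch(provider: string, fetchError: string | null): boolean {
  if (fetchError === null) return false;
  if (!/token expired/i.test(fetchError)) return false;
  // Rotating providers are owned by the reactive 401 path and the connection test.
  return rotationGroupForSketch(provider) === null;
}
```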
…ialized refresh)

Symptom: freshly-added Codex accounts (e.g. davi/gabriel) showed "No quota data"
even when healthy. Root cause: the quota path reuses the access_token without
refreshing rotating providers (#3019, anti Auth0 family-revocation cascade), so a
Codex account whose short-lived access_token has expired can never surface quota
from the sync — the live fetch returns "Codex token expired".

Fix (opt-in, cascade-safe):
- refreshAndUpdateCredentials gains `allowRotatingRefresh` + a pure exported gate
  `shouldAttemptRotatingRefresh`. The actual token mint is wrapped in
  `serializeRefresh` (one refresh at a time per Auth0 rotation group) — so even N
  concurrent per-account requests can never refresh siblings in parallel.
- The BULK scheduler (syncAllProviderLimits, concurrent) keeps the flag OFF →
  #3019 fully preserved (guardian test codex-quota-sync-no-proactive-refresh stays
  green). Only the on-demand, per-connection path (`GET /api/usage/[connectionId]`)
  opts in.
- Frontend: the quota page auto-fetches LIVE on open for the VISIBLE connections
  that have no cached quota (scoped to what's on screen — not all connections —
  and skips entries already cached), so expired-token Codex accounts surface real
  quota automatically and cascade-safely.
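
One way to get "one refresh at a time per rotation group" is a per-group promise chain, sketched below together with a possible shape for the gate. Both functions are illustrative assumptions; the actual `serializeRefresh` and `shouldAttemptRotatingRefresh` implementations are not shown in this PR.

```typescript
// Tail of the pending work for each rotation group.
const chains = new Map<string, Promise<unknown>>();

// Runs `task` only after every earlier task for the same group has settled.
function serializeRefreshSketch<T>(group: string, task: () => Promise<T>): Promise<T> {
  const previous = chains.get(group) ?? Promise.resolve();
  // Run the task whether the previous one resolved or rejected.
  const run = previous.then(task, task);
  // Store a non-rejecting tail so one failure does not poison the chain.
  chains.set(group, run.catch(() => undefined));
  return run;
}

// Gate: non-rotating providers are always eligible; rotating ones only when the
// caller opts in (the on-demand path), never from the bulk scheduler.
function shouldAttemptRotatingRefreshSketch(
  rotationGroup: string | null,
  allowRotatingRefresh: boolean,
): boolean {
  return rotationGroup === null || allowRotatingRefresh;
}
```

Because each call waits on the previous tail for its group, N concurrent per-account requests in the same group execute their refreshes strictly one after another, while different groups proceed independently.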

Adds unit coverage for the gate (bulk skips rotating, on-demand allows; non-rotating
always eligible). typecheck / lint clean.
…c mitm manager stub

The Docker image build (`docker compose --profile cli build`) runs `next build`
with OMNIROUTE_USE_TURBOPACK=1 and failed with two Turbopack errors that the
webpack-based VM build never hits — which is why the VM deploy validated but the
Docker build errored (#3066). The reporter's log was truncated before the real
errors; reproducing `OMNIROUTE_USE_TURBOPACK=1 npm run build` locally surfaced them:

1. node_modules/sqlite-vec-linux-x64/vec0.so — "Unknown module type". sqlite-vec
   ships a native vec0.so loaded at runtime via createRequire(); Turbopack tried to
   bundle the .so. Fixed by adding "sqlite-vec" to serverExternalPackages, exactly
   like better-sqlite3.

2. /api/tools/agent-bridge/state statically imports getAllAgentsStatus from
   @/mitm/manager, which next.config aliases to manager.stub.ts for the Turbopack
   build. The stub did not export getAllAgentsStatus → "Export getAllAgentsStatus
   doesn't exist in target module". Added the export (throws like the other heavy
   ops — MITM/agent-bridge is non-functional in the bundled build anyway).

Tests (tests/unit/next-config.test.ts):
- assert sqlite-vec is in serverExternalPackages.
- new guard: manager.stub.ts must export every name statically imported from
  @/mitm/manager across src/app (catches stub/manager drift — would have caught this).

Verified: OMNIROUTE_USE_TURBOPACK=1 npm run build → EXIT 0 (was: Build error
occurred); webpack build → EXIT 0; typecheck:core / check:cycles / lint clean.

Fixes #3066
… (review feedback)

Follow-up to 146244b (#3066), addressing optional review suggestions:

- manager.stub.ts: getAllAgentsStatus now returns [] (the truthful "no agents"
  state, type-faithful) instead of throwing. Unlike the dynamic-import heavy ops,
  this is a STATIC import baked into the Turbopack/bundled build, so it is
  legitimately reached at runtime there — returning an empty list degrades
  gracefully instead of erroring. (Functionally inert for the existing
  agent-bridge/state route, where getMitmStatus already rejects first.)

- next-config.test.ts: the stub-drift guard no longer hard-asserts a specific
  symbol (getAllAgentsStatus); the generic ">=1 import found" sanity plus the
  missing-exports check remain, so the guard survives an agent-bridge /
  traffic-inspector route being renamed or removed.

typecheck:core / lint / next-config suite (4/4) clean. The export still exists,
so the Turbopack build resolution is unchanged.
… review)

Addresses findings from the multi-agent PR review of the #3066 fix:

- manager.stub.ts comments: the previous inline comment claimed the throwing ops
  (getMitmStatus/startMitm/stopMitm) are "dynamic-import paths that should never hit
  the stub at runtime" — factually wrong: those are static imports too, baked into the
  bundled build just like getAllAgentsStatus. Rewrote the file header to describe the
  real split — exports with a safe degraded value return it (getCachedPassword/
  setCachedPassword/clearCachedPassword → null/no-op, getAllAgentsStatus → []) while
  getMitmStatus/startMitm/stopMitm throw STUB_ERROR — and trimmed the inline comment.
  Comment-only; no runtime/build change (the export still exists).

- stub-drift guard test: now scans ALL of src/ instead of only src/app —
  src/lib/tailscaleTunnel.ts statically imports getCachedPassword/setCachedPassword
  from @/mitm/manager and is pulled into routes transitively, so the src/app-only scan
  had a false-negative blind spot. Also skips inline `type` imports (erased at build,
  need no runtime export) and detects stub exports from declaration AND `export { … }`
  forms (no false-positive if the stub later uses class/re-export).
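
The guard's core comparison can be sketched with two regex-based extractors. This is an approximation under stated assumptions: `importedNames`, `exportedNames`, and `missingExports` are hypothetical names, and the real test in tests/unit/next-config.test.ts may parse sources differently.

```typescript
// Runtime names imported from `moduleId`, skipping type-only imports.
function importedNames(source: string, moduleId: string): string[] {
  const escaped = moduleId.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    `import\\s+(type\\s+)?\\{([^}]*)\\}\\s*from\\s*["']${escaped}["']`,
    "g",
  );
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    if (match[1]) continue; // `import type { ... }` is erased at build time
    for (const raw of (match[2] ?? "").split(",")) {
      const spec = raw.trim();
      if (!spec || spec.startsWith("type ")) continue; // inline `type` import
      const name = spec.split(/\s+as\s+/)[0];
      if (name) names.push(name.trim());
    }
  }
  return names;
}

// Names the stub exports, from declarations and from `export { ... }` lists.
function exportedNames(stubSource: string): Set<string> {
  const names = new Set<string>();
  const declaration =
    /export\s+(?:async\s+)?(?:function\*?|const|let|var|class)\s+([A-Za-z_$][\w$]*)/g;
  const list = /export\s*\{([^}]*)\}/g;
  let match: RegExpExecArray | null;
  while ((match = declaration.exec(stubSource)) !== null) {
    if (match[1]) names.add(match[1]);
  }
  while ((match = list.exec(stubSource)) !== null) {
    for (const raw of (match[1] ?? "").split(",")) {
      const parts = raw.trim().split(/\s+as\s+/);
      const exported = parts[parts.length - 1];
      if (exported) names.add(exported.trim());
    }
  }
  return names;
}

function missingExports(sources: readonly string[], stubSource: string, moduleId: string): string[] {
  const available = exportedNames(stubSource);
  const wanted = new Set(sources.flatMap((s) => importedNames(s, moduleId)));
  return [...wanted].filter((name) => !available.has(name)).sort();
}
```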

Verified: next-config suite 4/4, typecheck:core / lint clean.
…emory in Docker)

Completes the #3066 fix. Externalizing sqlite-vec unblocked the Turbopack build, but
Next.js does not trace sqlite-vec's platform-specific native package
(sqlite-vec-<os>-<arch>, which ships vec0.so) into .next/standalone — sqlite-vec
resolves it at runtime via require.resolve() (Next.js issue #88844). Result: in the
bundled/Docker build the wrapper loaded but getLoadablePath() threw MODULE_NOT_FOUND,
so vectorStore silently degraded vector/semantic memory to FTS5 keyword search.

build-next-isolated now syncs the sqlite-vec wrapper plus whichever sqlite-vec-<platform>
package npm installed into the standalone output (mirroring the existing better-sqlite3
native-binary handling). Platform-agnostic, so Docker (linux) and Electron (mac/win/linux)
builds all carry their matching vec0.so/.dylib/.dll.

Verified: vec0.so present in .next/standalone/node_modules/sqlite-vec-linux-x64;
createRequire("sqlite-vec") + require.resolve("sqlite-vec-linux-x64/vec0.so") both
resolve from inside the standalone (no FTS5 fallback). build-next-isolated tests 7/7.
@diegosouzapw
Owner Author

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

Scope: the #3066 Docker/Turbopack build-fix commits added to this release branch (e438139b0..34e0eab09: next.config.mjs, scripts/build/build-next-isolated.mjs, src/mitm/manager.stub.ts, tests/unit/next-config.test.ts) — not the full 542-commit release diff. Five independent passes (CLAUDE.md adherence, bug scan, git history, prior-PR comments, code comments); the two candidate comment nits both scored as false positives (pre-existing / already-documented).

🤖 Generated with Claude Code

Back-merge to resolve PR #2930 (release/v3.8.8 -> main) conflicts. Release is a
superset of main's features, so all ~44 content conflicts resolved to the
release ("ours") version; generated .source/* dropped.

Reconciliation:
- auth.ts: port #3058 (getProviderSearchPool expands custom provider_nodes
  prefixes to internal connection ids) — release lacked this main fix.
- quota-plan-registry.test.ts: align knownProviders() 6 -> 10 (pre-existing
  stale assertion vs the registry).
The "recent" memory strategy maps to the internal "exact" retrieval path, whose
post-query relevance filter (score > 0) silently dropped recent memories whose
text didn't overlap the current prompt. Since the user-facing strategy enum is
only recent|semantic|hybrid (no "exact"), forwarding the prompt as `query` for
"recent" always engaged that filter, so recency-based injection returned nothing
when the prompt was unrelated to the stored memory.

Skip query forwarding for the "recent" strategy so retrieveMemories returns the
most recent memories (ORDER BY created_at DESC) regardless of prompt overlap.
Semantic/hybrid still forward the query for vector search.
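
A toy model of the fix, for illustration only: `retrieveMemoriesSketch` stands in for the "exact" retrieval path with a word-overlap filter (the real scoring and the vector search for semantic/hybrid are not reproduced), and `injectMemories` is a hypothetical caller.

```typescript
type MemoryStrategy = "recent" | "semantic" | "hybrid";
type Memory = { text: string; createdAt: number };

// Newest first; when a query is present, keep only memories that share at
// least one word with it (a stand-in for the score > 0 relevance filter).
function retrieveMemoriesSketch(store: readonly Memory[], limit: number, query?: string): Memory[] {
  const newestFirst = [...store].sort((a, b) => b.createdAt - a.createdAt);
  if (query === undefined) return newestFirst.slice(0, limit);
  const words = new Set(query.toLowerCase().split(/\s+/).filter(Boolean));
  return newestFirst
    .filter((m) => m.text.toLowerCase().split(/\s+/).some((w) => words.has(w)))
    .slice(0, limit);
}

function injectMemories(
  strategy: MemoryStrategy,
  prompt: string,
  store: readonly Memory[],
  limit: number,
): Memory[] {
  // The fix: "recent" does not forward the prompt, so the relevance filter never engages.
  const query = strategy === "recent" ? undefined : prompt;
  return retrieveMemoriesSketch(store, limit, query);
}
```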

Fixes the chat-pipeline + memory-pipeline integration memory-injection tests.
Two pre-existing failures (release CI never ran):
1. Auth: the /api/compliance/audit-log route now requires management auth
   (requireManagementAuth). The test issued bare requests; in CI INITIAL_PASSWORD
   makes auth required, so it got 401. Now attaches a signed dashboard-session
   cookie via the shared managementSession helper (like sibling management-route
   tests).
2. Taxonomy: the test seeded stale action names (provider.added, combo.created)
   and treated provider.validation.ssrf_blocked as non-high. Aligned the seed to
   real HIGH_LEVEL_ACTIONS (provider.credentials.created, quota.pool.created) so
   the level=high filter assertion validates the actual filter.
…3058 follow-up)

In checkModelAvailable and handleSingleModelChat, when the combo target's
providerId is merely the prefix already encoded in the model string (e.g. "p2"
from "p2/test-model"), prefer the fully-resolved provider (e.g. the generated
custom node id openai-compatible-chat-e2e-p2) so the executor resolves the
custom baseUrl from the connection instead of falling back to the base provider
(real openai). Intentional providerId overrides (providerId not encoded in the
model string) are preserved.
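
The selection rule can be summarised in a few lines. This is a sketch with a hypothetical helper name (`pickProviderId`); the real logic lives inside checkModelAvailable and handleSingleModelChat.

```typescript
// Decide which provider id the executor should use for a combo target.
function pickProviderId(
  modelString: string,
  targetProviderId: string | undefined,
  resolvedProviderId: string,
): string {
  if (!targetProviderId) return resolvedProviderId;
  const slash = modelString.indexOf("/");
  const prefix = slash > 0 ? modelString.slice(0, slash) : null;
  // If the target's providerId only repeats the model prefix, it carries no extra
  // intent, so the fully resolved provider (with its custom baseUrl) wins.
  // Any other providerId is an intentional override and is kept.
  return targetProviderId === prefix ? resolvedProviderId : targetProviderId;
}
```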

Also fixes the resilience-http-e2e combo tests (cooldown window + DB-write
visibility for the cooled-down-primary skip).
- quota-equal-split / quota-summed-budget: drop top-level `await` from test()
  registrations. Under --test-force-exit --test-concurrency=4 the awaited
  registrations were cancelled mid-module-eval when a sibling's slow SQLite
  migration briefly emptied the event loop. No assertions changed.
- proxy-registry-flow: the legacy /api/settings/proxy GET is now a unified bridge
  over the new proxy registry; after an atomic create-with-assignment it resolves
  to the newly assigned proxy (atomic-flow) and supersedes the legacy config —
  assert that instead of expecting null.
- e2e: agent-skills redirect regex now matches the bare /login auth redirect;
  memory-qdrant uses the unique heading locator (strict-mode fix); group-b specs
  navigate to the real pages / tolerate the auth redirect like sibling specs;
  playground-compare checks the toolbar control (Run all|Cancel all) per state.
- quota-combo-balancing / quota-multiprovider: eliminate the per-test full SQLite
  migration (migrate once at module load; resetStorage now DELETEs rows instead of
  rmSync+re-migrate) so it no longer races --test-force-exit under concurrency;
  drain the fire-and-forget syncQuotaCombosGuarded dispatched by createPool/
  updatePool (flushPendingSyncs via setImmediate) so assertions see deterministic
  combo state; assert the GROUP-slug combo name (combos are named by group, like
  quota-combo-groups) and seed the group. Validated on CI shards 5/8 + 8/8 (7 runs).
- playground-compare: wait for CompareTab (dynamic import) to mount, and use
  expect().toBeVisible(), which retries until the timeout, instead of
  locator.isVisible(), which returns immediately without waiting.
- group-b-quota-plans-config: drop the unreliable raw-HTML "500" substring check
  (Next.js chunk hashes contain "500"); keep the real error-boundary text check.

Development

Successfully merging this pull request may close these issues.

[Bug] MaintenanceBanner shows false 'Server is unreachable' under load