Credential ranking ignores resetsAt for non-blocked accounts — wastes perishable (tumbling-window) quota #633

@devswha

Summary

AuthStorage.#rankOAuthSelections (packages/ai/src/auth-storage.ts, comparator ~L2528) ranks non-blocked OAuth credentials purely by least usage / lowest drain rate. A window's resetsAt is only consulted to order already-blocked credentials by blockedUntil; accounts that still have headroom are picked solely by "who is least drained."

For Claude Max (and Codex) the 5h/7d limits are tumbling windows — unused quota is lost at reset. Least-used-first can route load to a far-resetting account while a soon-resetting account's remaining quota silently expires.

Why this is suboptimal

For perishable (tumbling-window) quota, the throughput-optimal greedy policy is earliest-expiry-first (EEF): drain the soonest-to-reset account that still has headroom, because its quota is the most perishable.

Example: A (5h window resets in 10m, 30% left) and B (5h window resets in 4h, 50% left):

  • Current (least-used-first) picks B → A resets in 10m with its remaining ~30% wasted, and B is drained further.
  • Earliest-reset picks A → A's about-to-perish quota is used; B stays available for 4h.

At low/medium load this is a structural quota leak.
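
To make the leak concrete, here is a minimal TypeScript sketch with illustrative figures. The `Account` shape, the helper names, and the single-burst demand model are assumptions made for this example; they are not the actual `AuthStorage` internals.

```typescript
// Hypothetical shape for illustration only.
interface Account {
  id: string;
  usedPercent: number;     // 0..100 of the current window already consumed
  resetsInMinutes: number; // time until the window tumbles and unused quota is lost
}

// Return the item with the smallest key; the first item wins ties.
function minBy<T>(items: T[], key: (item: T) => number): T {
  if (items.length === 0) throw new Error("no candidates");
  return items.reduce((best, item) => (key(item) < key(best) ? item : best));
}

// Least-used-first: prefer the account with the most headroom.
function pickLeastUsed(accounts: Account[]): Account {
  return minBy(accounts, (a) => a.usedPercent);
}

// Earliest-reset-first: among accounts with headroom, prefer the soonest reset.
function pickEarliestReset(accounts: Account[]): Account {
  return minBy(
    accounts.filter((a) => a.usedPercent < 100),
    (a) => a.resetsInMinutes,
  );
}

// Percent of a window that expires unused at the soonest reset, assuming the
// whole demand is served by `picked` before that reset happens.
function expiredAtNextReset(accounts: Account[], picked: Account, demandPercent: number): number {
  const soonest = minBy(accounts, (a) => a.resetsInMinutes);
  const headroom = 100 - soonest.usedPercent;
  const served = picked.id === soonest.id ? Math.min(demandPercent, headroom) : 0;
  return headroom - served;
}

const accounts: Account[] = [
  { id: "soon", usedPercent: 70, resetsInMinutes: 10 },
  { id: "late", usedPercent: 50, resetsInMinutes: 240 },
];
const demandPercent = 20;

const balancedPick = pickLeastUsed(accounts);
const eefPick = pickEarliestReset(accounts);
console.log(balancedPick.id, expiredAtNextReset(accounts, balancedPick, demandPercent)); // late 30
console.log(eefPick.id, expiredAtNextReset(accounts, eefPick, demandPercent));           // soon 10
```

Under these assumed numbers, least-used-first lets 30% of a window expire, while earliest-reset-first cuts that to 10% and leaves the far-resetting account untouched.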

Proposal

Add an opt-in ranking mode, default unchanged:

  • AuthStorageOptions.credentialRankingMode: "balanced" (default) | "earliest-reset".
  • In earliest-reset, among non-blocked candidates, add a dominant sort key on the soonest window resetsAt (primary window, falling back to secondary), keeping the existing drain/used terms as tiebreakers.
  • Wire via env GJC_CREDENTIAL_RANKING_MODE in discoverAuthStorage (coding-agent/src/sdk.ts), matching the existing env-driven pattern (GJC_AUTH_BROKER_URL).
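
The option and env wiring could look roughly like the sketch below. Only `credentialRankingMode`, its two values, and `GJC_CREDENTIAL_RANKING_MODE` come from the proposal; `AuthStorageOptionsSketch`, `parseCredentialRankingMode`, and `optionsFromEnv` are hypothetical names, and the trimming, case-folding, and silent fallback to the default are assumptions rather than settled behavior.

```typescript
type CredentialRankingMode = "balanced" | "earliest-reset";

const RANKING_MODES: readonly CredentialRankingMode[] = ["balanced", "earliest-reset"];

// Parse a raw env value. Missing or unrecognized values keep the default,
// so a typo can never switch ranking behavior by accident (assumed policy).
function parseCredentialRankingMode(raw: string | undefined): CredentialRankingMode {
  const value = raw?.trim().toLowerCase();
  return RANKING_MODES.find((mode) => mode === value) ?? "balanced";
}

// Stand-in for the relevant slice of AuthStorageOptions (hypothetical name).
interface AuthStorageOptionsSketch {
  credentialRankingMode: CredentialRankingMode;
}

// Build the option from an env-like map; the env is passed in rather than
// read globally so the function stays easy to test.
function optionsFromEnv(env: Record<string, string | undefined>): AuthStorageOptionsSketch {
  return {
    credentialRankingMode: parseCredentialRankingMode(env["GJC_CREDENTIAL_RANKING_MODE"]),
  };
}

const opts = optionsFromEnv({ GJC_CREDENTIAL_RANKING_MODE: "earliest-reset" });
console.log(opts.credentialRankingMode); // earliest-reset
```

In a Node process the caller would pass `process.env`; taking the map as a parameter keeps the sketch free of runtime-specific globals.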

Why low-risk / low-cost

  • balanced path is byte-identical to today — zero change for existing users.
  • resetsAt, used fraction, and drain rate are already computed inside #rankOAuthSelections — this is a comparator/weighting change + a flag, no new telemetry.
  • Session stickiness / prompt-cache locality is already preserved by the existing shouldRank guard (L2581-2607): re-ranking only happens at session start or when the preferred credential is blocked, so the earliest-reset ordering never thrashes accounts mid-session.
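
The stickiness condition described in the last bullet can be sketched as follows. This is a guess at the guard's shape based only on the description above; the real code at L2581-2607 is not reproduced here, and `SessionState` and `preferredCredentialId` are hypothetical names.

```typescript
// Hypothetical session shape: a session pins a credential after its first pick.
interface SessionState {
  preferredCredentialId?: string;
}

// Re-rank only at session start (nothing pinned yet) or when the pinned
// credential is currently blocked; otherwise keep using the pinned one.
function shouldRank(session: SessionState, blockedIds: ReadonlySet<string>): boolean {
  const preferred = session.preferredCredentialId;
  if (preferred === undefined) return true; // session start
  return blockedIds.has(preferred);         // preferred credential is blocked
}

const blocked = new Set<string>(["acct-b"]);
console.log(shouldRank({}, blocked));                                  // true
console.log(shouldRank({ preferredCredentialId: "acct-a" }, blocked)); // false
console.log(shouldRank({ preferredCredentialId: "acct-b" }, blocked)); // true
```

Because the ranking mode is only consulted when this guard returns true, switching the comparator cannot by itself move an in-flight session to another account.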

Constraints honored

  • Blocked/exhausted accounts still sort last regardless of mode (no driving an imminent-reset account into a hard 429 mid-task; mid-request limit hits still fall back via #markCredentialBlocked).
  • earliest-reset only weights by resetsAt when window data exists; with no usage data it is a no-op (falls back to round-robin order).
  • Intended for tumbling-window providers (Claude 5h/7d, Codex); deployments with sliding-window limits simply stay on balanced, since the dominant key is gated behind the opt-in.
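
Taken together, the constraints above suggest a comparator along these lines. It is a sketch under stated assumptions, not the actual `#rankOAuthSelections` code: the `Candidate` fields, `compareCandidates`, and `rank` are hypothetical, and the existing drain/used terms are reduced to two plain numeric tiebreakers.

```typescript
type CredentialRankingMode = "balanced" | "earliest-reset";

// Hypothetical candidate shape; timestamps are epoch milliseconds.
interface Candidate {
  id: string;
  blockedUntil?: number;
  primaryResetsAt?: number;
  secondaryResetsAt?: number;
  usedFraction?: number;
  drainRate?: number;
}

// Primary window first, falling back to secondary; no data sorts last.
function soonestReset(c: Candidate): number {
  return c.primaryResetsAt ?? c.secondaryResetsAt ?? Number.POSITIVE_INFINITY;
}

function compareCandidates(a: Candidate, b: Candidate, mode: CredentialRankingMode, now: number): number {
  const aUntil = a.blockedUntil ?? 0;
  const bUntil = b.blockedUntil ?? 0;
  const aBlocked = aUntil > now;
  const bBlocked = bUntil > now;
  // Blocked credentials sort last regardless of mode.
  if (aBlocked !== bBlocked) return aBlocked ? 1 : -1;
  if (aBlocked && bBlocked) return aUntil - bUntil;
  if (mode === "earliest-reset") {
    const ra = soonestReset(a);
    const rb = soonestReset(b);
    // Compare rather than subtract: Infinity - Infinity would be NaN.
    if (ra !== rb) return ra < rb ? -1 : 1;
  }
  // Existing terms act as tiebreakers (and as the whole key in balanced mode).
  const used = (a.usedFraction ?? 0) - (b.usedFraction ?? 0);
  if (used !== 0) return used;
  return (a.drainRate ?? 0) - (b.drainRate ?? 0);
}

// Array.prototype.sort is stable, so full ties keep their incoming order.
function rank(candidates: Candidate[], mode: CredentialRankingMode, now: number): string[] {
  return [...candidates].sort((a, b) => compareCandidates(a, b, mode, now)).map((c) => c.id);
}

const now = 1_000_000;
const pool: Candidate[] = [
  { id: "blocked", blockedUntil: now + 60_000, usedFraction: 1 },
  { id: "soon", primaryResetsAt: now + 600_000, usedFraction: 0.7 },
  { id: "late", primaryResetsAt: now + 14_400_000, usedFraction: 0.5 },
];
console.log(rank(pool, "balanced", now));       // [ 'late', 'soon', 'blocked' ]
console.log(rank(pool, "earliest-reset", now)); // [ 'soon', 'late', 'blocked' ]
```

When no candidate carries window or usage data, every comparison returns 0 and the stable sort preserves the incoming order, which matches the no-op behavior described above.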

PR follows, referencing this issue.
