Make provider availability snapshot reliable for MCP-heavy providers #1314
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -27,7 +27,11 @@ import { | |
| import { applyMutableProviderConfigToOverrides } from "../daemon-config-store.js"; | ||
| import type { MutableDaemonConfig } from "../daemon-config-store.js"; | ||
|
|
||
| - const DEFAULT_REFRESH_TIMEOUT_MS = 30_000; | ||
| + // MCP-heavy providers (omp/pi with many configured MCP servers, opencode) can | ||
| + // take well over 30s to start an RPC session and enumerate models/commands when | ||
| + // several providers refresh concurrently. Use a generous budget so those probes | ||
| + // resolve to available instead of erroring out under contention. | ||
| + const DEFAULT_REFRESH_TIMEOUT_MS = 90_000; | ||
|
|
||
| type ProviderSnapshotChangeListener = (entries: ProviderSnapshotEntry[], cwd: string) => void; | ||
|
|
||
|
|
@@ -520,9 +524,16 @@ export class ProviderSnapshotManager { | |
| } | ||
|
|
||
| private async loadProviders(options: ProviderLoadOptions): Promise<void> { | ||
| - await Promise.allSettled( | ||
| - options.providers.map((provider) => this.loadProvider({ ...options, provider })), | ||
| - ); | ||
| + // Probe providers sequentially rather than all at once. MCP-heavy providers | ||
| + // (omp/pi with many configured MCP servers, opencode) each spawn an RPC | ||
| + // session and connect every configured MCP server during their probe; | ||
| + // running them concurrently starves CPU/IO on smaller hosts and makes | ||
| + // availability probes flake with spurious timeouts/crashes. Those failures | ||
| + // are then cached and gate on-demand model fetches. Sequential probing | ||
| + // trades a longer total refresh for reliable per-provider results. | ||
| + for (const provider of options.providers) { | ||
| + await this.loadProvider({ ...options, provider }).catch(() => undefined); | ||
| + } | ||
|
Comment on lines +534 to +536
The behavioral change from Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time! |
||
| } | ||
|
|
||
| private loadProvider(options: ProviderLoadOptions & { provider: AgentProvider }): Promise<void> { | ||
|
|
||
`refreshProvider` applies `this.refreshTimeoutMs` twice per provider: once to `isAvailable()` and again to `fetchModels` + `fetchModes`. If both phases hit the ceiling, a single provider consumes up to 2 × 90 s = 3 min. For an N-provider setup on a daemon restart, the sequential loop can now block for up to N × 3 min before any snapshot is considered fresh. Lightweight providers are fine in practice (binary-presence checks return in milliseconds), but if any slow provider is positioned early in `options.providers`, it delays all providers behind it. A small concurrency limit (e.g. 2–3 parallel probes) would bound the latency regression while still resolving the contention problem the PR targets; the PR description mentions this as a ready alternative.
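To make the suggestion concrete, here is a minimal sketch of a bounded-concurrency probe loop together with the worst-case arithmetic from the comment above. The names `probeWithLimit`, `ProbeResult`, and `worstCaseRefreshMs` are hypothetical and do not exist in this codebase. The sketch assumes each probe is an independent async call (as `loadProvider` appears to be) and that every phase can take the full timeout.

```typescript
// Hypothetical result shape, modelled on Promise.allSettled entries.
type ProbeResult = { status: "fulfilled" } | { status: "rejected"; reason: unknown };

// Hypothetical helper: run `probe` over `items` with at most `limit` in flight.
// A failing probe is recorded for its item and never aborts the others.
async function probeWithLimit<T>(
  items: readonly T[],
  limit: number,
  probe: (item: T) => Promise<void>,
): Promise<ProbeResult[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results: ProbeResult[] = new Array(items.length);
  let next = 0;

  // Each worker claims the next unprobed index until the list is exhausted.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      try {
        await probe(items[index]);
        results[index] = { status: "fulfilled" };
      } catch (reason) {
        results[index] = { status: "rejected", reason };
      }
    }
  }

  const workerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Upper bound on refresh time when every phase of every probe hits its timeout.
function worstCaseRefreshMs(
  providerCount: number,
  timeoutMs: number,
  phasesPerProvider: number,
  concurrency: number,
): number {
  const waves = Math.ceil(providerCount / concurrency);
  return waves * phasesPerProvider * timeoutMs;
}
```

Under these assumptions, four providers with two 90 s phases each give a bound of `worstCaseRefreshMs(4, 90_000, 2, 1)` = 720 000 ms (12 min) for the sequential loop, and `worstCaseRefreshMs(4, 90_000, 2, 2)` = 360 000 ms (6 min) with two probes in flight. Whether a limit of 2 or 3 still avoids the CPU/IO starvation the PR describes would need to be measured on the affected hosts.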