-
Notifications
You must be signed in to change notification settings - Fork 1.3k
feat: automatic model fallback system with background support #795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: automatic model fallback system with background support #795
Conversation
Add support for connecting to remote MCP servers via HTTP in addition to
the existing stdio (local process) connections. This enables skills to
use cloud-hosted MCP servers and aggregated MCP gateways.
## Changes
- Extend SkillMcpManager to detect connection type from config:
- Explicit `type: "http"` or `type: "sse"` → HTTP connection
- Explicit `type: "stdio"` → stdio connection
- Infer from `url` field → HTTP connection
- Infer from `command` field → stdio connection
- Add StreamableHTTPClientTransport from MCP SDK for HTTP connections
- Supports custom headers for authentication (e.g., API keys)
- Proper error handling with helpful hints
- Maintain full backward compatibility with existing stdio configurations
## Usage
```yaml
# HTTP connection (new)
mcp:
remote-server:
url: https://mcp.example.com/mcp
headers:
Authorization: Bearer ${API_KEY}
# stdio connection (existing, unchanged)
mcp:
local-server:
command: npx
args: [-y, @some/mcp-server]
```
## Tests
Added comprehensive tests for:
- Connection type detection (explicit type vs inferred)
- HTTP URL validation and error messages
- Headers configuration
- Backward compatibility with stdio configs
Add mocks for HTTP transport to avoid real network calls during tests. This addresses reviewer feedback about test reliability: - Tests are now faster (no network latency) - Tests are deterministic across environments - Test intent is clearer (unit testing error handling logic) The mock throws immediately with a controlled error message, allowing tests to validate error handling without network dependencies. Co-Authored-By: Claude Opus 4.5 <[email protected]>
…a fields Add new optional fields to AgentOverrideConfigSchema and CategoryConfigSchema: - fallback: string[] - array of fallback model strings - fallbackDelayMs: number (100-30000) - delay between retry attempts - fallbackRetryCount: number (1-5) - maximum models to try These fields enable automatic model fallback when primary model fails due to rate limits, service unavailability, or other transient errors.
Implement core model fallback utilities in src/features/model-fallback/: - parseModelString(): Parse 'provider/model' format into ModelSpec - modelSpecToString(): Convert ModelSpec back to string - buildModelChain(): Build ordered list of models from primary + fallbacks - isModelError(): Detect retryable errors (rate limits, 429, 503, timeouts) - withModelFallback(): Generic retry wrapper for simple use cases - formatRetryErrors(): Format error list for user-friendly display The module detects these transient errors for automatic retry: - Rate limits (rate_limit, 429) - Service unavailable (503, 502, unavailable, overloaded, capacity) - Network errors (timeout, ECONNREFUSED, ENOTFOUND) Non-model errors (auth failures, invalid input) are NOT retried.
Update sisyphus-task to use the model fallback system: Integration changes: - Import parseModelString, isModelError, RetryConfig from model-fallback - Update resolveCategoryConfig() to return modelChain and retryConfig - Use configurable fallbackDelayMs and fallbackRetryCount from config - Build model chain from primary model + fallback array User notification on fallback: - Show which model was actually used in task completion message - Display warning when fallback occurred with list of failed models - Example: '⚠️ Model fallback occurred: Primary model failed, used anthropic/claude-haiku-4-5 instead. Failed models: deepseek/deepseek-v3-2' Special handling: - Agent configuration errors (agent.name) do NOT trigger fallback - Only transient model errors (rate limits, timeouts) retry with fallback - Inline retry logic preserved for session lifecycle management
Add 36 unit tests covering all model-fallback functions: parseModelString() tests: - Valid provider/model strings - Nested paths (provider/sub/model) - Invalid single-segment strings - Empty string handling buildModelChain() tests: - Primary model only - Primary with fallback array - Invalid models filtered out - Empty/undefined fallback handling isModelError() tests: - Rate limit detection (rate_limit, 429) - Service errors (503, 502, unavailable, overloaded, capacity) - Network errors (timeout, ECONNREFUSED, ENOTFOUND) - Non-model errors return false - Non-Error objects return false withModelFallback() tests: - Success on first model - Fallback on model error - No retry on non-model errors - Exhausts all models on persistent failures - Respects maxAttempts config - Empty model chain handling - Records usedModel on success/failure formatRetryErrors() tests: - Empty array handling - Single error formatting - Multiple errors as numbered list
- P1: Default maxAttempts=1 when no category (was 0, sync prompts failed) - P2: Preserve variant on primary model when building modelChain - P2: Clean up subagentSessions on early returns (memory leak) - P2: Ensure maxAttempts >= 1 even with empty modelChain - P2: Update warning text to use generic model descriptions
P1 Fixes: - maxAttempts validation ensures minimum 1 attempt P2 Fixes: - Remove over-broad undefined error matching (only check for agent.name) - Add modelChain support to background tasks (LaunchInput, BackgroundTask) - Pass modelChain to background manager for fallback support All model-fallback tests pass (36/36).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4 issues found across 12 files
Confidence score: 3/5
- Duplicate
session.promptinvocation insrc/features/background-agent/manager.tscan double-run the same task and leak concurrency before registration, which is a concrete regression risk. parseModelStringinsrc/features/model-fallback/index.tscurrently accepts empty provider/model pieces, so malformed specs could enter the fallback chain and trigger unexpected fallbacks.src/features/skill-mcp-manager/manager.test.tsdoesn’t confirm headers reach the transport layer, leaving header regressions undetected even though the test would still pass.- Pay close attention to
src/features/background-agent/manager.ts,src/features/skill-mcp-manager/manager.test.ts,src/features/model-fallback/index.ts,pr_790_body.txt- functionality and documentation mismatches need to be resolved.
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="pr_790_body.txt">
<violation number="1" location="pr_790_body.txt:3">
P2: PR body describes HTTP transport for SkillMcpManager, which does not match the PR title about automatic model fallback, likely misleading reviewers/release notes.</violation>
</file>
<file name="src/features/background-agent/manager.ts">
<violation number="1" location="src/features/background-agent/manager.ts:94">
P1: Duplicate session.prompt calls in launch() cause double execution and potential concurrency leak before task registration</violation>
</file>
<file name="src/features/skill-mcp-manager/manager.test.ts">
<violation number="1" location="src/features/skill-mcp-manager/manager.test.ts:47">
P2: HTTP headers test does not verify headers reach the transport; regression dropping headers would still pass</violation>
</file>
<file name="src/features/model-fallback/index.ts">
<violation number="1" location="src/features/model-fallback/index.ts:10">
P2: parseModelString treats empty provider/model segments as valid, allowing malformed model specs into the fallback chain</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| /missing required 'command' field/ | ||
| ) | ||
| // #when / #then | ||
| await expect(manager.getOrCreateClient(info, config)).rejects.toThrow( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: HTTP headers test does not verify headers reach the transport; regression dropping headers would still pass
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/skill-mcp-manager/manager.test.ts, line 47:
<comment>HTTP headers test does not verify headers reach the transport; regression dropping headers would still pass</comment>
<file context>
@@ -3,46 +3,326 @@ import { SkillMcpManager } from "./manager"
- /missing required 'command' field/
- )
+ // #when / #then
+ await expect(manager.getOrCreateClient(info, config)).rejects.toThrow(
+ /no valid connection configuration/
+ )
</file context>
- Fix P1: Remove duplicate session.prompt calls in manager.ts that caused double execution before task registration. Synced with origin/dev and properly integrated modelChain support. - Fix P2: parseModelString now rejects empty provider/model segments, preventing malformed model specs from entering the fallback chain. - Remove pr_790_body.txt which belonged to a different PR (HTTP transport).
- Restore constants.ts to use standard provider names (google/, openai/) instead of custom nextologies provider - Add modelChain and retryConfig to resolveCategoryConfig return type
|
All contributors have signed the CLA. Thank you! ✅ |
|
I have read the CLA Document and I hereby sign the CLA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
6 issues found across 41 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/tools/lsp/tools.ts">
<violation number="1">
P2: `lsp_symbols` can throw when `limit` ≤ 0 because it accesses `limited[0]` after slicing to an empty array.</violation>
</file>
<file name="src/tools/sisyphus-task/tools.ts">
<violation number="1">
P1: User-controlled resume session ID is joined into filesystem path without validation, allowing path traversal outside MESSAGE_STORAGE.</violation>
</file>
<file name="README.md">
<violation number="1">
P2: Beta release link still points to v3.0.0-beta.1 while install text references v3.0.0-beta.6, causing user-facing version mismatch.</violation>
</file>
<file name="src/tools/AGENTS.md">
<violation number="1">
P3: Documentation inconsistency: overview still says LSP has 11 tools while this row indicates 7, causing conflicting guidance about available LSP tools.</violation>
</file>
<file name="src/tools/sisyphus-task/constants.ts">
<violation number="1">
P2: Default category configs no longer define fallback chains, so modelChain has no alternatives and the new automatic fallback system doesn’t activate for category/background tasks.</violation>
</file>
<file name="src/tools/sisyphus-task/tools.test.ts">
<violation number="1">
P2: Test reimplements model source detection instead of exercising production modelInfo logic, so it can pass even if the real bug remains.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| import { resolveMultipleSkills } from "../../features/opencode-skill-loader/skill-content" | ||
| import { createBuiltinSkills } from "../../features/builtin-skills/skills" | ||
| import { getTaskToastManager } from "../../features/task-toast-manager" | ||
| import type { ModelFallbackInfo } from "../../features/task-toast-manager/types" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P1: User-controlled resume session ID is joined into filesystem path without validation, allowing path traversal outside MESSAGE_STORAGE.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/tools/sisyphus-task/tools.ts, line 251:
<comment>User-controlled resume session ID is joined into filesystem path without validation, allowing path traversal outside MESSAGE_STORAGE.</comment>
<file context>
@@ -239,9 +248,18 @@ Use \`background_output\` with task_id="${task.id}" to check progress.`
})
try {
+ const resumeMessageDir = getMessageDir(args.resume)
+ const resumeMessage = resumeMessageDir ? findNearestMessageWithFields(resumeMessageDir) : null
+ const resumeAgent = resumeMessage?.agent
</file context>
|
Replaced by #797 with clean commit history and proper conflict resolution. |
Summary\n\nAdds an automatic model fallback system that gracefully handles API failures by retrying with configured fallback models. Background tasks now properly support model fallback chains, ensuring reliability even for async operations.\n\n## Features\n\n### Automatic Fallback on Transient Errors\n- Rate limits and HTTP 429 errors\n- Service unavailable (HTTP 503, 502)\n- Capacity issues (overloaded, capacity, unavailable)\n- Network errors (timeout, ECONNREFUSED, ENOTFOUND)\n\nNon-model errors (authentication failures, invalid input) are NOT retried.\n\n### Background Task Support\n- Background tasks properly use modelChain for fallback\n- Ensures reliability for async operations\n\n### Validated Configuration\n- maxAttempts >= 1 enforced to prevent silent failures\n- Clean error handling: only aborts on configuration errors, continues on model errors\n\n## Implementation\n\nCore module src/features/model-fallback/ provides utilities for model parsing, chain building, error detection, and retry logic with validation. Integrated with sisyphus-task for seamless fallback handling.\n\n## Fixes from Review\n
✅ 36/36 model-fallback tests passing\n✅ Build successful\n✅ TypeScript types valid\n\nAll commits authored by stranger2904 with GitHub noreply email.\nClean PR without auto-generated descriptions.\n\nRelated: Fixes PR feat: automatic model fallback system with user notification #785 authorship issue.
Summary by cubic
Adds an automatic model fallback system for both sync and background tasks to keep prompts running through transient API failures. Also adds HTTP transport for MCP servers.
New Features
Bug Fixes
Written for commit aa1c1dc. Summary will update on new commits.