-
Notifications
You must be signed in to change notification settings - Fork 2k
[F:746] debug mode #777
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
[F:746] debug mode #777
Conversation
- Add explicit timeout error surfacing for sync polling loop (10 min max) - Add timeout error surfacing for resume polling loop (60s max) - Include session ID and partial response in timeout error message - Properly cleanup toast manager and session state on timeout This fixes the freeze issue where sisyphus_task would hang silently when an agent (like @sherlock) never reached idle status within the timeout period. Now users see a clear error message with debugging info.
|
All contributors have signed the CLA. Thank you! ✅ |
|
I have read the CLA Document and I hereby sign the CLA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 901047cb9d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
src/tools/sisyphus-task/tools.ts
Outdated
| const partialMsgs = ((partialResult as { data?: unknown }).data ?? partialResult) as Array<{ | ||
| info?: { role?: string; time?: { created?: number } } | ||
| parts?: Array<{ type?: string; text?: string }> | ||
| }> | ||
| const partialAssistant = partialMsgs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guard timeout partial fetch against session errors
In the timeout branch you coerce client.session.messages into an array without checking for an error result. If the session is expired or the API call fails, messages() can return an { error: ... } object (as handled later in the function), so partialMsgs is not an array and the subsequent .filter/.sort path will throw, preventing the tool from returning the timeout message. Consider checking partialResult.error or guarding with Array.isArray/try-catch (the same pattern appears in the resume timeout block above).
Useful? React with 👍 / 👎.
Add error and Array.isArray checks before processing partial messages
in both resume and task timeout handlers. Previously, if the session
was expired or the API call failed, client.session.messages() could
return an { error: ... } object instead of an array, causing .filter()
to throw and preventing the timeout message from being returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No issues found across 10 files
Confidence score: 5/5
- Automated review surfaced no issues in the provided summaries.
- No files require special attention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 issue found across 10 files
Confidence score: 4/5
- The duplicate partial-response parsing logic in
src/tools/sisyphus-task/tools.tsacross the timeout branches increases the chance the branches drift out of sync, potentially causing inconsistent parser fixes over time. - Pay close attention to
src/tools/sisyphus-task/tools.ts- consolidate the repeated timeout parsing logic to prevent divergence.
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/tools/sisyphus-task/tools.ts">
<violation number="1" location="src/tools/sisyphus-task/tools.ts:274">
P2: Duplicate partial-response parsing logic added in two timeout branches; should be consolidated to avoid divergence</violation>
</file>
Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Ask questions if you need clarification on any suggestion
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
|
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| - | - | Generic Password | e49802c | src/agents/sherlock.ts | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn here the best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
Extract duplicate code from resume and task timeout handlers into reusable helper functions: - extractPartialResponseText(): handles error/array guards and message parsing - formatPartialResponse(): consistent truncation and formatting Reduces code duplication and prevents future divergence between the two timeout branches.
jkoelker
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've run this agent locally and it did a pretty decent job of diagnosing the suspend issues on my laptop (i already know the underlying issue and it found it as well).
Add skill_mcp tool documentation to Sherlock's prompt for browser-based debugging scenarios. This enables: - Visual bug diagnosis via screenshots - Browser console error inspection - UI interaction debugging - Network request analysis Sherlock can now use Playwright MCP when debugging frontend/UI issues.
… boundary support - Add buildSherlockSection() to sisyphus-prompt-builder for dynamic docs - Add debugging step (step 5) to pre-delegation planning decision tree - Update Phase 2C to delegate to Sherlock after 2 failed fix attempts - Add Docker/Container debugging strategies (5 strategies) - Add System Boundary Analysis for ORM/DB/API bugs (timezone, type coercion) - Add Dependency Scan for 3rd iteration escalation - Add escalation path: iteration 1-2 -> 3 (scan) -> 4+ (boundaries) -> Oracle - Update triggers to include container debugging and system boundary bugs
- Change flow from 'Sherlock then Oracle' to 'Oracle FIRST then Sherlock' - Oracle provides system context (architecture, known gotchas, focus areas) - Sherlock receives Oracle's context to form better hypotheses - Prevents wasted iterations on wrong subsystems - Example: Oracle knows 'Prisma strips timezone' -> Sherlock targets ORM immediately - Update Phase 2C with Oracle consultation and Sherlock delegation formats - Update buildSherlockSection with Oracle -> Sherlock usage pattern
- Replace deprecated 'debugging-master' with 'sherlock' across orchestrator - Update agent selection table to include sherlock with Oracle context note - Update decision matrix: 'Debug complex issue' -> Oracle context -> Sherlock - Update delegation table: Hard debugging -> Oracle -> Sherlock flow - Change failure threshold from 3 to 2 consecutive failures - Add debugging flow diagram: Failure -> Oracle -> Sherlock -> Fix - Update delegation targets to show Oracle + Sherlock pattern
Summary
Add Sherlock - a hypothesis-driven debugging specialist that uses runtime evidence (not guesswork) to diagnose bugs. Implements the Oracle → Sherlock debugging flow where Oracle provides system context first, then Sherlock performs evidence-based debugging.
Key Features
skill_mcpDebugging Flow
Changes
src/agents/sherlock.tssrc/agents/sherlock.test.tssrc/agents/sisyphus.tssrc/agents/sisyphus-prompt-builder.tsbuildSherlockSection()with Oracle vs Sherlock comparisonsrc/agents/orchestrator-sisyphus.tsdebugging-masterwithsherlock, Oracle→Sherlock flowsrc/tools/sisyphus-task/tools.tssrc/agents/types.tssrc/agents/utils.tssrc/agents/index.tssrc/hooks/sisyphus-orchestrator/index.tsisGitRepo()guard to prevent git diff spamsrc/features/skill-mcp-manager/env-cleaner.ts"undefined"string valuessrc/auth/antigravity/storage.test.tsTotal: 13 files changed, +1182 lines, -18 lines
Commits
Testing
Sherlock Capabilities
Escalation Path
Docker Debugging Strategies
docker logs -f)System Boundary Bug Example (Timezone)
Related Issues
Closes #746