15 tasks stuck in exploration loops: 30 turns of reading without editing #129
Open
Description
Problem
In our SWE-bench-verified evaluation, 15 tasks consumed all 30 iterations on exploration (reading files, calling symbol_context, grepping) without ever making a single edit. The agent gets stuck in analysis paralysis.
Data
- 15 tasks with 30 iterations and 0 file edits
- These tasks have tool call patterns like: symbol_context → Read → Grep → Read → symbol_context → Read → ... (repeating for 30 turns)
- The agent keeps gathering more context but never transitions to the "fix" phase
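To make the "30 iterations, 0 edits" criterion reproducible, here is a minimal sketch of how such runs could be flagged from tool call logs. It assumes one tool call per iteration and that each run is available as a list of tool names. The edit tool names (`Edit`, `Write`), the task IDs, and the sample data are hypothetical and not taken from our harness.

```python
from typing import Dict, List

# Hypothetical names for tools that modify files; replace with the real ones.
EDIT_TOOLS = {"Edit", "Write"}
MAX_ITERATIONS = 30  # iteration limit from the evaluation described above


def is_exploration_loop(tool_calls: List[str], max_iterations: int = MAX_ITERATIONS) -> bool:
    """Return True if the run used its full budget without a single edit call."""
    used_full_budget = len(tool_calls) >= max_iterations
    made_edit = any(name in EDIT_TOOLS for name in tool_calls)
    return used_full_budget and not made_edit


def find_stuck_tasks(trajectories: Dict[str, List[str]]) -> List[str]:
    """Return the sorted IDs of tasks whose runs match the stuck pattern."""
    return sorted(
        task_id for task_id, calls in trajectories.items() if is_exploration_loop(calls)
    )


# Made-up trajectories for illustration only.
trajectories = {
    "task-a": ["symbol_context", "Read", "Grep"] * 10,  # 30 calls, no edit
    "task-b": ["Read", "Grep", "Edit"] + ["Read"] * 27,  # 30 calls, one edit
    "task-c": ["symbol_context", "Read", "Edit"],  # finished early
}
stuck = find_stuck_tasks(trajectories)
print(stuck)
```

Only `task-a` matches here: it exhausts the budget and never calls an edit tool.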
Root Cause
Two contributing factors:
- No iteration budget awareness: The agent doesn't know it has a 30-iteration limit and doesn't pace itself
- Rich exploration tools encourage over-exploration: When symbol_context returns detailed context with callers, callees, and related symbols, the agent follows every lead instead of focusing
Impact
These 15 tasks are guaranteed losses. If even half of them (7-8 tasks) transitioned to editing, we estimate +3-4 additional resolves.
Recommended Fixes
- Add budget awareness to instructions: "You have a limited number of turns. Spend no more than 10 turns exploring before making your first edit."
- Exploration cap hint: After the agent has made 8-10 symbol_context/Read calls without editing, include a hint in the next response: "Consider making your edit now. You've gathered significant context."
- Progressive brevity: Make tool responses progressively shorter as iteration count increases (if iteration count is available to the server)
- Structured workflow in instructions: "Phase 1 (turns 1-5): Understand the problem. Phase 2 (turns 6-20): Implement the fix. Phase 3 (turns 21-30): Test and refine."
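The exploration cap hint and progressive brevity could be combined in a small server-side tracker. The following is only a sketch under stated assumptions: the server sees every tool call, one call counts as one iteration, and responses are plain strings. `ExplorationTracker`, its fields, the character limits, and the edit tool names are all hypothetical. The threshold of 8 is taken from the low end of the 8-10 range suggested above.

```python
from dataclasses import dataclass

EXPLORATION_TOOLS = {"symbol_context", "Read", "Grep"}
EDIT_TOOLS = {"Edit", "Write"}  # hypothetical edit tool names
HINT = "Consider making your edit now. You've gathered significant context."


@dataclass
class ExplorationTracker:
    hint_threshold: int = 8   # exploration calls without an edit before hinting
    max_iterations: int = 30  # iteration limit from the evaluation
    full_chars: int = 4000    # assumed response size limit at the start
    min_chars: int = 500      # assumed response size limit at the last turn
    iteration: int = 0
    reads_since_edit: int = 0

    def record(self, tool_name: str) -> None:
        """Count one iteration and update the exploration streak."""
        self.iteration += 1
        if tool_name in EDIT_TOOLS:
            self.reads_since_edit = 0
        elif tool_name in EXPLORATION_TOOLS:
            self.reads_since_edit += 1

    def char_budget(self) -> int:
        """Shrink linearly from full_chars toward min_chars as turns are used."""
        progress = min(self.iteration, self.max_iterations) / self.max_iterations
        return int(self.full_chars - (self.full_chars - self.min_chars) * progress)

    def decorate(self, response: str) -> str:
        """Truncate the response to the current budget and append the hint if due."""
        body = response[: self.char_budget()]
        if self.reads_since_edit >= self.hint_threshold:
            body += "\n\n" + HINT
        return body


tracker = ExplorationTracker()
for name in ["symbol_context", "Read"] * 4:  # eight exploration calls in a row
    tracker.record(name)
out = tracker.decorate("x" * 10_000)
print(out[-len(HINT):])
```

After eight exploration calls the response is truncated and ends with the hint. Recording an edit call resets the streak, so the hint stops appearing until the agent starts another long run of reads.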
Labels
performance, swe-bench