TodoWrite consumes 13.7% of tool call budget with no measurable benefit #128
Open
Description
Problem
In our SWE-bench Verified evaluation, the agent spends 13.7% of all tool calls on TodoWrite (Claude Code's built-in task tracking tool), averaging 3.6 calls per task. This overhead provides no measurable benefit: tasks with heavy TodoWrite usage do not resolve at higher rates.
Data
- 3.6 TodoWrite calls per task on average (across all MCP tasks)
- 13.7% of total tool budget consumed by TodoWrite
- No correlation between TodoWrite usage and task resolution
- With a 30-iteration limit, each wasted call is ~3.3% of the total budget
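For anyone who wants to reproduce these figures from their own run logs, here is a minimal sketch of the measurement. The `TaskRun` record and its field names are hypothetical (the harness's real log format is not shown in this issue), and the four sample records are toy data, not evaluation results.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskRun:
    # Hypothetical per-task record; field names are illustrative only.
    task_id: str
    tool_calls: list[str]  # tool names in call order
    resolved: bool


def todo_stats(runs: list[TaskRun], tool: str = "TodoWrite") -> dict[str, float]:
    """Summarize how much of the tool call budget one tool consumes."""
    per_task = [run.tool_calls.count(tool) for run in runs]
    total_calls = sum(len(run.tool_calls) for run in runs)
    return {
        "mean_calls_per_task": mean(per_task),
        "share_of_all_calls": sum(per_task) / total_calls,
    }


def resolve_rate_by_usage(
    runs: list[TaskRun], threshold: int, tool: str = "TodoWrite"
) -> tuple[float, float]:
    """Resolve rate for tasks with >= threshold calls vs. tasks with fewer."""

    def rate(flags: list[bool]) -> float:
        return sum(flags) / len(flags) if flags else float("nan")

    heavy = [r.resolved for r in runs if r.tool_calls.count(tool) >= threshold]
    light = [r.resolved for r in runs if r.tool_calls.count(tool) < threshold]
    return rate(heavy), rate(light)


# Toy records, not real evaluation data.
runs = [
    TaskRun("a", ["Read", "TodoWrite", "Edit", "TodoWrite"], True),
    TaskRun("b", ["Read", "Grep", "Edit", "Bash"], True),
    TaskRun("c", ["TodoWrite", "Read", "TodoWrite", "TodoWrite"], False),
    TaskRun("d", ["Read", "Edit", "Bash", "Bash"], False),
]
stats = todo_stats(runs)
heavy_rate, light_rate = resolve_rate_by_usage(runs, threshold=2)
```

Comparing resolve rates above and below a usage threshold is only a coarse check; a proper correlation test on the real per-task data would be a stronger basis for the "no correlation" claim.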
Root Cause
Claude Code's default behavior includes proactive task list management. When the agent receives a complex problem statement, it creates a todo list, updates it as it works, and marks items complete. Each of these steps consumes a tool call turn that could be spent on actual exploration and coding.
Impact
Recovering even half of these wasted calls would give the agent ~2 additional exploration or editing turns per task (half of 3.6 is 1.8). Over 500 tasks, that is roughly 900 additional turns.
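The arithmetic behind these estimates, as a quick sanity check. All inputs are the figures quoted in this issue; the implied average of about 26 tool calls per task is derived from them, not measured separately.

```python
ITERATION_LIMIT = 30        # per-task tool call limit
TODO_CALLS_PER_TASK = 3.6   # average TodoWrite calls per task
TODO_SHARE = 0.137          # TodoWrite share of all tool calls
NUM_TASKS = 500             # number of tasks

# Each call costs 1/30 of the budget, i.e. about 3.3%.
cost_per_call = 1 / ITERATION_LIMIT

# 3.6 calls at a 13.7% share implies about 26.3 tool calls per task on average.
implied_calls_per_task = TODO_CALLS_PER_TASK / TODO_SHARE

# Recovering half of the TodoWrite calls frees 1.8 turns per task,
# or roughly 900 turns across 500 tasks.
recovered_per_task = TODO_CALLS_PER_TASK / 2
recovered_total = recovered_per_task * NUM_TASKS

print(f"cost per call: {cost_per_call:.1%}")
print(f"implied calls per task: {implied_calls_per_task:.1f}")
print(f"recovered turns per task: {recovered_per_task:.1f}")
print(f"recovered turns overall: {recovered_total:.0f}")
```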
Recommended Fixes
- Add instruction to MCP server: "Do not use TodoWrite or task management tools — focus all tool calls on exploration and code editing"
- Include in agent_prompt: Add a line discouraging TodoWrite usage (though this must be balanced against the prompt-length findings in #123, "Long agent_prompt suppresses parallel tool calling in Claude Code harness")
- Investigate: Whether this can be suppressed via Claude Code configuration rather than instructions
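For the configuration route, one possible starting point is sketched below. It assumes the harness launches Claude Code through the `claude` CLI in print mode and that the installed version accepts a `--disallowedTools` flag; both are assumptions to confirm with `claude --help` before relying on this. `build_claude_command` is a hypothetical helper, not part of the existing harness.

```python
import shlex


def build_claude_command(prompt: str, disallowed_tools: list[str]) -> list[str]:
    """Build an argv list for a non-interactive Claude Code run.

    Assumption: the CLI accepts `--disallowedTools` followed by tool names.
    The flag name and list syntax should be checked against `claude --help`.
    """
    cmd = ["claude", "-p", prompt]
    if disallowed_tools:
        # Placed last so the tool names cannot be confused with the prompt.
        cmd += ["--disallowedTools", *disallowed_tools]
    return cmd


cmd = build_claude_command("fix the bug", ["TodoWrite"])
# Shell-quoted form for logging; pass `cmd` itself to subprocess.run.
printable = shlex.join(cmd)
```

If this works, it removes TodoWrite from the agent's tool set without adding any text to agent_prompt, which sidesteps the prompt-length concern.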
Labels
performance, swe-bench