
TodoWrite consumes 13.7% of tool call budget with no measurable benefit #128

@greynewell


Problem

In our SWE-bench Verified evaluation, the agent spends 13.7% of all tool calls on TodoWrite (Claude Code's built-in task tracking tool), averaging 3.6 calls per task. This overhead provides no measurable benefit: tasks with heavy TodoWrite usage do not resolve at higher rates than tasks with little or none.

Data

  • 3.6 TodoWrite calls per task on average (across all MCP tasks)
  • 13.7% of total tool budget consumed by TodoWrite
  • No correlation between TodoWrite usage and task resolution
  • With a 30-iteration limit, each wasted call is ~3.3% of the total budget
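
For anyone who wants to reproduce these numbers on their own runs, the following is a minimal sketch of how the three metrics could be computed. It assumes a hypothetical `TaskRun` record with an ordered list of tool names and a `resolved` flag; the record shape, the field names, and the "heavy vs. light" split at the mean are illustrative assumptions, not the harness's actual data model or the method used for the figures above.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class TaskRun:
    # Hypothetical per-task record; field names are assumptions for illustration.
    tool_calls: list[str]  # tool names in call order, e.g. ["Read", "TodoWrite"]
    resolved: bool         # whether the task's patch resolved the issue


def resolve_rate(runs: list[TaskRun]) -> Optional[float]:
    """Fraction of runs that resolved, or None for an empty group."""
    if not runs:
        return None
    return sum(r.resolved for r in runs) / len(runs)


def todowrite_stats(runs: list[TaskRun], tool: str = "TodoWrite") -> dict:
    """Share of tool calls spent on `tool`, calls per task, and resolve rates."""
    counts = [r.tool_calls.count(tool) for r in runs]
    total_calls = sum(len(r.tool_calls) for r in runs)
    per_task = mean(counts) if counts else 0.0
    # Split at the mean: "heavy" tasks use the tool more than average.
    heavy = [r for r, c in zip(runs, counts) if c > per_task]
    light = [r for r, c in zip(runs, counts) if c <= per_task]
    return {
        "share_of_calls": sum(counts) / total_calls if total_calls else 0.0,
        "calls_per_task": per_task,
        "heavy_resolve_rate": resolve_rate(heavy),
        "light_resolve_rate": resolve_rate(light),
    }
```

Similar resolve rates in the heavy and light groups would be consistent with the "no correlation" observation, though a split at the mean is a coarse check and does not control for task difficulty.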

Root Cause

Claude Code's default behavior includes proactive task list management. When the agent receives a complex problem statement, it creates a todo list, updates it as it works, and marks items complete — all consuming tool call turns that could be spent on actual exploration and coding.

Impact

Recovering even half of these wasted calls (1.8 of the 3.6 per task) would give the agent ~2 additional exploration or editing turns per task. Over 500 tasks, that is roughly 900 tool calls redirected to useful work.
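
The arithmetic behind this estimate can be sketched as follows. The inputs are the figures reported in this issue (30-iteration limit, 3.6 TodoWrite calls per task, 500 tasks); the function name and the assumption that every task sees the average saving are illustrative simplifications.

```python
def recovered_budget(calls_per_task: float, fraction_recovered: float,
                     iteration_limit: int, num_tasks: int) -> dict:
    """Estimate tool calls freed up if a fraction of TodoWrite calls is avoided."""
    recovered_per_task = calls_per_task * fraction_recovered
    return {
        # Cost of a single call as a percentage of the per-task limit.
        "cost_per_call_pct": 100 / iteration_limit,
        # Calls freed per task, assuming every task matches the average.
        "recovered_per_task": recovered_per_task,
        # The same saving expressed as a percentage of the per-task limit.
        "recovered_pct_of_limit": 100 * recovered_per_task / iteration_limit,
        # Total calls freed across the whole evaluation.
        "recovered_total": recovered_per_task * num_tasks,
    }


if __name__ == "__main__":
    estimate = recovered_budget(calls_per_task=3.6, fraction_recovered=0.5,
                                iteration_limit=30, num_tasks=500)
    for name, value in estimate.items():
        print(f"{name}: {value:.1f}")
```

With the issue's inputs this gives about 3.3% of the limit per call, 1.8 calls (about 6% of the limit) recovered per task, and about 900 calls in total.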

Recommended Fixes

  1. Add instruction to MCP server: "Do not use TodoWrite or task management tools — focus all tool calls on exploration and code editing"
  2. Include in agent_prompt: Add a line discouraging TodoWrite usage (though this must be balanced against the prompt length findings from #123, "Long agent_prompt suppresses parallel tool calling in Claude Code harness")
  3. Investigate: Whether TodoWrite can be suppressed via Claude Code configuration rather than prompt instructions
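
As a starting point for option 3, here is a minimal sketch that generates a settings fragment denying the tool. It assumes Claude Code reads a `permissions.deny` list from its settings JSON and that `TodoWrite` is a valid entry there; both should be checked against the current Claude Code documentation before relying on this. It is also unverified whether a deny rule hides the tool from the model or only rejects the call after it is made, which matters because a rejected call may still consume an iteration.

```python
import json


def render_settings(denied_tools: list[str]) -> str:
    """Render a settings JSON fragment that denies the given tools.

    Assumption: the harness passes this to Claude Code as a settings file
    and Claude Code honors a `permissions.deny` list of tool names.
    """
    settings = {"permissions": {"deny": sorted(set(denied_tools))}}
    return json.dumps(settings, indent=2)


if __name__ == "__main__":
    # Prints a fragment that could be merged into the harness's settings file.
    print(render_settings(["TodoWrite"]))
```

If configuration-level suppression works, it avoids lengthening the agent_prompt, which sidesteps the trade-off noted in option 2.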

Labels

performance, swe-bench
