symbol_context followed by Read 64% of the time — output may lack sufficient source code #130

@greynewell

Problem

In our SWE-bench-verified evaluation, 64% of symbol_context calls are immediately followed by a Read tool call, suggesting the agent needs to see actual source code that symbol_context doesn't provide (or doesn't provide enough of).

Data

  • 64% of symbol_context calls are followed by Read on the same or related file
  • This pattern wastes a turn: the agent calls symbol_context to find what it needs, then calls Read to see the actual code
  • The full render mode does include source code, but one of the following may be happening:
    • The agent is using brief mode (which omits source code)
    • The source code snippet is too narrow (just the definition, not surrounding context)
    • The agent needs to see a broader section of the file
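For re-measuring the follow-up rate, a minimal sketch is below. It assumes each run's tool calls are available as an ordered list of dicts with a "tool" key; that trace shape and the function name are hypothetical, not the format the evaluation harness actually emits. It also counts any immediately following Read, without checking that the Read targets the same or a related file.

```python
from typing import Mapping, Optional, Sequence


def follow_up_read_rate(calls: Sequence[Mapping[str, str]]) -> float:
    """Fraction of symbol_context calls whose very next call is a Read.

    `calls` is an ordered trace of tool calls (hypothetical shape: each
    entry has a "tool" key). Returns 0.0 when there are no symbol_context calls.
    """
    total = 0
    followed = 0
    for index, current in enumerate(calls):
        if current["tool"] != "symbol_context":
            continue
        total += 1
        # Look at the next call, if there is one.
        nxt: Optional[Mapping[str, str]] = (
            calls[index + 1] if index + 1 < len(calls) else None
        )
        if nxt is not None and nxt["tool"] == "Read":
            followed += 1
    return followed / total if total else 0.0


# Illustrative trace: three symbol_context calls, one followed by Read.
trace = [
    {"tool": "symbol_context"},
    {"tool": "Read"},
    {"tool": "symbol_context"},
    {"tool": "Edit"},
    {"tool": "symbol_context"},
]
print(follow_up_read_rate(trace))
```

Splitting the result by render mode (brief vs. full) would help tell apart the three explanations listed above.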

Context

PR #122 partially addressed this by adding source code to the full render and creating a brief mode. However, the 64% follow-up-Read rate was measured on a run before those changes. A new evaluation is needed to measure whether the changes help.

Recommended Investigation

  1. Re-evaluate after the changes in PR #122 (feat: Batch symbol_context, inject overview into instructions, remove overview tool) are deployed: does the Read-follow-up rate decrease?
  2. Expand source context: Include more surrounding lines (e.g., ±20 lines around the definition rather than just the definition)
  3. Include file path prominently: Make sure the agent knows exactly where to look if it does need to Read more
  4. Suggest related regions: If the symbol is a method, include the class definition header and any closely-related methods

Impact

If symbol_context returned enough code to avoid the follow-up Read 50% of the time, that would save ~0.3 turns per task across all tasks, a meaningful efficiency gain.
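The arithmetic behind the ~0.3 figure can be sketched as follows. The 64% and 50% values come from this issue; the number of symbol_context calls per task is not stated here, so the value of 1.0 below is an assumption chosen because it reproduces the estimate.

```python
def turns_saved_per_task(
    follow_up_rate: float, avoided_fraction: float, calls_per_task: float
) -> float:
    """Expected Read turns saved per task.

    follow_up_rate:   share of symbol_context calls followed by Read (0.64 here)
    avoided_fraction: share of those follow-up Reads that become unnecessary
    calls_per_task:   average symbol_context calls per task (ASSUMED, not measured)
    """
    return follow_up_rate * avoided_fraction * calls_per_task


# calls_per_task=1.0 is an assumption; substitute the measured average.
print(round(turns_saved_per_task(0.64, 0.5, 1.0), 2))  # prints 0.32
```

If tasks average more than one symbol_context call, the saving scales linearly with that count.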

Labels

enhancement, swe-bench
