grep: implement line-level --invert-match for text output #8

@tony

Description

Context

The agentgrep grep -v PATTERN text-output path is currently a
silent no-op: argparse accepts the flag, the command runs, and
the user gets back the matching records (not the inverted set).
A comment acknowledging this limitation lives in
src/agentgrep/cli/render.py (the print_grep_results early
branch). Only grep -c -v and grep -L -v honor inversion today,
because they reduce to a "did anything match?" question that
the eager record list can answer.

In the upcoming commit that lands with this issue, -v outside
-c/-L will be hard-rejected with exit 2 so we no longer
silently return wrong output. This issue tracks the real
implementation.
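For reference, the interim hard-reject could look roughly like the sketch below. It relies on argparse's parser.error(), which prints a usage message and exits with status 2. The parser layout, the long option names --count and --files-without-match, and the helper names build_parser and parse_grep_args are assumptions for illustration, not the actual agentgrep CLI code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real agentgrep grep parser.
    parser = argparse.ArgumentParser(prog="agentgrep grep")
    parser.add_argument("pattern")
    parser.add_argument("-v", "--invert-match", action="store_true")
    parser.add_argument("-c", "--count", action="store_true")
    parser.add_argument("-L", "--files-without-match", action="store_true")
    return parser


def parse_grep_args(argv: list[str]) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Refuse -v unless it is combined with -c or -L, the only modes
    # that honor inversion today. parser.error() exits with status 2.
    if args.invert_match and not (args.count or args.files_without_match):
        parser.error("-v is only supported together with -c or -L for now")
    return args
```

With this shape, `parse_grep_args(["-c", "-v", "foo"])` returns normally, while `parse_grep_args(["-v", "foo"])` terminates the process with exit status 2 instead of returning wrong output.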

Implementation paths

v2 — consumer-layer enumeration. Add a helper that calls
discover_sources() + re-reads each source's records, then
filters at the CLI consumer to lines that don't match the
pattern. The engine surface stays unchanged; the CLI does the
inversion. Cost: roughly 2x I/O for -v queries because every
source is read regardless of match.
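A minimal sketch of the v2 shape, under stated assumptions: the Record type, the read_records callable, and the in-memory FAKE_STORE are hypothetical stand-ins, and the list of sources is passed in directly rather than obtained from discover_sources(), whose real signature is not shown in this issue.

```python
import re
from collections.abc import Callable, Iterable, Iterator
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical record shape; the real agentgrep record type may differ.
    source: str
    text: str


def iter_inverted_records(
    pattern: str,
    sources: Iterable[str],
    read_records: Callable[[str], Iterable[Record]],
) -> Iterator[Record]:
    """Re-read every source and yield only records that do NOT match."""
    regex = re.compile(pattern)
    for source in sources:
        # Every source is read in full, whether or not it contains a match.
        for record in read_records(source):
            if regex.search(record.text) is None:
                yield record


# Tiny in-memory stand-in for the on-disk stores.
FAKE_STORE = {
    "a.jsonl": ["alpha error", "beta ok"],
    "b.jsonl": ["gamma error"],
}


def read_fake(source: str) -> list[Record]:
    return [Record(source, text) for text in FAKE_STORE[source]]


inverted = list(iter_inverted_records("error", FAKE_STORE, read_fake))
```

The inversion lives entirely in the consumer here, which is why the engine surface can stay unchanged; the price is the extra read pass over each source.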

v3 — engine include_non_matching mode. Add an opt-in flag
to iter_search_events so it emits RecordEmitted for every
record (matching or not) with a matched: bool field, and the
CLI consumer filters. Pros: single pass over the stores; engine
becomes the single source of truth for the universal candidate
set. Cons: touches the pydantic event union and every consumer
that reads it.
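The v3 idea can be sketched as follows. This is an illustration only: the dataclass stands in for the pydantic event model, the iter_search_events signature shown here is invented for the example, and select_for_output is a hypothetical consumer helper.

```python
import re
from collections.abc import Iterable, Iterator
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordEmitted:
    # Stand-in for the pydantic event; only the fields needed here.
    text: str
    matched: bool = True


def iter_search_events(
    records: Iterable[str],
    pattern: str,
    include_non_matching: bool = False,
) -> Iterator[RecordEmitted]:
    """Single pass over the records; opt-in emission of non-matches."""
    regex = re.compile(pattern)
    for text in records:
        matched = regex.search(text) is not None
        if matched or include_non_matching:
            yield RecordEmitted(text=text, matched=matched)


def select_for_output(events: Iterable[RecordEmitted], invert: bool) -> list[str]:
    # The CLI consumer keeps matches normally, non-matches under -v.
    return [event.text for event in events if event.matched != invert]
```

Defaulting include_non_matching to False keeps existing consumers on the current behavior; only a consumer that opts in has to look at the matched field.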

Decisions to make

  • Should -v invert at the line level or at the record level?
    Line-level inversion gives rg parity: it also emits the
    non-matching lines of records that contain at least one
    match. Record-level inversion emits only whole records that
    don't match. The rest of agentgrep treats records as atomic
    units, so record-level may be the more consistent default.
  • Should -v change the streaming live-output mode, or stay
    eager-only? rg's -v streams; per-record inversion can stream;
    per-line inversion needs the full record buffer either way.
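To make the first decision concrete, the two semantics can be compared on the same input. This sketch assumes records are plain multi-line strings, which may not match agentgrep's real record type; both function names are hypothetical.

```python
import re


def invert_by_line(records: list[str], pattern: str) -> list[str]:
    """Line-level: emit every line that does not match, from any record."""
    regex = re.compile(pattern)
    return [
        line
        for record in records
        for line in record.splitlines()
        if regex.search(line) is None
    ]


def invert_by_record(records: list[str], pattern: str) -> list[str]:
    """Record-level: emit only whole records with no match anywhere."""
    regex = re.compile(pattern)
    return [record for record in records if regex.search(record) is None]


records = ["error: disk full\nretrying", "all good\nstill good"]
by_line = invert_by_line(records, "error")
by_record = invert_by_record(records, "error")
```

Here by_line contains "retrying" even though its record matched, while by_record drops that record entirely and keeps only the second one.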

Related

  • The interim "refuse outside -c/-L" commit:
    (PR / commit hash to back-fill)
  • TODO comment at src/agentgrep/cli/render.py in
    print_grep_results documenting the v1 simplification.
