feat: bash tool change attribution via filesystem snapshots #798
Open
1f6aba3 to 6c57ae6
Implement stat-tuple diffing to accurately detect which files are modified by bash tool executions across all agent presets. Previously, bash tool invocations had no file change tracking, so checkpoints couldn't attribute edits. Now, pre/post filesystem snapshots are compared to identify changed files, with git status fallback when snapshots are unavailable.

New module: bash_tool.rs with core types (Agent, ToolClass, HookEvent, BashCheckpointAction), path filtering via ignore crate (git-tracked + gitignore-filtered untracked files), snapshot/diff/caching, and the handle_bash_tool() orchestration function.

Integrated into all 6 presets (Claude, Gemini, ContinueCli, Droid, Amp, and OpenCode), each with appropriate event name mapping and ownership handling for their specific hook input structures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
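For readers unfamiliar with the approach, the pre/post comparison can be sketched roughly as follows. This is a minimal illustration, not the code in bash_tool.rs: `StatTuple`, `Snapshot`, `SnapshotDiff`, and `diff_snapshots` are hypothetical names, and the fields in the tuple (size, mtime, mode) are an assumption about what a stat tuple might hold.

```rust
use std::collections::BTreeMap;
use std::path::PathBuf;

/// Hypothetical stat tuple. The real StatEntry may carry different fields.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StatTuple {
    pub size: u64,
    pub mtime_ns: u128,
    pub mode: u32,
}

/// A snapshot maps each path to the stat tuple observed for it.
pub type Snapshot = BTreeMap<PathBuf, StatTuple>;

#[derive(Debug, Default, PartialEq, Eq)]
pub struct SnapshotDiff {
    pub created: Vec<PathBuf>,
    pub modified: Vec<PathBuf>,
    pub deleted: Vec<PathBuf>,
}

/// Compare a pre-execution snapshot with a post-execution snapshot.
pub fn diff_snapshots(pre: &Snapshot, post: &Snapshot) -> SnapshotDiff {
    let mut diff = SnapshotDiff::default();
    for (path, after) in post {
        match pre.get(path) {
            // Path did not exist before the command ran.
            None => diff.created.push(path.clone()),
            // Path existed, but at least one stat field differs.
            Some(before) if before != after => diff.modified.push(path.clone()),
            // Identical stat tuple: treated as unchanged.
            Some(_) => {}
        }
    }
    for path in pre.keys() {
        // Path existed before and is gone afterwards.
        if !post.contains_key(path) {
            diff.deleted.push(path.clone());
        }
    }
    diff
}
```

Because only metadata is compared, a sketch like this never reads file contents. It also reports an mtime-only change (for example from `touch`) as a modification.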
Performance benchmarks test snapshot/diff timing across synthetic repos (1K-500K files) with P95 latency targets. Conformance tests validate 38 PRD scenarios including file mutations, read-only operations, edge cases, hook semantics, tool classification, and gitignore filtering. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comprehensive test suite verifying AI provenance is tracked correctly across real bash command invocations: file creation (echo, printf, heredoc, touch, cp, tee, mkdir), modification (sed, append, truncate, chmod, mv), deletion (rm, rm -rf), git operations (checkout, stash pop, apply), multi-command pipelines (find -delete, for loops, grep|xargs), read-only commands (cat, ls, find, grep, wc, head, diff, git log/diff/status), symlinks, tar archives, batch operations (50-file create, 20-file modify), and edge cases (spaces in names, hidden files, failed commands, sequential tool cycles, mtime-only changes).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
6c57ae6 to 05a9350
The git_status_fallback function incorrectly used split(' ').next_back()
to extract file paths from git status --porcelain=v2 output. This breaks
for paths containing spaces since it returns only the last word. The
porcelain v2 format has a fixed field count before the path, so splitn
is the correct approach.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
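To make the difference concrete, here is a small sketch for ordinary changed entries, which in porcelain v2 look like `1 <XY> <sub> <mH> <mI> <mW> <hH> <hI> <path>` (eight fields before the path). The function names are hypothetical and the actual fix in git_status_fallback may be structured differently.

```rust
/// Extract the path from an ordinary ("1 ...") porcelain v2 entry.
/// Eight space-separated fields precede the path, so splitting into at most
/// nine pieces leaves the whole path, spaces included, in the last piece.
pub fn ordinary_entry_path(line: &str) -> Option<&str> {
    if !line.starts_with("1 ") {
        return None;
    }
    line.splitn(9, ' ').nth(8)
}

/// The previous approach, kept only for comparison: it returns the last
/// space-separated word, which truncates any path that contains a space.
pub fn last_word(line: &str) -> Option<&str> {
    line.split(' ').next_back()
}
```

For a line such as `1 .M N... 100644 100644 100644 1111111 2222222 docs/my notes.md` (object names shortened for illustration), `ordinary_entry_path` yields `docs/my notes.md` while `last_word` yields only `notes.md`.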
This was referenced Mar 25, 2026
…on preservation
The rename/copy handler in git_status_fallback previously only captured the new path. For attribution preservation through file renames (issue #150), both the original and new paths must be reported so the attribution system can transfer AI provenance from the old location to the new one.
Also adds tests for rename detection through both the stat-diff and git_status_fallback code paths, including directory renames.
Closes #150
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
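As an illustration of reporting both paths, a rename/copy entry in porcelain v2 has the form `2 <XY> <sub> <mH> <mI> <mW> <hH> <hI> <X><score> <path><sep><origPath>`, where the separator is a tab when `-z` is not used. The sketch below assumes that non-`-z` form. `rename_entry_paths` is a hypothetical helper, not the handler from this PR.

```rust
/// Parse a rename/copy ("2 ...") porcelain v2 entry.
/// Returns (original_path, new_path) so a caller can move attribution
/// from the old location to the new one.
pub fn rename_entry_paths(line: &str) -> Option<(&str, &str)> {
    if !line.starts_with("2 ") {
        return None;
    }
    // Nine fields precede the path pair, so the tenth piece holds
    // "<path>\t<origPath>" with any embedded spaces intact.
    let rest = line.splitn(10, ' ').nth(9)?;
    let (new_path, orig_path) = rest.split_once('\t')?;
    Some((orig_path, new_path))
}
```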
…tion
Address 13 of 26 identified coverage gaps spanning: case-folding path normalization, nested gitignore rules, snapshot persistence round-trips, stale snapshot cleanup, diff edge cases, git status fallback handling (merge conflicts, staged deletions, renames with spaces), StatEntry metadata validation, walker error resilience, and complex multi-category diff scenarios.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The WalkBuilder had git_ignore(false), causing it to descend into every directory including large ignored trees like node_modules/ and target/. This could trigger the 5-second timeout on repos with large ignored directories, producing incomplete snapshots and incorrect diffs.

Enable git_ignore(true) so the walker prunes ignored directories during traversal. Add a second pass over git-tracked files to ensure they are always included in the snapshot even when they match gitignore patterns (preserving Tier 1 guarantee).

Also make build_gitignore() recurse up to 10 levels deep instead of only 1 level, so nested .gitignore files at any reasonable depth are collected.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verify the walker correctly prunes ignored directories during traversal (preventing timeouts on repos with large ignored trees like node_modules/) and that build_gitignore() correctly discovers .gitignore files at depth 2+ (not just depth 0-1). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nded blocking
The recursive collect_gitignores function traversed all directories including large ignored trees like node_modules/, causing unbounded blocking before the walker's 5-second timeout even started. Add a 2-second deadline and skip well-known large ignored directory names (node_modules, target, vendor, etc.) during gitignore discovery.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
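A deadline-bounded discovery pass of this kind might look like the following std-only sketch. The skip list, the signature, and the `walk` helper are assumptions for illustration. The commit's own list ends in "etc.", so `SKIP_DIRS` here is not the real list, and `.git` is added on my own initiative.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

/// Illustrative skip list only. The real list is not shown in the commit message.
const SKIP_DIRS: &[&str] = &[".git", "node_modules", "target", "vendor"];

/// Collect .gitignore files under `root`, stopping at `max_depth` levels
/// or when `budget` has elapsed, whichever comes first.
pub fn collect_gitignores(root: &Path, max_depth: usize, budget: Duration) -> Vec<PathBuf> {
    let deadline = Instant::now() + budget;
    let mut found = Vec::new();
    walk(root, 0, max_depth, deadline, &mut found);
    found
}

fn walk(dir: &Path, depth: usize, max_depth: usize, deadline: Instant, found: &mut Vec<PathBuf>) {
    if depth > max_depth || Instant::now() >= deadline {
        return;
    }
    // Unreadable directories are skipped rather than treated as fatal.
    let Ok(entries) = fs::read_dir(dir) else { return };
    for entry in entries.flatten() {
        if Instant::now() >= deadline {
            return;
        }
        // DirEntry::file_type does not follow symlinks.
        let Ok(file_type) = entry.file_type() else { continue };
        let os_name = entry.file_name();
        // Non-UTF-8 names are skipped in this sketch.
        let Some(name) = os_name.to_str() else { continue };
        if file_type.is_file() && name == ".gitignore" {
            found.push(entry.path());
        } else if file_type.is_dir() && !SKIP_DIRS.contains(&name) {
            walk(&entry.path(), depth + 1, max_depth, deadline, found);
        }
    }
}
```

Calling `collect_gitignores(repo_root, 10, Duration::from_secs(2))` would mirror the depth limit and deadline mentioned in these commits. When the deadline expires the result may be partial, and the caller has to tolerate that.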
BashCheckpointAction::Fallback is only returned when git_status_fallback has already failed inside handle_bash_tool. Calling it again in each preset's match arm would fail identically. Return None directly instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Windows does not have POSIX `find` — its `find.exe` is a different command that doesn't understand `-name` or glob patterns, causing CI failure on windows-latest. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The filter_entry closure was checking all components of the absolute path for ".git", which would incorrectly exclude the repo root when the absolute path itself contained a .git component (e.g., worktrees at /home/user/.git/worktrees/my-worktree). Use entry.file_name() to check only the final component. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
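The difference between the two checks can be shown with plain `std::path`. This is only a sketch of the idea, with hypothetical function names. The real closure operates on the ignore crate's directory entries rather than bare paths.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// Old behaviour: true if ANY component of the path is ".git".
/// This wrongly matches a repo root that merely lives under a .git directory.
pub fn any_component_is_git(path: &Path) -> bool {
    path.components().any(|c| c.as_os_str() == OsStr::new(".git"))
}

/// New behaviour: true only if the FINAL component is ".git".
pub fn final_component_is_git(path: &Path) -> bool {
    path.file_name() == Some(OsStr::new(".git"))
}
```

With the worktree path from the commit message, `/home/user/.git/worktrees/my-worktree`, the first check returns true and would exclude the root, while the second returns false. Both return true for `/repo/.git`.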
All 6 agent presets now extract the per-invocation tool_use_id from hook
data (checking both "tool_use_id" and "toolUseId" keys) and pass it to
handle_bash_tool(). This ensures each bash tool invocation gets a unique
snapshot key ({session_id}:{tool_use_id}), preventing snapshot collisions
if concurrent tool invocations ever overlap within a session.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
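A rough sketch of the key derivation follows. It assumes the hook data has already been flattened into a string map, whereas the presets presumably read it from JSON. `extract_tool_use_id` and `snapshot_key` are hypothetical names.

```rust
use std::collections::HashMap;

/// Look up the per-invocation id under either spelling of the key.
/// A HashMap stands in for the parsed hook payload in this sketch.
pub fn extract_tool_use_id(hook_data: &HashMap<String, String>) -> Option<&str> {
    ["tool_use_id", "toolUseId"]
        .iter()
        .find_map(|key| hook_data.get(*key))
        .map(String::as_str)
}

/// Build the snapshot key in the {session_id}:{tool_use_id} form.
pub fn snapshot_key(session_id: &str, tool_use_id: &str) -> String {
    format!("{session_id}:{tool_use_id}")
}
```

Two invocations in the same session then map to distinct keys, for example `sess:toolu_1` and `sess:toolu_2`, so one invocation's pre-snapshot cannot be overwritten by another's.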
On Windows in worktree mode, both the worktree and the outside file reside under the same temp directory. UNC-path canonicalization causes strip_prefix to produce a different error message than expected. The command still correctly errors out; we just skip the message check on Windows. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use entry.file_type() instead of abs_path.is_dir() to avoid following symlinks when filtering directory entries. A symlink pointing to a directory was incorrectly skipped because Path::is_dir() follows symlinks, while the ignore crate's walker correctly yields symlinks as separate entries without descending into them. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
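The underlying distinction can be reproduced with the standard library alone. In the sketch below, `fs::symlink_metadata` plays the role of the walker entry's `file_type()` because neither follows symlinks. The helper names are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// True only for an actual directory. A symlink that points at a directory
/// returns false here because symlink_metadata does not follow the link.
pub fn is_real_dir(path: &Path) -> io::Result<bool> {
    Ok(fs::symlink_metadata(path)?.file_type().is_dir())
}

/// True when the path itself is a symlink, whatever it points at.
pub fn is_symlink(path: &Path) -> io::Result<bool> {
    Ok(fs::symlink_metadata(path)?.file_type().is_symlink())
}
```

For a symlink `link` pointing at a directory, `Path::is_dir()` returns true because it resolves the link, whereas `is_real_dir` returns false and `is_symlink` returns true. That is why a filter built on `Path::is_dir()` misclassified such entries as directories.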
Summary
- .gitignore rules to avoid noise from build artifacts.
- bash_tool_conformance.rs: 38 tests validating PRD conformance for file mutations, read-only ops, hook semantics, tool classification, and gitignore filtering
- bash_tool_benchmark.rs: 19 tests covering performance targets (snapshot timing, large repo scaling, memory bounds, concurrent session isolation)
- bash_tool_provenance.rs: 50 tests exercising real bash commands (file creation, modification, deletion, build tools, git operations, pipelines, symlinks, batch ops, archives, edge cases) to verify provenance tracking accuracy

Closes #150
Closes #756
Test plan
- `cargo test bash_tool`)
- `cargo clippy -D warnings`: zero warnings
- `cargo fmt --check`: no formatting issues

🤖 Generated with Claude Code