feat: working clean/smudge round-trip for a Rust subset #26

Draft
bdelanghe wants to merge 3 commits into main from claude/git-ast-architecture-tdhbqs

@bdelanghe

Collaborator

What this does

Turns the filter skeleton into a real, git-invoked AST round-trip. With the filter installed, git add stores Rust in canonical form and git checkout returns it — so reformatting never enters history.

git add      →  clean:  your .rs ──Tree-sitter parse──▶ tree ──printer──▶ canonical bytes (stored)
git checkout →  smudge: stored bytes ──▶ working file   (identity; already canonical)

Before this PR the filter was a no-op that didn't even speak git's protocol; the only "round-trip" was a string-prefix marker in a unit test.
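
The round-trip contract above can be sketched with a toy canonicalizer. This is only an illustration: `clean` here just strips trailing whitespace, which is a hypothetical stand-in for the Tree-sitter printer and is not safe for real Rust (it would alter multi-line string literals). What it shows are the invariants the filter pair is meant to hold: formatting variants collapse to the same stored bytes, `clean` is idempotent, and `smudge` is identity.

```rust
/// Toy canonicalizer (assumption, not the PR's printer): strip trailing
/// whitespace from each line and terminate every line with `\n`.
fn clean(working: &str) -> String {
    working
        .lines()
        .map(|line| format!("{}\n", line.trim_end()))
        .collect()
}

/// Stored bytes are already canonical, so checkout hands them back untouched.
fn smudge(stored: &str) -> String {
    stored.to_owned()
}
```

Under these assumptions, two working files that differ only in trailing whitespace produce identical stored content, so git has nothing to diff.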

Changes

  • printer.rs — the AST-native core. Parses Rust with Tree-sitter and re-emits canonical source by walking the tree. Fail-closed: syntax errors reject the commit; any unsupported node kind returns an error rather than silently corrupting code.
  • pktline.rs — codec for git's long-running filter pkt-line framing.
  • filters.rs — implements the real filter-process protocol (handshake → capabilities → per-blob). clean canonicalizes *.rs, smudge is identity, non-Rust passes through.
  • setup.rs / git-ast setup — one command to register the filter + .gitattributes in a repo (idempotent).
  • examples/demo.sh: end-to-end proof that a pure reformat produces no diff, while a real change (a + b → a - b) shows a clean one-line diff.
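
For reviewers unfamiliar with the long-running filter protocol, here is a minimal sketch of the pkt-line framing and the opening handshake as described in git's gitattributes documentation. It is not the code in pktline.rs or filters.rs; every name below (`Packet`, `write_data`, `read_list`, `handshake`, and so on) is hypothetical, and the per-blob command loop is left out.

```rust
use std::io::{self, Read, Write};

/// Largest payload one pkt-line can carry: 65520 bytes total minus the 4-byte header.
const MAX_PAYLOAD: usize = 65516;

/// One decoded packet: a payload, or the `0000` flush marker that ends a list.
#[derive(Debug, PartialEq)]
enum Packet {
    Data(Vec<u8>),
    Flush,
}

fn invalid(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg.to_owned())
}

/// Frame `payload` as pkt-lines: four hex digits of total length, then the bytes.
/// An empty payload writes nothing.
fn write_data<W: Write>(out: &mut W, payload: &[u8]) -> io::Result<()> {
    for chunk in payload.chunks(MAX_PAYLOAD) {
        write!(out, "{:04x}", chunk.len() + 4)?;
        out.write_all(chunk)?;
    }
    Ok(())
}

fn write_flush<W: Write>(out: &mut W) -> io::Result<()> {
    out.write_all(b"0000")
}

/// Read one packet; lengths 1 to 3 are rejected in this sketch.
fn read_packet<R: Read>(input: &mut R) -> io::Result<Packet> {
    let mut header = [0u8; 4];
    input.read_exact(&mut header)?;
    let text = std::str::from_utf8(&header).map_err(|_| invalid("non-ASCII length"))?;
    let len = usize::from_str_radix(text, 16).map_err(|_| invalid("non-hex length"))?;
    match len {
        0 => Ok(Packet::Flush),
        1..=3 => Err(invalid("length shorter than its own header")),
        _ => {
            let mut data = vec![0u8; len - 4];
            input.read_exact(&mut data)?;
            Ok(Packet::Data(data))
        }
    }
}

/// Collect text lines up to the next flush packet, dropping the trailing newline.
fn read_list<R: Read>(input: &mut R) -> io::Result<Vec<String>> {
    let mut lines = Vec::new();
    loop {
        match read_packet(input)? {
            Packet::Flush => return Ok(lines),
            Packet::Data(d) => lines.push(String::from_utf8_lossy(&d).trim_end().to_owned()),
        }
    }
}

/// Welcome exchange, then capability negotiation; per-blob commands would follow.
fn handshake<R: Read, W: Write>(input: &mut R, out: &mut W) -> io::Result<()> {
    let hello = read_list(input)?;
    let greeted = hello.first().map(String::as_str) == Some("git-filter-client");
    if !greeted || !hello.iter().any(|l| l == "version=2") {
        return Err(invalid("unexpected filter handshake"));
    }
    write_data(out, b"git-filter-server\n")?;
    write_data(out, b"version=2\n")?;
    write_flush(out)?;

    // Echo back only the capabilities this filter implements.
    let offered = read_list(input)?;
    for cap in ["clean", "smudge"] {
        let line = format!("capability={cap}");
        if offered.contains(&line) {
            write_data(out, format!("{line}\n").as_bytes())?;
        }
    }
    write_flush(out)
}
```

Capabilities git offers that the filter does not recognise are simply not echoed back, which is how the sketch stays compatible with newer git versions.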

Verified

18 tests pass; cargo fmt --check and cargo clippy --all-targets -- -D warnings are clean. Demonstrated through real git add/git checkout/git diff in a throwaway repo (see examples/demo.sh).

Scope — deliberately honest

  • One language, a documented subset of it (functions, params, blocks, let, binary/call/macro expressions, literals, comments). Widening coverage is additive — one arm per node kind.
  • Diff and merge drivers remain placeholders. Making those structural depends on the hardest open problem — stable AST node identity across versions — which this PR does not address. Canonical formatting removes formatting churn from history; it does not track a node through a move or rename. See docs/planning/scope.md.
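
To make "fail-closed" and "one arm per node kind" concrete, here is a minimal sketch over a toy node type. The `Node` struct and `print_canonical` are hypothetical and do not mirror Tree-sitter's API or printer.rs; in particular, storing the operator in `text` is a simplification made for this example. The kind strings are borrowed from the Tree-sitter Rust grammar.

```rust
/// Toy syntax node; a hypothetical shape, not Tree-sitter's `Node`.
struct Node {
    kind: &'static str,
    text: &'static str,
    children: Vec<Node>,
}

/// Re-emit canonical source, returning an error for anything not explicitly handled.
fn print_canonical(node: &Node) -> Result<String, String> {
    match node.kind {
        // Leaves are emitted verbatim.
        "identifier" | "integer_literal" => Ok(node.text.to_owned()),
        // In this toy shape `text` holds the operator and the children are the operands.
        "binary_expression" => match node.children.as_slice() {
            [left, right] => Ok(format!(
                "{} {} {}",
                print_canonical(left)?,
                node.text,
                print_canonical(right)?
            )),
            _ => Err("binary_expression needs exactly two operands".to_owned()),
        },
        // Parse errors reject the input outright.
        "ERROR" => Err("syntax error".to_owned()),
        // Widening coverage means adding an arm above this catch-all.
        other => Err(format!("unsupported node kind: {other}")),
    }
}
```

The catch-all arm is the point: an unrecognised kind stops the commit instead of being printed on a best-effort basis.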

🤖 Generated with Claude Code



claude added 3 commits June 25, 2026 20:00
Turn the filter skeleton into a real, git-invoked AST round-trip:

- printer: parse Rust with Tree-sitter and re-emit canonical source by
  walking the tree. Fail-closed — syntax errors and unsupported node
  kinds error rather than silently corrupting code.
- pktline: implement Git's long-running filter pkt-line codec.
- filters: speak the real `filter-process` protocol; `clean`
  canonicalizes `*.rs`, `smudge` is identity, non-Rust passes through.
- setup: `git-ast setup` registers the filter + .gitattributes in a repo.
- examples/demo.sh: end-to-end proof that reformatting produces no diff
  while a real change shows a clean one.

Scope stays honest: one language, a documented subset, fail-closed.
Diff/merge drivers remain placeholders pending stable node identity,
which this does not address.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NCp6PSoWKvsbFWyav6CeeC
…sport)

Add a README section framing the hard, deferred problem precisely: node
identity is heuristic not exact, computed by tree-matching rather than
stored, helped by content-addressed subtree hashing, and git notes are a
transport for attribution across rewrites — not the identity mechanism.
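
As an illustration of content-addressed subtree hashing (not taken from the README): a minimal sketch assuming a toy `Node` type and the standard library's `DefaultHasher`. A real implementation would presumably use a stable cryptographic hash; all names here are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy syntax node; a hypothetical shape for illustration only.
struct Node {
    kind: &'static str,
    text: &'static str,
    children: Vec<Node>,
}

fn leaf(kind: &'static str, text: &'static str) -> Node {
    Node { kind, text, children: Vec::new() }
}

/// Hash a node from its own kind and text plus the hashes of its children,
/// so the result depends only on the subtree's content, not on its position.
fn subtree_hash(node: &Node) -> u64 {
    let mut hasher = DefaultHasher::new();
    node.kind.hash(&mut hasher);
    node.text.hash(&mut hasher);
    node.children.len().hash(&mut hasher);
    for child in &node.children {
        subtree_hash(child).hash(&mut hasher);
    }
    hasher.finish()
}
```

Equal hashes give tree-matching a cheap way to spot an unchanged subtree that moved, but they only help the heuristic: an edited node hashes differently, so the hash alone cannot serve as its identity.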

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NCp6PSoWKvsbFWyav6CeeC
frond exercises the same parse -> regenerate -> compare primitive for
JavaScript/TypeScript (SWC on Deno) that git-ast does for Rust
(Tree-sitter). Cross-link them: frond validates round-trip fidelity, the
prerequisite git-ast's canonical printer depends on.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NCp6PSoWKvsbFWyav6CeeC