fix(pool): prune stale worktree registrations before add in get#1

Closed
e-jung wants to merge 2 commits into main from fix/get-prunes-stale-worktree-registrations

@e-jung e-jung commented Jun 20, 2026

Owner

Intent

The developer tasked an autonomous crewmate agent with investigating whether the treehouse get command is resilient to stale (prunable) worktree registrations on current main (v1.7.0). The investigation built on real-world evidence: on v1.4.0, a stale git worktree entry caused treehouse get to fail hard until a human manually ran git worktree prune. The initial scout task had three parts: read the code path in cmd/get.go and internal/pool/; reproduce the bug deterministically in throwaway git repos under /tmp, never touching the live fleet pool; and file a GitHub issue on kunchenguid/treehouse if the bug persisted, explicitly without opening a PR or pushing. The task was then promoted to ship: reset to clean main, branch as fix/get-prunes-stale-worktree-registrations, implement the prune-before-add fix (Option A from the report) plus an e2e test, create the fork e-jung/treehouse, run the no-mistakes pipeline, and open a PR referencing issue kunchenguid#30. Stated constraints: stay inside the worktree, no push and no PR during the scout phase, use gh-axi for GitHub operations, and report status by appending one line to a status file.

What Changed

  • pool.Acquire now calls git.PruneWorktrees before adding a worktree, so treehouse get recovers instead of failing hard with "missing but already registered worktree" when a previous worktree's directory was removed out-of-band.
  • Adds the git.PruneWorktrees helper and an e2e test (TestGetRecoversFromStaleWorktreeRegistration) that plants a prunable registration and asserts get recreates the worktree and exits 0.
  • Documents the self-healing behavior in README and AGENTS.md.
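
The prune-before-add ordering described above can be sketched roughly as follows. This is not the code from the PR: the `runGit` hook, the `recorder` type, the function signatures, and the exact error wording are all assumptions made so the ordering can be shown without a real repository. Only the names `PruneWorktrees` and `AddWorktree`, and the fact that prune runs before add, come from the description.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// runGit is a hypothetical hook that executes git with the given arguments.
// A real helper would presumably shell out via os/exec; it is injected here
// so the call order can be observed without touching a repository.
type runGit func(args ...string) error

// recorder is an illustrative fake that logs each git invocation and can
// simulate a failure for one specific command.
type recorder struct {
	calls  []string
	failOn string
}

func (r *recorder) run(args ...string) error {
	cmd := strings.Join(args, " ")
	r.calls = append(r.calls, cmd)
	if r.failOn != "" && cmd == r.failOn {
		return errors.New("simulated git failure")
	}
	return nil
}

// pruneWorktrees stands in for git.PruneWorktrees (signature assumed).
// `git worktree prune` only drops registrations whose directories are gone.
func pruneWorktrees(run runGit) error {
	if err := run("worktree", "prune"); err != nil {
		// Error wording is an assumption, not taken from the PR.
		return fmt.Errorf("failed to prune worktrees: %w", err)
	}
	return nil
}

// addWorktree prunes stale registrations first, then adds the worktree,
// so a leftover registration for the same path cannot block the add.
func addWorktree(run runGit, path, ref string) error {
	if err := pruneWorktrees(run); err != nil {
		return err
	}
	if err := run("worktree", "add", "--detach", path, ref); err != nil {
		return fmt.Errorf("failed to create worktree: %w", err)
	}
	return nil
}

func main() {
	r := &recorder{}
	if err := addWorktree(r.run, "/tmp/pool/1/myrepo", "refs/remotes/origin/main"); err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range r.calls {
		fmt.Println("git", c)
	}
}
```

Returning early when the prune fails keeps the two failure modes distinguishable in the wrapped error; whether the actual change aborts or continues on a prune error is not stated in this description.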

Risk Assessment

✅ Low: The fix is minimal and well-bounded: a single safe git worktree prune before the only AddWorktree call site, with correct lock placement, clear error wrapping, and a deterministic isolated e2e test.

Testing

Built both the unfixed (v1.7.0) and fixed (target) treehouse binaries and reproduced the stale-worktree-registration crash scenario end-to-end in throwaway git repos under /tmp: base build wedges treehouse get with exit 1 and 'missing but already registered worktree', while the fixed build prunes the stale entry and enters the recreated worktree (exit 0). The added e2e test TestGetRecoversFromStaleWorktreeRegistration passes, and the full go test ./... suite is green. Working tree left clean.

Evidence: Before/after reproduction log (base vs fixed)

########## BASE (unfixed v1.7.0) ##########
[BASE-v1.7.0] pool dir: /tmp/tmp.V9w2KhBOUr/home/.treehouse/myrepo-0b523e
[BASE-v1.7.0] git worktree list --porcelain BEFORE get (filtered):
    worktree /tmp/tmp.V9w2KhBOUr/myrepo
    worktree /tmp/tmp.V9w2KhBOUr/home/.treehouse/myrepo-0b523e/1/myrepo
    prunable gitdir file points to non-existent location
[BASE-v1.7.0] running: treehouse get
[BASE-v1.7.0] exit code: 1
[BASE-v1.7.0] RESULT: WEDGED - worktree not created
[BASE-v1.7.0] relevant get output:
    failed to create worktree: git worktree add --detach /tmp/tmp.V9w2KhBOUr/home/.treehouse/myrepo-0b523e/1/myrepo refs/remotes/origin/main: fatal: '/tmp/tmp.V9w2KhBOUr/home/.treehouse/myrepo-0b523e/1/myrepo' is a missing but already registered worktree;

########## TARGET (fixed) ##########
[TARGET-FIXED] pool dir: /tmp/tmp.bGoIKj1MD6/home/.treehouse/myrepo-45755e
[TARGET-FIXED] git worktree list --porcelain BEFORE get (filtered):
    worktree /tmp/tmp.bGoIKj1MD6/myrepo
    worktree /tmp/tmp.bGoIKj1MD6/home/.treehouse/myrepo-45755e/1/myrepo
    prunable gitdir file points to non-existent location
[TARGET-FIXED] running: treehouse get
[TARGET-FIXED] exit code: 0
[TARGET-FIXED] RESULT: RECOVERED - worktree dir exists at /tmp/tmp.bGoIKj1MD6/home/.treehouse/myrepo-45755e/1/myrepo
[TARGET-FIXED] relevant get output:
    🌳 Entered worktree at ~/.treehouse/myrepo-45755e/1/myrepo. Type 'exit' to return.
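
The `prunable` line in the porcelain listings above is what marks a registration as stale. As a rough illustration, and not part of this PR, a hypothetical helper like `prunableWorktrees` below could pick those entries out of `git worktree list --porcelain` output. It assumes the default newline-separated porcelain format and paths without embedded newlines.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// prunableWorktrees returns the paths of worktree records that carry a
// "prunable" line in `git worktree list --porcelain` output.
// Hypothetical helper for illustration; it does not exist in treehouse.
func prunableWorktrees(porcelain string) []string {
	var stale []string
	current := ""
	sc := bufio.NewScanner(strings.NewReader(porcelain))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case line == "":
			// A blank line ends the current record.
			current = ""
		case strings.HasPrefix(line, "worktree "):
			current = strings.TrimPrefix(line, "worktree ")
		case line == "prunable" || strings.HasPrefix(line, "prunable "):
			if current != "" {
				stale = append(stale, current)
				current = ""
			}
		}
	}
	return stale
}

func main() {
	// Filtered listing copied from the TARGET run above.
	listing := `worktree /tmp/tmp.bGoIKj1MD6/myrepo
worktree /tmp/tmp.bGoIKj1MD6/home/.treehouse/myrepo-45755e/1/myrepo
prunable gitdir file points to non-existent location
`
	for _, p := range prunableWorktrees(listing) {
		fmt.Println("stale registration:", p)
	}
}
```

The fix itself does not need this parsing, since it runs `git worktree prune` unconditionally; a parser like this is mainly useful for diagnostics or test assertions.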

Evidence: Reproduction script

#!/usr/bin/env bash
# End-to-end manual reproduction of the stale-worktree-registration bug.
# Mirrors the real-world scenario: a crashed agent leaves a git worktree
# registration whose directory is gone. Without the fix, `treehouse get`
# fails hard with "missing but already registered worktree".
#
# Usage: reproduce.sh /path/to/treehouse-binary label
set -u

BIN="$1"
LABEL="$2"

BASE="$(mktemp -d)"
export HOME="$BASE/home"
export TREEHOUSE_NO_UPDATE_CHECK=1
export SHELL=/bin/true
mkdir -p "$HOME"

REPO="$BASE/myrepo"
REMOTE="$BASE/remote.git"

git init -q --bare --initial-branch=main "$REMOTE"
git init -q --initial-branch=main "$REPO"
git -C "$REPO" config user.email test@test.com
git -C "$REPO" config user.name Test
git -C "$REPO" remote add origin "$REMOTE"
printf 'hello\n' > "$REPO/README.md"
git -C "$REPO" add . && git -C "$REPO" commit -qm "initial"
git -C "$REPO" push -qu origin main >/dev/null

# Materialize pool dir by running status once (run from inside repo).
( cd "$REPO" && "$BIN" status ) >/dev/null 2>&1 || true
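POOLDIR="$(ls -d "$HOME/.treehouse/$(basename "$REPO")-"* 2>/dev/null | head -1)"
# Abort if no pool dir was created; otherwise $STALE would resolve under /.
[ -n "$POOLDIR" ] || { echo "[$LABEL] pool dir not found" >&2; rm -rf "$BASE"; exit 1; }
echo "[$LABEL] pool dir: $POOLDIR"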

STALE="$POOLDIR/1/$(basename "$REPO")"
mkdir -p "$(dirname "$STALE")"

# Simulate a crashed scout: register a worktree, then nuke its directory.
git -C "$REPO" worktree add --detach "$STALE" main >/dev/null 2>&1
rm -rf "$STALE"

echo "[$LABEL] git worktree list --porcelain BEFORE get (filtered):"
git -C "$REPO" worktree list --porcelain | grep -E "worktree|prunable" | sed 's/^/    /'

echo "[$LABEL] running: treehouse get"
set +e
out_err=$( cd "$REPO" && "$BIN" get 2>&1 )
code=$?
set -e
echo "[$LABEL] exit code: $code"

if [ -d "$STALE/." ]; then
    echo "[$LABEL] RESULT: RECOVERED - worktree dir exists at $STALE"
else
    echo "[$LABEL] RESULT: WEDGED - worktree not created"
fi
echo "[$LABEL] relevant get output:"
echo "$out_err" | grep -iE "entered worktree|already registered|failed to create|failed to prune|error" | sed 's/^/    /'

# Cleanup
rm -rf "$BASE"

Evidence: Regression test transcript

=== RUN   TestGetRecoversFromStaleWorktreeRegistration
--- PASS: TestGetRecoversFromStaleWorktreeRegistration (0.15s)
PASS
ok  	github.com/kunchenguid/treehouse/cmd	1.099s

Pipeline

Updates from git push no-mistakes

✅ **intent** - passed

✅ No issues found.

✅ **Rebase** - passed

✅ No issues found.

✅ **Review** - passed

✅ No issues found.

✅ **Test** - passed

✅ No issues found.

  • go build ./... (Go 1.26.4, go.mod requires 1.25.5)
  • go test ./... (full suite passes)
  • go test ./cmd/ -run TestGetRecoversFromStaleWorktreeRegistration -v (PASS)
  • Manual e2e reproduction in throwaway /tmp git repo against base (unfixed v1.7.0) binary: planted prunable worktree registration, ran treehouse get -> exit 1, WEDGED, 'missing but already registered worktree'
  • Manual e2e reproduction against target (fixed) binary: same prunable state, ran treehouse get -> exit 0, RECOVERED, 'Entered worktree at ...'

✅ **Document** - passed

✅ No issues found.

✅ **Lint** - passed

✅ No issues found.

✅ **Push** - passed

✅ No issues found.

e-jung added 2 commits June 20, 2026 18:52
A crashed or forcibly removed worktree leaves prunable bookkeeping in
.git/worktrees/. The next "treehouse get" then fails hard with
"missing but already registered worktree" because git refuses the add,
wedging every subsequent spawn until a human runs "git worktree prune"
by hand.

Run "git worktree prune" immediately before "git worktree add" in
Acquire's create-new-worktree branch. Prune is safe by design: it only
removes registrations whose target directories are already gone, so it
cannot destroy live worktrees or data.

Adds TestGetRecoversFromStaleWorktreeRegistration covering the recovery.

Refs kunchenguid#30
e-jung commented Jun 20, 2026

Owner Author

Closing in favor of the cross-repo PR to upstream: kunchenguid#31 (Refs kunchenguid#30). The cross-repo PR is the canonical surface for maintainer review and will run upstream CI (kunchenguid/treehouse has registered workflows; this fork's Actions workflows are not registered yet, so CI would never report here). The branch fix/get-prunes-stale-worktree-registrations is unchanged and remains the head of the upstream PR.

@e-jung e-jung closed this Jun 20, 2026