treehouse get fails on stale/prunable worktree registration (no auto-prune, no -f, no recovery) #30

Description

@e-jung

Summary

treehouse get fails hard when git has a stale/prunable worktree registration at the pool path it tries to create the new worktree in. It does not run git worktree prune, does not pass --force, and does not catch the "already registered" error to recover. A manual git worktree prune is required before get works again.

This blocks every spawn until a human prunes by hand. Observed in the wild on v1.4.0; reproduced deterministically on current main (v1.7.0, 7e3fd54).

Reproduce (deterministic, on main)

# Build from current main.
git clone https://github.com/kunchenguid/treehouse && cd treehouse
go build -o /tmp/treehouse .

# Throwaway repo + bare origin.
rm -rf /tmp/th-repro && mkdir -p /tmp/th-repro/home /tmp/th-repro/remote.git /tmp/th-repro/myrepo
git init --bare --initial-branch=main /tmp/th-repro/remote.git
git init --initial-branch=main            /tmp/th-repro/myrepo
git -C /tmp/th-repro/myrepo config user.email t@t.com
git -C /tmp/th-repro/myrepo config user.name  Tester
git -C /tmp/th-repro/myrepo remote add origin /tmp/th-repro/remote.git
echo hello > /tmp/th-repro/myrepo/README.md
git -C /tmp/th-repro/myrepo add . && git -C /tmp/th-repro/myrepo commit -m initial
git -C /tmp/th-repro/myrepo push -u origin main

# Simulate a crashed scout: leave a prunable worktree at the exact path
# treehouse get will target for slot 1 (i.e. <pool>/1/<repo-basename>).
POOL_DIR=/tmp/th-repro/home/.treehouse/myrepo-808643   # use the dir treehouse status prints for your repo
mkdir -p "$POOL_DIR/1"
git -C /tmp/th-repro/myrepo worktree add --detach "$POOL_DIR/1/myrepo" main
rm -rf "$POOL_DIR/1/myrepo"          # <-- the crash; bookkeeping left behind
git -C /tmp/th-repro/myrepo worktree list --porcelain   # entry shows: prunable gitdir file points to non-existent location

# Now try to acquire:
cd /tmp/th-repro/myrepo
env HOME=/tmp/th-repro/home SHELL=/bin/true TREEHOUSE_NO_UPDATE_CHECK=1 /tmp/treehouse get

Actual output

🌳 Setting up worktree...
failed to create worktree: git worktree add --detach /tmp/th-repro/home/.treehouse/myrepo-808643/1/myrepo refs/remotes/origin/main: fatal: '/tmp/th-repro/home/.treehouse/myrepo-808643/1/myrepo' is a missing but already registered worktree;
use 'add -f' to override, or 'prune' or 'remove' to clear

Exit code 1. The pool is wedged — every subsequent treehouse get fails the same way until a human runs git worktree prune.

Expected

treehouse get recovers on its own (the registration is for a directory that no longer exists — there is nothing to destroy), then completes.

Confirming a manual prune clears it

git -C /tmp/th-repro/myrepo worktree prune -v
# Removing worktrees/myrepo: gitdir file points to non-existent location
env HOME=/tmp/th-repro/home SHELL=/bin/true TREEHOUSE_NO_UPDATE_CHECK=1 /tmp/treehouse get   # succeeds, exit 0

Root cause (code path)

The get path never prunes:

  • cmd/get.go:50 (getRunE) calls pool.Acquire(repoRoot, poolDir, cfg.MaxTrees, cfg.Hooks.PostCreate). cmd/get.go does not reference prune anywhere.
  • internal/pool/pool.go:94 — when no clean worktree is reusable and the pool is under max_trees, Acquire calls git.AddWorktree(repoRoot, wtPath, branch) to create a new worktree at <poolDir>/<slot>/<repoName>. There is no git worktree prune before this call, and no error-handling that would catch a "missing but already registered" failure and retry.
  • internal/git/git.go:140-143 (AddWorktree) runs git worktree add --detach <path> <ref> with no --force/-f, and returns the git error verbatim.

The healState pass in internal/pool/pool.go:365 only reconciles treehouse's own state file against os.Stat on each worktree dir; it does not touch git's worktree bookkeeping (.git/worktrees/<name>/), which is what produces the "already registered" failure.
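For illustration, the registrations that healState does not see can be read from git itself. The sketch below is a hypothetical helper (prunableWorktrees is not part of treehouse) that parses the text of git worktree list --porcelain and returns the paths git marks as prunable, matching the entry shown in the reproduction above. It assumes a git version that emits the prunable line in porcelain output, and that the caller has already captured that output as a string.

```go
package main

import (
	"bufio"
	"strings"
)

// prunableWorktrees is a hypothetical helper, not treehouse code. It scans
// the output of `git worktree list --porcelain` and returns the path of
// every entry that carries a "prunable" line.
func prunableWorktrees(porcelain string) []string {
	var paths []string
	current := "" // path of the entry currently being read

	sc := bufio.NewScanner(strings.NewReader(porcelain))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case strings.HasPrefix(line, "worktree "):
			// Each entry starts with "worktree <path>".
			current = strings.TrimPrefix(line, "worktree ")
		case line == "prunable" || strings.HasPrefix(line, "prunable "):
			// "prunable <reason>" marks a registration git would prune.
			if current != "" {
				paths = append(paths, current)
			}
		case line == "":
			// A blank line ends the current entry.
			current = ""
		}
	}
	return paths
}
```

A non-empty result for the slot path that Acquire is about to use would indicate the wedged state described in this issue.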

Relationship to #28

#28 (released in v1.7.0, 836044f) hardened the prune subcommand (cmd/prune.go and internal/pool/prune.go) to classify and report orphaned/backing-repository-missing worktrees when a user explicitly runs treehouse prune. The get acquisition path (cmd/get.go → internal/pool/pool.go:Acquire → internal/git/git.go:AddWorktree) was untouched and still has no defense against a stale registration. The reporter verified this on main post-#28: the failure above is on 7e3fd54.

There is also no test coverage for this case — cmd/e2e_test.go has TestGetAndStatus, TestGetReusesWorktree, TestGetDetachesWorktreeWhenLeavingDirty, but none seed a prunable git worktree registration before calling get.

Proposed fix

Make get self-healing. Two viable options (Option A is the smaller change; both are safe):

Option A — prune before add (preferred). Run git worktree prune before git worktree add in the create-new-worktree branch of Acquire. Concretely, add to internal/git/git.go:

// PruneWorktrees removes git worktree bookkeeping for worktrees whose
// directories no longer exist. It is safe: it only deletes registrations
// for already-missing directories and never touches live worktrees.
func PruneWorktrees(repoRoot string) error {
    _, err := runGit(repoRoot, "worktree", "prune")
    return err
}

and call it in internal/pool/pool.go just before git.AddWorktree(...) at line 94:

if err := git.PruneWorktrees(repoRoot); err != nil {
    return fmt.Errorf("failed to prune stale worktrees: %w", err)
}
if err := git.AddWorktree(repoRoot, wtPath, branch); err != nil {
    return fmt.Errorf("failed to create worktree: %w", err)
}

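git worktree prune is safe by design: it removes only bookkeeping entries whose target directories are already gone, so it cannot destroy live work or data. The one caveat is a worktree on a temporarily unavailable path (for example an unmounted volume), which git treats as missing unless the worktree is locked.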

Option B — catch and retry. Leave AddWorktree as-is, but on the specific "already registered"/"prunable" error from AddWorktree, run git worktree prune once and retry the add. This narrows the behavior change to the failure case but requires string-matching git's error output (brittle across git versions).

Option A is simpler and the prune step is idempotent and safe, so it can run unconditionally on every acquire.

Optional hardening

  • Add an e2e test (cmd/e2e_test.go) that seeds a prunable worktree at the slot path before calling treehouse get, asserting get succeeds and the worktree is created.
  • Consider also pruning in Acquire's reusable-worktree loop, since git.ResetWorktree (called at internal/pool/pool.go:67) could surface related errors for a stale reusable entry — though healState already filters out missing-dir entries from treehouse's own state, so the create-new path is the primary exposure.

Environment

  • treehouse: built from main (7e3fd54, post-v1.7.0)
  • go: 1.26.4 (go.mod requires 1.25.5)
  • git: standard git worktree behavior on Linux
