1 change: 1 addition & 0 deletions AGENTS.md
@@ -108,6 +108,7 @@ Otherwise it prints one line per problem; handle each:
- `MISSING: <tool> (install: <command>)` - list the missing tools to the captain with a one-line purpose each plus the printed install commands, wait for consent (one approval may cover the list), then run `bin/fm-bootstrap.sh install <approved tools...>`.
For `treehouse`, this also covers an installed version whose `treehouse get` lacks `--lease`; treat it as an upgrade request.
- `NEEDS_GH_AUTH` - ask the captain to run `! gh auth login` (interactive; you cannot run it for them).
- `HOST_GH_ACCESS_REQUIRED` - the current harness sandbox cannot verify or use host GitHub credentials directly; verify host access with an unsandboxed `gh auth status`/`gh-axi` check if needed, then use approved unsandboxed GitHub command prefixes for GitHub operations. Do not ask the captain to re-authenticate unless the unsandboxed host check also fails.
- `CREW_HARNESS_OVERRIDE: <name>` - record and use the override silently; surface a harness fact only if it actually blocks work or the captain asks.
- `FLEET_SYNC: <repo>: skipped: <reason>` - bootstrap continued; investigate only if the dirty, diverged, or offline clone blocks work.
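
The per-line handling described in the list above can be pictured as a dispatch on each line's prefix. The following is only a minimal sketch: `classify_bootstrap_line` and its action labels are hypothetical names made up for illustration and are not part of firstmate, and the `CREW_HARNESS_OVERRIDE` value in the sample input is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of firstmate): maps one line of
# fm-bootstrap.sh output to a short, illustrative action label.
classify_bootstrap_line() {
  case "$1" in
    "MISSING: "*)               echo "ask-consent-then-install" ;;
    NEEDS_GH_AUTH)              echo "ask-captain-to-login" ;;
    HOST_GH_ACCESS_REQUIRED)    echo "verify-host-gh-unsandboxed" ;;
    "CREW_HARNESS_OVERRIDE: "*) echo "record-override-silently" ;;
    "FLEET_SYNC: "*)            echo "investigate-only-if-blocking" ;;
    *)                          echo "unknown" ;;
  esac
}

# Classify each line of some sample output (the override name is a placeholder).
while IFS= read -r line; do
  printf '%s -> %s\n' "$line" "$(classify_bootstrap_line "$line")"
done <<'EOF'
NEEDS_GH_AUTH
HOST_GH_ACCESS_REQUIRED
CREW_HARNESS_OVERRIDE: example-harness
EOF
```

The point of the sketch is that `HOST_GH_ACCESS_REQUIRED` and `NEEDS_GH_AUTH` lead to different actions: only the latter should result in asking the captain to log in.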

4 changes: 2 additions & 2 deletions README.md
@@ -83,7 +83,7 @@ cd firstmate && claude
```

That is the whole install.
On first launch the first mate detects what its toolchain is missing or too old (tmux, node, gh, treehouse with durable lease support, no-mistakes, gh-axi, chrome-devtools-axi, lavish-axi), lists it with the exact install commands, and installs only after you say go.
On first launch the first mate detects what its toolchain is missing or too old (tmux, node, gh, treehouse with durable lease support, no-mistakes, gh-axi, chrome-devtools-axi, lavish-axi), lists it with the exact install commands, and installs only after you say go. When a sandboxed harness cannot see or use the host GitHub keyring, firstmate reports that host GitHub access is required instead of telling you to log in again.

**Run it inside tmux for the best experience.**
firstmate works from any terminal - outside tmux, crewmates land in a detached `firstmate` session you can attach to - but launching your harness from inside tmux puts every crewmate window in your own session, one per task, where you can watch the crew work in real time or type into any window to intervene.
@@ -223,7 +223,7 @@ shellcheck bin/*.sh tests/*.sh # lint the toolbelt and behavior tests
for test_script in tests/*.test.sh; do "$test_script"; done # behavior tests, matching CI
tests/fm-wake-queue.test.sh # durable wake queue, singleton behavior, sub-supervisor classifier, and /afk presence-gating tests
tests/fm-afk-inject-e2e.test.sh # private-socket end-to-end test of the afk injection path (partial-input deferral, swallowed-Enter retry)
tests/fm-bootstrap.test.sh # bootstrap dependency and feature-probe tests
tests/fm-bootstrap.test.sh # bootstrap dependency, feature-probe, and GitHub-auth sandbox-diagnostic tests
tests/fm-secondmate.test.sh # persistent secondmate routing, seeding, idle charter, backlog handoff, spawn, recovery, teardown, and FM_HOME tests
tests/fm-teardown.test.sh # fm-teardown.sh unpushed-work safety check: local-only fork-remote allow, truly-unpushed refuse, merged-to-main allow, no-mistakes regression, --force override
[ "$(readlink CLAUDE.md)" = "AGENTS.md" ]
13 changes: 12 additions & 1 deletion bin/fm-bootstrap.sh
@@ -3,6 +3,7 @@
# Usage: fm-bootstrap.sh
# Detect: prints one line per problem and exits 0. Silent = all good.
# Lines: "MISSING: <tool> (install: <command>)", "NEEDS_GH_AUTH",
# "HOST_GH_ACCESS_REQUIRED",
# "CREW_HARNESS_OVERRIDE: <name>", "FLEET_SYNC: <repo>: skipped: <reason>".
# treehouse is also MISSING when its installed version lacks
# "treehouse get --lease" support.
@@ -57,6 +58,10 @@ fleet_sync() {
rm -f "$tmp"
}

running_in_codex_sandbox() {
[ -n "${CODEX_SANDBOX:-}" ] || [ "${CODEX_SANDBOX_NETWORK_DISABLED:-}" = 1 ]
}

install_cmd() {
case "$1" in
tmux|node|gh) echo "brew install $1 # or the platform's package manager" ;;
@@ -91,7 +96,13 @@ done
if command -v treehouse >/dev/null 2>&1 && ! treehouse_supports_lease; then
echo "MISSING: treehouse (install: $(install_cmd treehouse))"
fi
gh auth status >/dev/null 2>&1 || echo "NEEDS_GH_AUTH"
if ! gh auth status >/dev/null 2>&1; then
if running_in_codex_sandbox; then
echo "HOST_GH_ACCESS_REQUIRED"
else
echo "NEEDS_GH_AUTH"
fi
fi
crew=
[ -f "$CONFIG/crew-harness" ] && crew=$(tr -d '[:space:]' < "$CONFIG/crew-harness" || true)
[ -n "$crew" ] && [ "$crew" != "default" ] && echo "CREW_HARNESS_OVERRIDE: $crew"
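
The sandbox predicate added in this file is an OR of two environment checks, so either variable on its own is enough to switch the diagnostic from `NEEDS_GH_AUTH` to `HOST_GH_ACCESS_REQUIRED`. The sketch below copies `running_in_codex_sandbox` from the diff so it can be tried in isolation; `probe` is a hypothetical wrapper added only for illustration, and `seatbelt` is simply the value the tests in this change use.

```shell
#!/usr/bin/env bash
# Copied from the diff so the predicate can be exercised on its own.
running_in_codex_sandbox() {
  [ -n "${CODEX_SANDBOX:-}" ] || [ "${CODEX_SANDBOX_NETWORK_DISABLED:-}" = 1 ]
}

# Hypothetical wrapper: evaluates the predicate in a subshell with the given
# values for CODEX_SANDBOX ($1) and CODEX_SANDBOX_NETWORK_DISABLED ($2).
probe() (
  CODEX_SANDBOX=$1
  CODEX_SANDBOX_NETWORK_DISABLED=$2
  if running_in_codex_sandbox; then echo yes; else echo no; fi
)

probe ''       ''   # neither set: not treated as sandboxed
probe seatbelt ''   # CODEX_SANDBOX alone is sufficient
probe ''       1    # CODEX_SANDBOX_NETWORK_DISABLED=1 alone is sufficient
probe ''       0    # any value other than 1 does not count
```

Running `probe` in a subshell keeps the assignments from leaking into the calling shell, which mirrors how the tests pass these variables only to the bootstrap invocation.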
40 changes: 39 additions & 1 deletion tests/fm-bootstrap.test.sh
@@ -37,6 +37,9 @@ SH
cat > "$fakebin/gh" <<'SH'
#!/usr/bin/env bash
if [ "${1:-}" = auth ] && [ "${2:-}" = status ]; then
if [ "${FM_FAKE_GH_AUTH_STATUS:-ok}" = fail ]; then
exit 1
fi
exit 0
fi
exit 0
@@ -60,7 +63,12 @@ SH

run_bootstrap() {
local home=$1 fakebin=$2
PATH="$fakebin:$PATH" FM_HOME="$home" "$ROOT/bin/fm-bootstrap.sh"
CODEX_SANDBOX='' CODEX_SANDBOX_NETWORK_DISABLED='' PATH="$fakebin:$PATH" FM_HOME="$home" "$ROOT/bin/fm-bootstrap.sh"
}

run_bootstrap_in_codex_sandbox() {
local home=$1 fakebin=$2
CODEX_SANDBOX=seatbelt CODEX_SANDBOX_NETWORK_DISABLED=1 PATH="$fakebin:$PATH" FM_HOME="$home" "$ROOT/bin/fm-bootstrap.sh"
}

test_bootstrap_accepts_treehouse_lease_support() {
@@ -87,5 +95,35 @@ test_bootstrap_reports_treehouse_without_lease_support() {
pass "bootstrap reports treehouse without get --lease support"
}

test_bootstrap_reports_gh_auth_without_sandbox() {
local case_dir fakebin out
case_dir="$TMP_ROOT/gh-auth-missing"
mkdir -p "$case_dir/home"
fakebin=$(make_fake_toolchain "$case_dir")

out=$(FM_FAKE_GH_AUTH_STATUS=fail FM_FAKE_TREEHOUSE_LEASE_HELP=1 run_bootstrap "$case_dir/home" "$fakebin")
printf '%s\n' "$out" | grep -Fx 'NEEDS_GH_AUTH' >/dev/null \
|| fail "bootstrap did not report gh auth when the host gh auth probe failed: $out"
printf '%s\n' "$out" | grep -F 'HOST_GH_ACCESS_REQUIRED' >/dev/null \
&& fail "bootstrap reported host access outside a sandbox"
pass "bootstrap reports gh auth outside sandbox"
}

test_bootstrap_reports_host_access_inside_codex_sandbox() {
local case_dir fakebin out
case_dir="$TMP_ROOT/gh-host-access"
mkdir -p "$case_dir/home"
fakebin=$(make_fake_toolchain "$case_dir")

out=$(FM_FAKE_GH_AUTH_STATUS=fail FM_FAKE_TREEHOUSE_LEASE_HELP=1 run_bootstrap_in_codex_sandbox "$case_dir/home" "$fakebin")
printf '%s\n' "$out" | grep -Fx 'HOST_GH_ACCESS_REQUIRED' >/dev/null \
|| fail "bootstrap did not report host gh access when sandboxed gh auth probe failed: $out"
printf '%s\n' "$out" | grep -F 'NEEDS_GH_AUTH' >/dev/null \
&& fail "bootstrap reported gh auth instead of host access inside sandbox"
pass "bootstrap reports host gh access inside codex sandbox"
}

test_bootstrap_accepts_treehouse_lease_support
test_bootstrap_reports_treehouse_without_lease_support
test_bootstrap_reports_gh_auth_without_sandbox
test_bootstrap_reports_host_access_inside_codex_sandbox