
Use refs/pull/N/head for AI review checkout (unblock /ai-review on merged PRs) #417

Merged
igerber merged 6 commits into main from fix-audit-checkout-head on May 12, 2026

Conversation

@igerber
Owner

@igerber igerber commented May 12, 2026

Summary

  • AI review CI fails at the checkout step on any merged PR with fatal: couldn't find remote ref refs/pull/<N>/merge, because GitHub garbage-collects the merge ref shortly after the PR closes.
  • Swap the workflow's checkout target from refs/pull/<N>/merge to refs/pull/<N>/head. The head ref persists indefinitely on both open and merged PRs.
  • The diff the reviewer sees is unchanged: it is computed from the frozen BASE_SHA/HEAD_SHA on the PR object, not from the checked-out filesystem. For open PRs the only visible difference is that Codex now reads file context at the PR branch tip (the canonical "what the developer pushed" view for code review) rather than at GitHub's test-merge state.
  • Empirically verified: git ls-remote https://github.com/igerber/diff-diff.git refs/pull/{401,402,413}/{head,merge} returns SHAs only for head on all three merged PRs.
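
The ls-remote check in the last bullet can be scripted. This is a minimal sketch, assuming the usual `<sha><TAB><ref>` line format of `git ls-remote`; the function names are hypothetical and the SHAs in the sample are placeholders, not the real values from the repository.

```python
def parse_ls_remote(output: str) -> dict[str, str]:
    """Map ref name -> SHA from `git ls-remote` output ("<sha>\\t<ref>" per line)."""
    refs: dict[str, str] = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split(None, 1)
        refs[ref.strip()] = sha
    return refs


def surviving_refs(pr_numbers: list[int], output: str) -> dict[int, list[str]]:
    """For each PR number, list which of the head/merge refs the remote still has."""
    refs = parse_ls_remote(output)
    return {
        n: [kind for kind in ("head", "merge") if f"refs/pull/{n}/{kind}" in refs]
        for n in pr_numbers
    }


# Placeholder SHAs (assumption): only the head refs are present, as reported above.
sample = (
    "1111111111111111111111111111111111111111\trefs/pull/401/head\n"
    "2222222222222222222222222222222222222222\trefs/pull/402/head\n"
    "3333333333333333333333333333333333333333\trefs/pull/413/head\n"
)
print(surviving_refs([401, 402, 413], sample))
# {401: ['head'], 402: ['head'], 413: ['head']}
```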

Motivation

Re-review of merged PRs (e.g. to compare reviewer behavior across CI configurations) is currently impossible without this fix.

Test plan

  • CI on this PR exercises the new checkout target while the PR is still open (head ref is alive on open PRs too, so the workflow runs successfully).
  • After merge, an /ai-review comment on any previously merged PR should successfully fetch refs/pull/<N>/head, build the diff, and post a rerun review comment.

🤖 Generated with Claude Code

@github-actions

github-actions Bot commented May 12, 2026

Overall Assessment

✅ Looks good — no unmitigated P0/P1 findings.

Executive Summary

  • No methodology-affecting library code is touched; this PR only updates the AI review workflow, so docs/methodology/REGISTRY.md has no affected estimator section.
  • The new ref-selection flow is internally consistent with the existing SHA-based diff generation: open PRs pin checkout to head_sha from the head repo, closed PR reruns use refs/pull/<N>/head, and the review diff still comes from git diff "$BASE_SHA" "$HEAD_SHA" rather than checkout state (.github/workflows/ai_pr_review.yml L73-L118; .github/workflows/ai_pr_review.yml L220-L229).
  • That split matches GitHub’s documented primitives: actions/checkout supports SHA refs, and GitHub documents fetching inactive PRs via pull/<N>/head. (github.com)
  • One informational gap remains: the new checkout/base-fetch contract is not asserted by tests yet; current workflow tests still only cover prompt hardening, and broader workflow-contract coverage is already tracked in TODO.md (tests/test_openai_review.py L1712-L1755; TODO.md L138).

Methodology

No findings. Affected method(s): none.

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No new findings beyond the already-tracked workflow test-coverage gap noted below.

Security

No findings.

Documentation/Tests

  • Severity: P3. Impact: this PR changes the core checkout contract (state split, head_repo_full_name, base-SHA fetch), but the current tests still only assert tag-wrapping/sanitization behavior, so a future regression in ref selection or base-fetch logic would not be caught automatically. This is already tracked in TODO.md, so it is informational rather than blocking. Concrete fix: extend tests/test_openai_review.py (or add a dedicated workflow contract test) to assert the open/closed checkout split, the closed-PR refs/pull/${{ ... }}/head path, the open-PR repository + head_sha path, and the explicit base remote / base_sha fetch step (.github/workflows/ai_pr_review.yml L88-L118; tests/test_openai_review.py L1712-L1755; TODO.md L138).

igerber and others added 5 commits May 12, 2026 14:37
The merge ref is garbage-collected on closed/merged PRs, which breaks
`/ai-review` reruns on merged PRs at the actions/checkout step with
`fatal: couldn't find remote ref refs/pull/<N>/merge`. The head ref
persists indefinitely on both open and merged PRs, so swapping to it
unblocks post-merge re-reviews without affecting the open-PR path.

The diff that the reviewer sees is unchanged: it's computed from the
frozen BASE_SHA/HEAD_SHA on the PR object, not from the checked-out
filesystem. For open PRs the only visible difference is that Codex
now reads files at the PR branch tip rather than at GitHub's test-merge
state - the canonical "what the developer pushed" view for code review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address the rerun-blocker more robustly: switch the checkout target from
`refs/pull/<N>/head` to the immutable head SHA from `pulls.get`. The PR
ref is documented as racy right after API-created PR creation
(.claude/commands/submit-pr.md:327-345), so unconditionally depending
on it leaves `pull_request: opened` reviews vulnerable to checkout
failures on a known edge case.

The head SHA is frozen on the PR object the moment the PR is created
and is fetchable via GitHub's `uploadpack.allowReachableSHA1InWant`, so
checkout-by-SHA works in both branches the prior commit was trying to
cover: merged-PR reruns (no merge ref) and freshly-opened API PRs (no
head ref yet).

Also drop the second `refs/pull/<N>/head` dependency in the prefetch
step. After checkout-by-SHA, only the base side needs to be fetched
to enable `git diff BASE_SHA HEAD_SHA`; the prefetch now pulls the
base SHA directly so it survives base-branch movement after PR open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous comment said the head SHA was "frozen the moment the PR
is created," which misreads `pulls.get`: it returns the PR's current
head SHA at API-call time, which can change as the PR receives more
commits. The accurate framing is that the metadata step resolves the
head SHA for *this* workflow run, and the checkout step uses that
already-resolved value. Same wording fix applied to the base-side
race description so it does not over-promise either.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
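
The "resolved for this run" framing can be illustrated with a small sketch. It assumes a PR payload shaped like the GitHub REST "get a pull request" response (`state`, `base.sha`, `head.sha`); `PrSnapshot` and `snapshot_pr` are hypothetical names, and the SHAs are placeholders.

```python
from typing import NamedTuple


class PrSnapshot(NamedTuple):
    """PR metadata as resolved at one point in time (one workflow run)."""
    state: str
    base_sha: str
    head_sha: str


def snapshot_pr(pr: dict) -> PrSnapshot:
    """Capture the values once; later steps reuse them instead of re-querying."""
    return PrSnapshot(
        state=pr["state"],
        base_sha=pr["base"]["sha"],
        head_sha=pr["head"]["sha"],
    )


# Two API responses for the same PR, before and after an extra push (placeholders).
before = {"state": "open", "base": {"sha": "aaa111"}, "head": {"sha": "bbb222"}}
after = {"state": "open", "base": {"sha": "aaa111"}, "head": {"sha": "ccc333"}}

run_snapshot = snapshot_pr(before)
# The snapshot taken for this run does not follow later pushes.
print(run_snapshot.head_sha, snapshot_pr(after).head_sha)
```

The point of the sketch is only that the head SHA is a per-run value: a second call after new commits returns a different SHA, while the first run keeps using the value it resolved.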
For owner-created PRs the head repo equals github.repository, so this
is a no-op. For fork PRs it lets the workflow fetch head_sha directly
from the fork, instead of relying on `refs/pull/<N>/head` having been
mirrored to the base repo. The mirror is documented as racy on
API-created PRs (.claude/commands/submit-pr.md:327-345), so depending
on it could leave fresh fork PR reviews failing at checkout.

base_sha still lives on the base repo regardless of where head lives,
so the prefetch now adds an explicit `base` remote pointing at
github.repository rather than reusing `origin` (which equals the head
repo after checkout). `git diff BASE_SHA HEAD_SHA` continues to find
both trees locally.

head.repo can be null on fork PRs whose fork was deleted; fall back
to the base repo's full name in that case so the checkout step still
has a sensible target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
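
The head-repo fallback and the explicit `base` remote described in this commit can be sketched as follows. This is an illustration only: the helper names are hypothetical, the payload shape follows the GitHub REST pull request object (`head.repo.full_name`, with `head.repo` possibly null), and the HTTPS remote URL form is an assumption. The fetch flags are the ones quoted in the review comments.

```python
def head_repo_full_name(pr: dict, base_repo: str) -> str:
    """Return the fork's full name, or the base repo when the fork was deleted."""
    head_repo = pr.get("head", {}).get("repo")
    if head_repo is None:
        return base_repo
    return head_repo["full_name"]


def base_fetch_commands(base_repo: str, base_sha: str) -> list[list[str]]:
    """Commands that make base_sha available locally under a dedicated remote."""
    return [
        # Assumed URL form; `origin` may point at the head repo after checkout.
        ["git", "remote", "add", "base", f"https://github.com/{base_repo}.git"],
        ["git", "fetch", "--no-tags", "--depth=1", "base", base_sha],
    ]


fork_pr = {"head": {"repo": {"full_name": "someone/diff-diff"}}}
orphan_pr = {"head": {"repo": None}}  # fork deleted after the PR was opened

print(head_repo_full_name(fork_pr, "igerber/diff-diff"))
print(head_repo_full_name(orphan_pr, "igerber/diff-diff"))
```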
Three rounds of review feedback have surfaced overlapping checkout
robustness concerns. Picking one approach and unifying behind it.

Closed/merged PR path (the audit-campaign use case):
  Use `refs/pull/<N>/head` from the base repo. GitHub keeps this
  mirror durably regardless of whether the original branch is
  deleted, the fork is removed, or the merge was rebase/squash.
  Replaces the prior `refs/pull/<N>/merge` checkout, which is
  garbage-collected on close.

Open PR path:
  Use `head_sha` from `head_repo_full_name`. For owner PRs the
  head repo equals github.repository so this is identical to a
  base-repo checkout. For fork PRs it avoids the documented race
  where the base repo has not yet mirrored a freshly API-created
  fork PR's head (.claude/commands/submit-pr.md:327-345).

base_sha lives on github.repository regardless of which checkout
path runs, so the prefetch step continues to add a `base` remote
explicitly. `git diff BASE_SHA HEAD_SHA` therefore finds both
trees locally in every covered scenario.

The PR state is sourced from `pulls.get` in the existing metadata
step (works for both `pull_request` and `issue_comment` events).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
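
The open/closed split in this commit reduces to a two-branch decision. A minimal sketch under stated assumptions: `checkout_target` is a hypothetical helper, not code from the workflow, and it returns the `(repository, ref)` pair that the checkout step would receive.

```python
def checkout_target(
    state: str,
    pr_number: int,
    head_sha: str,
    head_repo: str,
    base_repo: str,
) -> tuple[str, str]:
    """Pick (repository, ref) for checkout based on the PR state."""
    if state == "open":
        # Open PR: fetch the resolved head SHA straight from the head repo,
        # so a not-yet-mirrored refs/pull/<N>/head cannot break checkout.
        return head_repo, head_sha
    # Closed or merged PR: the base repo's refs/pull/<N>/head mirror is durable.
    return base_repo, f"refs/pull/{pr_number}/head"


print(checkout_target("open", 417, "fb1c468", "someone/diff-diff", "igerber/diff-diff"))
print(checkout_target("closed", 417, "fb1c468", "someone/diff-diff", "igerber/diff-diff"))
```

Either branch leaves `base_sha` to be fetched separately from the base repo, which is why the prefetch step keeps its own remote.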
@igerber igerber force-pushed the fix-audit-checkout-head branch from def89e1 to fb1c468 May 12, 2026 18:37
@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: fb1c46880db1d69709e47fd624961b951230b57f


Overall Assessment

⚠️ Needs changes — the new checkout contract silently mixes a BASE_SHA..HEAD_SHA diff with review-control files from the PR head checkout, so the reviewer can consult stale methodology/prompt source material on open PRs and reruns.

Executive Summary

  • No estimator, weighting, variance, or identification code changed, so no library method in docs/methodology/REGISTRY.md is directly modified by this PR.
  • The rerun-unblock itself is sound: closed PRs now avoid the GC-prone refs/pull/<N>/merge, and the workflow explicitly fetches base_sha before building the diff from BASE_SHA/HEAD_SHA at .github/workflows/ai_pr_review.yml:L93-L118 and .github/workflows/ai_pr_review.yml:L225-L234.
  • P1: for open PRs, the workflow now checks out head_sha from the head repo and then copies .github/codex/prompts/pr_review.md from that checkout, even though the reviewer is instructed to adjudicate against docs/methodology/REGISTRY.md. That makes trusted review inputs branch-stale/historical rather than aligned to the base revision being diffed. .github/workflows/ai_pr_review.yml:L104-L108, .github/workflows/ai_pr_review.yml:L154-L156, .github/workflows/ai_pr_review.yml:L225-L234, .github/codex/prompts/pr_review.md:L3-L10
  • The previous informational test-gap finding is still unresolved: current tests cover prompt hardening and rerun-gate parity, but not the new checkout/base-fetch contract. It remains tracked in TODO.md, so it is P3-only. tests/test_openai_review.py:L1712-L1835, TODO.md:L138-L138

Methodology

  • Severity: P1. Impact: no estimator implementation changed, but the workflow now violates the review-source contract. The diff is built from BASE_SHA and HEAD_SHA, while the actual review prompt and repo files the model consults come from the checked-out PR head tree. On a long-lived branch or merged rerun, Codex can therefore miss a base-side REGISTRY.md note/deviation and incorrectly raise an “undocumented” methodology defect, or follow stale review criteria. Concrete fix: keep the PR-head checkout for code context if desired, but source trusted review-control files from the base side instead — e.g. a second checkout of github.repository at base_sha, or git show "$BASE_SHA":... materialized into dedicated files for .github/codex/prompts/pr_review.md and docs/methodology/REGISTRY.md. .github/workflows/ai_pr_review.yml:L93-L118, .github/workflows/ai_pr_review.yml:L154-L156, .github/workflows/ai_pr_review.yml:L225-L234, .github/codex/prompts/pr_review.md:L3-L10

Code Quality

No findings.

Performance

No findings.

Maintainability

No additional findings beyond the source-material alignment issue above.

Tech Debt

No new untracked tech-debt findings.

Security

No direct secret-handling or permission regression found in the changed lines.

Documentation/Tests

  • Severity: P3. Impact: the prior informational gap remains: tests still assert prompt-tag sanitization and rerun-gate parity, but they do not lock down the new open/closed checkout split, the explicit base_sha fetch, or the source of trusted review files. A future regression here would be easy to miss. This is already tracked in TODO.md, so it is non-blocking. Concrete fix: extend tests/test_openai_review.py with workflow-string assertions for the two checkout branches, the trusted prompt/registry source, and the git fetch --no-tags --depth=1 base "${{ steps.pr.outputs.base_sha }}" contract. tests/test_openai_review.py:L1712-L1835, TODO.md:L138-L138

Path to Approval

  1. Decouple trusted review inputs from the PR-head checkout: read .github/codex/prompts/pr_review.md and docs/methodology/REGISTRY.md from base_sha (or a separate trusted base checkout), while leaving the PR head checkout only for code context.
  2. Add a workflow contract test that asserts the new checkout split and base fetch behavior, plus the chosen source for prompt/registry files, so this alignment bug does not recur.

The prompt defines HOW the reviewer reviews. Sourcing it from the PR
head allowed a PR to silently change its own review rules. Read it
from base_sha via `git show` instead; the prefetch step has already
fetched that commit's tree.

Scope-limited intentionally: docs/methodology/REGISTRY.md and TODO.md
remain sourced from the PR head. The prompt itself instructs the
reviewer to recognize PR-added Note/Deviation labels in REGISTRY.md
and new entries in TODO.md as mitigations (`.github/codex/prompts/
pr_review.md:4,9,62,97`), so those files must reflect the PR's
edits to behave correctly. Only the review-rules file is moved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
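
Reading the prompt from base_sha, as this commit describes, comes down to `git show <sha>:<path>`. The sketch below assumes `git` is on PATH and that the commit has already been fetched into the local repository; `git_show_argv` and `read_file_at` are hypothetical helper names.

```python
import subprocess


def git_show_argv(commit_sha: str, path: str) -> list[str]:
    """Build the argv for printing one file's content as of a given commit."""
    return ["git", "show", f"{commit_sha}:{path}"]


def read_file_at(commit_sha: str, path: str, cwd: str = ".") -> str:
    """Return the file content at commit_sha, ignoring the checked-out tree."""
    result = subprocess.run(
        git_show_argv(commit_sha, path),
        cwd=cwd,
        check=True,           # raise if the commit or path does not exist
        capture_output=True,
        text=True,
    )
    return result.stdout


print(git_show_argv("BASE_SHA_PLACEHOLDER", ".github/codex/prompts/pr_review.md"))
```

Because the content comes from the object database rather than the working tree, edits the PR makes to the prompt file do not affect what the reviewer is told to do.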
@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: bab8fe4e43b03ada5bbb425967343c3547000475


Overall Assessment

✅ Looks good — the prior P1 on trusted review-rule sourcing is resolved, and I did not find any unmitigated P0/P1 issues in this diff. The only remaining item is a P3 workflow-test gap that is already tracked in TODO.md.

Executive Summary

  • No estimator, weighting, variance/SE, identification, or default library behavior changed in this PR; the library-methodology cross-check is therefore not triggered for any estimator implementation.
  • The previous source-material finding is addressed: the trusted review prompt is now read from BASE_SHA, so a PR can no longer modify its own review rules. .github/workflows/ai_pr_review.yml:L156-L163
  • Keeping docs/methodology/REGISTRY.md and TODO.md on the PR-head checkout is consistent with the trusted prompt contract, which explicitly treats PR-added registry notes and tracked TODO items as mitigations rather than defects. .github/codex/prompts/pr_review.md:L3-L10, .github/codex/prompts/pr_review.md:L57-L68, docs/methodology/REGISTRY.md:L451-L454
  • The new checkout split is coherent: open PRs use head_sha from the head repo to avoid the refs/pull/<N>/head race, closed PRs use durable refs/pull/<N>/head, and base_sha is fetched explicitly for diffing. .github/workflows/ai_pr_review.yml:L88-L118
  • Tests still do not pin this new checkout/prompt-source contract, but that gap is already tracked in TODO.md, so it remains P3-only. tests/test_openai_review.py:L1712-L1835, TODO.md:L138-L138

Methodology

No findings. This is a workflow-only change, not an estimator/maths/inference change. The prior review-control mismatch is resolved because the prompt now comes from BASE_SHA, while leaving REGISTRY.md/TODO.md on the PR head matches the prompt’s explicit mitigation rules for documented deviations and tracked debt. .github/workflows/ai_pr_review.yml:L156-L163, .github/codex/prompts/pr_review.md:L3-L10, .github/codex/prompts/pr_review.md:L57-L68

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No new untracked tech-debt findings.

Security

No findings.

Documentation/Tests

  • Severity: P3-informational (tracked in TODO.md). Impact: tests/test_openai_review.py still only asserts wrapper-tag hardening and rerun-gate parity; it does not lock down the new open/closed checkout split, explicit base_sha fetch, or base-sourced prompt contract, so a future regression here could slip through CI. Concrete fix: add workflow-string assertions for the state == 'open' / state != 'open' checkout branches, git fetch --no-tags --depth=1 base "${{ steps.pr.outputs.base_sha }}", and git show "${BASE_SHA}":.github/codex/prompts/pr_review.md. tests/test_openai_review.py:L1712-L1835, .github/workflows/ai_pr_review.yml:L93-L118, .github/workflows/ai_pr_review.yml:L156-L163, TODO.md:L138-L138
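
The workflow-string assertions suggested above could look roughly like this. It is a sketch, not the repository's test: the helper name is hypothetical, and the required snippets are copied from the review's wording, so the exact strings in the real workflow file are an assumption.

```python
REQUIRED_SNIPPETS = (
    "state == 'open'",
    "state != 'open'",
    'git fetch --no-tags --depth=1 base "${{ steps.pr.outputs.base_sha }}"',
    'git show "${BASE_SHA}":.github/codex/prompts/pr_review.md',
)


def missing_contract_snippets(workflow_text: str) -> list[str]:
    """Return the contract snippets that do not appear in the workflow text."""
    return [snippet for snippet in REQUIRED_SNIPPETS if snippet not in workflow_text]


# A stand-in for the workflow file that only contains the open-PR condition.
partial = "if: steps.pr.outputs.state == 'open'"
print(len(missing_contract_snippets(partial)))
# 3
```

In a pytest test, the text would be read from `.github/workflows/ai_pr_review.yml` and the assertion would be that `missing_contract_snippets(...)` returns an empty list.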

@igerber igerber added the ready-for-ci label (Triggers CI test workflows) May 12, 2026
@igerber igerber merged commit 002879d into main May 12, 2026
5 of 6 checks passed
@igerber igerber deleted the fix-audit-checkout-head branch May 12, 2026 18:59