🤖 fix: make PR readiness waiting fail fast by ThomasK33 · Pull Request #10 · coder/coder-k8s

ThomasK33 · 2026-02-09T13:18:01Z

Summary

Refactor the PR readiness waiting flow so it evaluates Codex and CI checks concurrently with fail-fast behavior.

Background

The previous readiness flow waited on Codex first and only then waited on checks, which delayed feedback when CI failed early.

Implementation

Added a --once mode to scripts/wait_pr_codex.sh and scripts/wait_pr_checks.sh with a shared status contract:
- 0 = passed
- 10 = still waiting
- 1 = terminal failure
Preserved default behavior for both scripts (looping wait mode) while routing polling logic through single-pass helpers.
Refactored scripts/wait_pr_ready.sh into a coordinator loop that:
- polls both gates each cycle,
- prints both gate states,
- exits immediately on first terminal failure,
- succeeds only when both gates pass.
Updated AGENTS.md PR loop documentation to use ./scripts/wait_pr_ready.sh <pr_number> as the unified wait step.

Validation

make verify-vendor
make test
make build
bash -n scripts/wait_pr_ready.sh
bash -n scripts/wait_pr_codex.sh
bash -n scripts/wait_pr_checks.sh
Usage guard checks for all updated scripts (missing args/unknown flag cases)

Risks

Low-to-moderate risk in shell control flow. Changes are localized to PR helper scripts and include defensive assertions for unexpected state/status values.

📋 Implementation Plan

Plan: unify PR readiness waiting into fail-fast concurrent checks

Context / Why

The current PR wait flow is sequential: wait_pr_ready.sh waits for Codex first, then waits for CI checks. This delays feedback when CI fails early, because the script can block on Codex response first.

Goal: make wait_pr_ready.sh a single command that evaluates Codex + CI in the same polling loop, exits immediately on the first terminal failure, and only succeeds when both gates are green.

Evidence

scripts/wait_pr_ready.sh currently runs phases in order (wait_pr_codex.sh then wait_pr_checks.sh, lines 46–70).
scripts/wait_pr_codex.sh and scripts/wait_pr_checks.sh each have their own while true polling loops and terminal exits.
AGENTS.md currently documents a split workflow (wait_pr_codex.sh then wait_pr_checks.sh, lines 99–106), so docs must be aligned with the new behavior.

Implementation details

Add a single-iteration mode to both wait scripts (reusable status contract).
- Files: scripts/wait_pr_codex.sh, scripts/wait_pr_checks.sh.
- Add --once mode that performs one status evaluation and exits with:
  - 0 = gate passed
  - 10 = still waiting/pending
  - 1 = terminal failure (needs user action)
- Keep default mode backward-compatible: no flag continues to poll in a loop.
- Add defensive assertions for impossible states (unexpected PR state, unknown merge/check state, unexpected exit code contract violations).
```
# shape in each script
MODE="wait"
if [[ "${2:-}" == "--once" ]]; then
  MODE="once"
fi

check_once() {
  # ...existing logic, but single pass...
  # return 0|10|1 only
}

if [[ "$MODE" == "once" ]]; then
  check_once; exit $?
fi

while true; do
  check_once
  rc=$?
  case "$rc" in
    0|1) exit "$rc" ;;
    10) sleep 5 ;;
    *) echo "❌ assertion failed: unexpected status code $rc"; exit 1 ;;
  esac
done
```

Refactor wait_pr_ready.sh into a true combined coordinator loop.

File: scripts/wait_pr_ready.sh.
Replace sequential step execution with one loop that calls both scripts in --once mode each tick.
Fail fast if either phase returns 1; succeed only when both return 0; otherwise continue when either is 10.
Print both gate states on each iteration for observability.

while true; do
  CODEX_OUT=$("$WAIT_CODEX_SCRIPT" "$PR_NUMBER" --once 2>&1); CODEX_RC=$?
  CHECKS_OUT=$("$WAIT_CHECKS_SCRIPT" "$PR_NUMBER" --once 2>&1); CHECKS_RC=$?

  # defensive contract checks
  for rc in "$CODEX_RC" "$CHECKS_RC"; do
    case "$rc" in 0|1|10) ;; *) echo "❌ assertion failed: invalid phase rc=$rc"; exit 1 ;; esac
  done

  if [[ "$CODEX_RC" -eq 1 || "$CHECKS_RC" -eq 1 ]]; then
    echo "❌ PR #$PR_NUMBER is not ready"
    echo "Codex: rc=$CODEX_RC"; echo "$CODEX_OUT"
    echo "Checks: rc=$CHECKS_RC"; echo "$CHECKS_OUT"
    exit 1
  fi

  if [[ "$CODEX_RC" -eq 0 && "$CHECKS_RC" -eq 0 ]]; then
    echo "🎉 PR #$PR_NUMBER is ready"
    exit 0
  fi

  echo -ne "\r⏳ Codex=$CODEX_RC Checks=$CHECKS_RC ..."
  sleep 5
done

Keep script responsibilities explicit and non-duplicative.
- wait_pr_codex.sh remains the source of truth for:
  - latest @codex review request detection,
  - approval/rate-limit comment matching,
  - delegating unresolved comment/thread reporting to check_codex_comments.sh.
- wait_pr_checks.sh remains source of truth for:
  - mergeability + gh pr checks status,
  - unresolved review-thread and Codex-comment gates.
- Coordinator (wait_pr_ready.sh) should orchestrate only, not re-implement these internals.
Update workflow documentation to match new single-command flow.
- File: AGENTS.md.
- Update the PR loop section so the waiting step is only ./scripts/wait_pr_ready.sh <pr_number> (remove explicit split steps for Codex then checks).
Validation plan after implementation.
- Syntax checks:
  - bash -n scripts/wait_pr_ready.sh
  - bash -n scripts/wait_pr_codex.sh
  - bash -n scripts/wait_pr_checks.sh
- Behavioral sanity checks (non-destructive):
  - usage/argument guards for each script (<missing pr_number> cases).
- End-to-end smoke test on a real PR (recommended):
  - confirm early exit on intentional CI failure without waiting for Codex,
  - confirm success path when both Codex approval and required checks pass.

Why this approach (brief)

It preserves existing script intent and observability while adding the needed parallel/fail-fast behavior.
It avoids background-process complexity (wait -n, trap/kill coordination, interleaved output).
The explicit 0/10/1 contract makes control flow deterministic and easy to assert.

Generated with mux • Model: openai:gpt-5.3-codex • Thinking: xhigh • Cost: $0.26

ThomasK33 · 2026-02-09T13:30:00Z

@codex review

Please take another look after the rebase and CI rerun.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2974392993

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

ThomasK33 · 2026-02-09T13:36:45Z

@codex review

Addressed the fail-fast concern in scripts/wait_pr_ready.sh by short-circuiting on Codex terminal failures before invoking the checks gate.

Add --once status contract to wait_pr_codex.sh and wait_pr_checks.sh, then refactor wait_pr_ready.sh to poll both gates in one loop and exit on first terminal failure. Update AGENTS.md to document the single-command readiness flow. --- _Generated with [`mux`](https://github.com/coder/mux) • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$0.26`_

Update wait_pr_ready.sh to exit immediately when wait_pr_codex.sh --once returns a terminal failure, instead of always invoking wait_pr_checks.sh first. This preserves true fail-fast behavior and avoids unnecessary delay when Codex has already produced actionable feedback. --- _Generated with [`mux`](https://github.com/coder/mux) • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$0.26`_

chatgpt-codex-connector · 2026-02-09T13:42:19Z

Codex Review: Didn't find any major issues. 👍

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

ThomasK33 · 2026-02-09T13:45:01Z

https://mux.md/hRMLm#hsnhYY8k9hsGew

This comment was marked as resolved.

Sign in to view

ThomasK33 force-pushed the scripts-tvd4 branch from 6562320 to 2974392 Compare February 9, 2026 13:22

chatgpt-codex-connector Bot reviewed Feb 9, 2026

View reviewed changes

Comment thread scripts/wait_pr_ready.sh

ThomasK33 added 2 commits February 9, 2026 13:40

ThomasK33 force-pushed the scripts-tvd4 branch from 7e3acb1 to 4999651 Compare February 9, 2026 13:41

ThomasK33 added this pull request to the merge queue Feb 9, 2026

Merged via the queue into main with commit eeb1e2f Feb 9, 2026
6 checks passed

ThomasK33 deleted the scripts-tvd4 branch February 9, 2026 13:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

🤖 fix: make PR readiness waiting fail fast#10

🤖 fix: make PR readiness waiting fail fast#10
ThomasK33 merged 2 commits into
mainfrom
scripts-tvd4

ThomasK33 commented Feb 9, 2026 •

edited

Loading

Uh oh!

This comment was marked as resolved.

ThomasK33 commented Feb 9, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

ThomasK33 commented Feb 9, 2026

Uh oh!

chatgpt-codex-connector Bot commented Feb 9, 2026

Uh oh!

Uh oh!

ThomasK33 commented Feb 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ThomasK33 commented Feb 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Background

Implementation

Validation

Risks

Plan: unify PR readiness waiting into fail-fast concurrent checks

Context / Why

Evidence

Implementation details

Uh oh!

This comment was marked as resolved.

ThomasK33 commented Feb 9, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

ThomasK33 commented Feb 9, 2026

Uh oh!

chatgpt-codex-connector Bot commented Feb 9, 2026

Uh oh!

Uh oh!

ThomasK33 commented Feb 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ThomasK33 commented Feb 9, 2026 •

edited

Loading