Commit f8b95f6

🤖 ci: add Codex PR loop tooling for agents (#6)
## Summary

This PR ports the mux pull-request workflow tooling into `coder-k8s`, adapted for a Go/Kubernetes controller repository.

## Background

We want agents in this repo to follow a strict PR lifecycle:

- create/update PRs with standard guidance
- request Codex review and wait for response
- address/resolve review comments in a loop
- wait for required CI checks before considering the PR ready

## Implementation

- Added `.mux/skills/pull-requests/SKILL.md` with this repo's validation commands (`make verify-vendor`, `make test`, `make build`) and explicit Codex loop discipline.
- Added PR helper scripts under `scripts/`:
  - `wait_pr_checks.sh`
  - `wait_pr_codex.sh`
  - `check_codex_comments.sh`
  - `resolve_pr_comment.sh`
  - `check_pr_reviews.sh` (adapted to dynamically resolve owner/repo)
  - `extract_pr_logs.sh` (adapted for dynamic owner/repo and Go-oriented repro hints)
- Added `scripts/wait_pr_ready.sh` as a one-command orchestrator (`Codex -> CI`) for end-of-workflow waiting.
- Added a repository `AGENTS.md` with non-TypeScript guidance adapted from mux and a required Codex review loop.
- Updated `.github/workflows/ci.yaml` with a PR-only `codex-comments` job to fail when unresolved Codex feedback remains.

## Validation

- `bash -n scripts/*.sh`
- `go run github.com/rhysd/actionlint/cmd/actionlint@v1.7.10 .github/workflows/ci.yaml`
- `make verify-vendor`
- `make test`
- `make build`

## Risks

Low-to-moderate process risk: this PR mostly adds tooling and instructions, but the new CI `codex-comments` gate can block merges if bot comments are unresolved or if the GitHub API is transiently unavailable.

---

_Generated with [`mux`](https://github.com/coder/mux) • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh`_

_Generated with `mux` • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$0.82`_

<!-- mux-attribution: model=openai:gpt-5.3-codex thinking=xhigh costs=0.82 -->
1 parent 7ca163f commit f8b95f6
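
The attribution footer that closes the commit message follows the template in `.mux/skills/pull-requests/SKILL.md`, which asks for `$MUX_MODEL_STRING`, `$MUX_THINKING_LEVEL`, and `$MUX_COSTS_USD` to be included when set. A minimal sketch of assembling it is shown here; the function name `build_mux_footer` is hypothetical, and dropping unset fields is an assumption, since the skill file does not say what to do when a variable is missing.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this commit): print the attribution footer.
# ASSUMPTION: a field is simply omitted when its variable is unset or empty.
build_mux_footer() {
  local model="${MUX_MODEL_STRING:-}"
  local thinking="${MUX_THINKING_LEVEL:-}"
  local costs="${MUX_COSTS_USD:-}"

  local visible='_Generated with `mux`'
  local hidden='<!-- mux-attribution:'

  if [ -n "$model" ]; then
    visible="${visible} • Model: \`${model}\`"
    hidden="${hidden} model=${model}"
  fi
  if [ -n "$thinking" ]; then
    visible="${visible} • Thinking: \`${thinking}\`"
    hidden="${hidden} thinking=${thinking}"
  fi
  if [ -n "$costs" ]; then
    visible="${visible} • Cost: \`\$${costs}\`"
    hidden="${hidden} costs=${costs}"
  fi

  # Horizontal rule, visible line, hidden HTML comment, separated by blank lines.
  printf '%s\n\n%s_\n\n%s -->\n' '---' "$visible" "$hidden"
}
```

The output can be appended to a PR body file before passing it to `gh` with `--body-file`.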

10 files changed

Lines changed: 1268 additions & 5 deletions

.github/workflows/ci.yaml

Lines changed: 19 additions & 0 deletions
@@ -101,6 +101,25 @@ jobs:
           advanced-security: false
           inputs: .github/workflows
 
+  codex-comments:
+    name: Codex Comments
+    if: github.event_name == 'pull_request'
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      pull-requests: read
+    steps:
+      - name: Checkout
+        uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4.3.1
+        with:
+          fetch-depth: 0
+          persist-credentials: false
+
+      - name: Check unresolved Codex comments
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+        run: ./scripts/check_codex_comments.sh ${{ github.event.pull_request.number }}
+
   publish-main:
     name: Publish GHCR :main
     needs: [test, lint, lint-actions]

.mux/skills/pull-requests/SKILL.md

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
+---
+name: pull-requests
+description: Guidelines for creating and managing Pull Requests in this repo
+---
+
+# Pull Request Guidelines
+
+## Attribution Footer
+
+Public work (issues/PRs/commits) must use 🤖 in the title and include this footer in the body:
+
+```md
+---
+
+_Generated with `mux` • Model: `<modelString>` • Thinking: `<thinkingLevel>` • Cost: `$<costs>`_
+
+<!-- mux-attribution: model=<modelString> thinking=<thinkingLevel> costs=<costs> -->
+```
+
+Always check `$MUX_MODEL_STRING`, `$MUX_THINKING_LEVEL`, and `$MUX_COSTS_USD` via bash before creating or updating PRs—include them in the footer if set.
+
+## Lifecycle Rules
+
+- Before submitting a PR, ensure the branch name reflects the work and the base branch is correct.
+- PRs are always squash-merged into `main`.
+- Often, work begins from another PR's merged state; rebase onto `main` before submitting a new PR.
+- Reuse existing PRs; never close or recreate without instruction.
+- Force-push minor PR updates; otherwise add a new commit to preserve the change timeline.
+- If a PR is already open for your change, keep it up to date with the latest commits; don't leave it stale.
+- Never enable auto-merge or merge into `main` yourself. The user must explicitly merge PRs.
+
+## CI & Validation
+
+- After pushing, you may use `./scripts/wait_pr_checks.sh <pr_number>` to wait for CI to pass.
+- Use `wait_pr_checks` only when there's no more useful work to do.
+- Waiting for PR checks can take 10+ minutes, so prefer local validation first (for this repo: `make verify-vendor`, `make test`, `make build`) to catch issues early.
+- If asked to fix an issue in CI, first replicate it locally, get it to pass locally, then use `wait_pr_checks` to wait for CI to pass.
+
+## Status Decoding
+
+| Field              | Value         | Meaning             |
+| ------------------ | ------------- | ------------------- |
+| `mergeable`        | `MERGEABLE`   | Clean, no conflicts |
+| `mergeable`        | `CONFLICTING` | Needs resolution    |
+| `mergeStateStatus` | `CLEAN`       | Ready to merge      |
+| `mergeStateStatus` | `BLOCKED`     | Waiting for CI      |
+| `mergeStateStatus` | `BEHIND`      | Needs rebase        |
+| `mergeStateStatus` | `DIRTY`       | Has conflicts       |
+
+If behind: `git fetch origin && git rebase origin/main && git push --force-with-lease`.
+
+## Codex Review Workflow
+
+When posting multi-line comments with `gh` (e.g., `@codex review`), **do not** rely on `\n` escapes inside quoted `--body` strings (they will be sent as literal text). Prefer `--body-file -` with a heredoc to preserve real newlines:
+
+```bash
+gh pr comment <pr_number> --body-file - <<'EOF'
+@codex review
+
+<message>
+EOF
+```
+
+### Handling Codex Comments
+
+Use these scripts to check, resolve, and wait on Codex review comments:
+
+- `./scripts/check_codex_comments.sh <pr_number>` — Lists unresolved Codex comments (both regular comments and review threads). Outputs thread IDs needed for resolution.
+- `./scripts/resolve_pr_comment.sh <thread_id>` — Resolves a review thread by its ID (e.g., `PRRT_abc123`).
+- `./scripts/wait_pr_codex.sh <pr_number>` — Waits for Codex to respond to the latest `@codex review` request. When the PR looks good, Codex leaves an explicit approval comment (e.g., it will say `Didn't find any major issues`).
+
+- `./scripts/wait_pr_ready.sh <pr_number>` — Convenience wrapper that runs Codex waiting first and CI checks second. Use it when you are done coding and want to block until the PR is ready or actionable feedback appears.
+
+When Codex leaves review comments, you **must** address them before the PR can merge:
+
+1. Push your fixes
+2. Resolve each review thread: `./scripts/resolve_pr_comment.sh <thread_id>`
+3. Comment `@codex review` to re-request review
+4. Run `./scripts/wait_pr_codex.sh <pr_number>` to wait for the next Codex response (either new comments to address, or an explicit approval comment)
+
+### Required Loop Discipline
+
+After a PR is open, stay in a review loop until completion:
+
+1. Run local validation and push fixes.
+2. Request review with `@codex review`.
+3. Run `./scripts/wait_pr_codex.sh <pr_number>` and wait for Codex to respond.
+4. If Codex leaves comments, address them, resolve each thread, push, and repeat from step 2.
+5. Once Codex explicitly approves, run `./scripts/wait_pr_checks.sh <pr_number>` and wait for required checks.
+
+Only stop the loop early if a reviewer is clearly misunderstanding the change intent and further edits would be counterproductive. In that case, leave a clarifying PR comment and pause for human direction.
+
+## PR Title Conventions
+
+- Title prefixes: `perf|refactor|fix|feat|ci|tests|bench`
+- Example: `🤖 fix: handle workspace rename edge cases`
+- Use `tests:` for test-only changes (test helpers, flaky test fixes, storybook)
+- Use `ci:` for CI config changes
+
+## PR Bodies
+
+### Structure
+
+PR bodies should generally follow this structure; omit sections that are N/A or trivially inferable for the change.
+
+- Summary
+  - Single-paragraph executive summary of the change
+- Background
+  - The "why" behind the change
+  - What problem this solves
+  - Relevant commits, issues, or PRs that capture more context
+- Implementation
+- Validation
+  - Steps taken to prove the change works as intended
+  - Avoid boilerplate like `ran tests`; include this section only for novel, change-specific steps
+  - Do not include steps implied by passing PR checks
+- Risks
+  - PRs that touch intricate logic must include an assessment of regression risk
+  - Explain regression risk in terms of severity and affected product areas
+
+## Upkeep
+
+Once the code is pushed to the remote (even if not yet a Pull Request), do your best to commit
+and push all changes before responding to ensure it's visible to the user. Commits on the working branch
+are for yourself to understand the change; they do not have to follow repository conventions as the
+PR body and title become the commit subject and body respectively.
+
+Whenever generating a compaction summary, include whether or not a Pull Request was opened
+and the general state of the remote (e.g. CI checks, known reviews, divergence).
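
The warning in the Codex Review Workflow section about `\n` inside quoted `--body` strings can be checked locally without calling `gh`. The sketch below compares the two ways of building a comment body; the message text and the helper name `count_lines` are illustrative, not taken from the commit.

```bash
#!/usr/bin/env bash
# Inside a quoted string, "\n" stays as two literal characters (backslash, n).
escaped_body="@codex review\n\nPlease focus on error handling."

# A quoted heredoc delimiter ('EOF') captures real newlines and disables expansion.
heredoc_body="$(cat <<'EOF'
@codex review

Please focus on error handling.
EOF
)"

# Count the lines in a piece of text; printf appends exactly one trailing newline.
count_lines() {
  printf '%s\n' "$1" | wc -l | tr -d ' '
}

echo "escaped: $(count_lines "$escaped_body") line(s)"
echo "heredoc: $(count_lines "$heredoc_body") line(s)"
```

The escaped variant is a single line containing backslash sequences, which is what would show up verbatim in the posted comment; the heredoc variant has three real lines.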

AGENTS.md

Lines changed: 16 additions & 5 deletions
@@ -88,9 +88,20 @@ Run from repository root.
 
 ### Pull request descriptions
 - Include: what changed, why, validation commands run, and any follow-up work.
-- For public mux-generated PRs/commits in this environment, include footer:
+- For public mux-generated PRs/commits in this environment, include the attribution footer defined in `.mux/skills/pull-requests/SKILL.md`.
 
-```md
----
-_Generated with [`mux`](https://github.com/coder/mux) • Model: `<modelString>` • Thinking: `<thinkingLevel>`_
-```
+## PR Workflow (Codex)
+
+- Before creating or updating any PR, commit, or public issue, read `.mux/skills/pull-requests/SKILL.md` and follow it.
+- Use `./scripts/wait_pr_ready.sh <pr_number>` for a one-command wait flow after requesting review.
+- Prefer `gh` CLI for GitHub interactions over manual web/curl flows.
+
+When a PR exists, stay in this loop until ready:
+1. Push your latest fixes.
+2. Run local validation (`make verify-vendor`, `make test`, `make build`).
+3. Request review with `@codex review`.
+4. Run `./scripts/wait_pr_codex.sh <pr_number>` and wait for Codex.
+5. If Codex leaves comments, address them, resolve threads with `./scripts/resolve_pr_comment.sh <thread_id>`, push, and repeat.
+6. After explicit Codex approval, run `./scripts/wait_pr_checks.sh <pr_number>`.
+
+Only stop the loop early if the reviewer is clearly misunderstanding the intended change and further churn would be counterproductive. In that case, leave a clarifying PR comment and wait for human direction.
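
The review loop described in `AGENTS.md` and `SKILL.md` can be sketched as a small driver. This is an illustration only: `review_loop` is a hypothetical name, the steps are passed in as commands so the control flow can be tried without GitHub access, and two things are assumptions not stated in this diff: that the Codex wait step exits 0 only on explicit approval, and that a round cap is an acceptable way to pause for human direction.

```bash
#!/usr/bin/env bash
# Hypothetical driver for the review loop (not part of this commit).
#   $1  command that requests review (e.g. posts "@codex review")
#   $2  command that waits for Codex; ASSUMED to exit 0 only on explicit approval
#   $3  command that addresses feedback (fix, resolve threads, push)
#   $4  command that waits for required CI checks
#   $5  round cap before pausing for human direction (an added safety assumption)
review_loop() {
  local request_review="$1"
  local wait_codex="$2"
  local address_feedback="$3"
  local wait_checks="$4"
  local max_rounds="$5"
  local round=1

  while [ "$round" -le "$max_rounds" ]; do
    "$request_review" || return 1
    if "$wait_codex"; then
      # Approval received: only now block on required checks.
      "$wait_checks"
      return $?
    fi
    "$address_feedback" || return 1
    round=$((round + 1))
  done

  echo "No approval after ${max_rounds} round(s); pausing for human direction." >&2
  return 2
}
```

In practice the four commands would be thin wrappers around the steps listed above, for example a function that runs `./scripts/wait_pr_codex.sh` with the PR number.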

scripts/check_codex_comments.sh

Lines changed: 203 additions & 0 deletions
@@ -0,0 +1,203 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+if [ $# -eq 0 ]; then
+  echo "Usage: $0 <pr_number>"
+  exit 1
+fi
+
+PR_NUMBER=$1
+BOT_LOGIN_GRAPHQL="chatgpt-codex-connector"
+
+if ! [[ "$PR_NUMBER" =~ ^[0-9]+$ ]]; then
+  echo "❌ PR number must be numeric. Got: '$PR_NUMBER'"
+  exit 1
+fi
+
+echo "Checking for unresolved Codex comments in PR #${PR_NUMBER}..."
+
+REPO_INFO=$(gh repo view --json owner,name --jq '{owner: .owner.login, name: .name}')
+OWNER=$(echo "$REPO_INFO" | jq -r '.owner')
+REPO=$(echo "$REPO_INFO" | jq -r '.name')
+
+# Depot runners sometimes hit transient network timeouts to api.github.com.
+# Retry the GraphQL request a few times before failing the required check.
+MAX_ATTEMPTS=5
+BACKOFF_SECS=2
+
+# shellcheck disable=SC2016 # Single quotes are intentional - these are GraphQL queries.
+COMMENTS_QUERY='query($owner: String!, $repo: String!, $pr: Int!, $cursor: String) {
+  repository(owner: $owner, name: $repo) {
+    pullRequest(number: $pr) {
+      comments(first: 100, after: $cursor) {
+        pageInfo {
+          hasNextPage
+          endCursor
+        }
+        nodes {
+          id
+          author { login }
+          body
+          createdAt
+          isMinimized
+        }
+      }
+    }
+  }
+}'
+
+# shellcheck disable=SC2016 # Single quotes are intentional - these are GraphQL queries.
+THREADS_QUERY='query($owner: String!, $repo: String!, $pr: Int!, $cursor: String) {
+  repository(owner: $owner, name: $repo) {
+    pullRequest(number: $pr) {
+      reviewThreads(first: 100, after: $cursor) {
+        pageInfo {
+          hasNextPage
+          endCursor
+        }
+        nodes {
+          id
+          isResolved
+          comments(first: 1) {
+            nodes {
+              id
+              author { login }
+              body
+              createdAt
+              path
+              line
+            }
+          }
+        }
+      }
+    }
+  }
+}'
+
+fetch_graphql_with_retry() {
+  local query="$1"
+  shift
+
+  local attempt
+  local backoff
+  backoff="$BACKOFF_SECS"
+
+  for ((attempt = 1; attempt <= MAX_ATTEMPTS; attempt++)); do
+    if gh api graphql \
+      -f query="$query" \
+      -F owner="$OWNER" \
+      -F repo="$REPO" \
+      -F pr="$PR_NUMBER" \
+      "$@"; then
+      return 0
+    fi
+
+    if [ "$attempt" -eq "$MAX_ATTEMPTS" ]; then
+      echo "❌ GraphQL query failed after ${MAX_ATTEMPTS} attempts" >&2
+      return 1
+    fi
+
+    echo "⚠️ GraphQL query failed (attempt ${attempt}/${MAX_ATTEMPTS}); retrying in ${backoff}s..." >&2
+    sleep "$backoff"
+    backoff=$((backoff * 2))
+  done
+}
+
+COMMENTS_CURSOR=""
+ALL_COMMENTS='[]'
+
+while true; do
+  if [ -n "$COMMENTS_CURSOR" ]; then
+    COMMENTS_RESULT=$(fetch_graphql_with_retry "$COMMENTS_QUERY" -F cursor="$COMMENTS_CURSOR")
+  else
+    COMMENTS_RESULT=$(fetch_graphql_with_retry "$COMMENTS_QUERY")
+  fi
+
+  if [ "$(echo "$COMMENTS_RESULT" | jq -r '.data.repository.pullRequest == null')" = "true" ]; then
+    echo "❌ PR #${PR_NUMBER} does not exist in ${OWNER}/${REPO}."
+    exit 1
+  fi
+
+  PAGE_COMMENTS=$(echo "$COMMENTS_RESULT" | jq '.data.repository.pullRequest.comments.nodes')
+  ALL_COMMENTS=$(jq -cn --argjson all "$ALL_COMMENTS" --argjson page "$PAGE_COMMENTS" '$all + $page')
+
+  HAS_NEXT=$(echo "$COMMENTS_RESULT" | jq -r '.data.repository.pullRequest.comments.pageInfo.hasNextPage')
+  if [ "$HAS_NEXT" != "true" ]; then
+    break
+  fi
+
+  COMMENTS_CURSOR=$(echo "$COMMENTS_RESULT" | jq -r '.data.repository.pullRequest.comments.pageInfo.endCursor')
+  if [ -z "$COMMENTS_CURSOR" ] || [ "$COMMENTS_CURSOR" = "null" ]; then
+    echo "❌ Assertion failed: comments pagination cursor missing while hasNextPage=true"
+    exit 1
+  fi
+done
+
+THREADS_CURSOR=""
+ALL_THREADS='[]'
+
+while true; do
+  if [ -n "$THREADS_CURSOR" ]; then
+    THREADS_RESULT=$(fetch_graphql_with_retry "$THREADS_QUERY" -F cursor="$THREADS_CURSOR")
+  else
+    THREADS_RESULT=$(fetch_graphql_with_retry "$THREADS_QUERY")
+  fi
+
+  if [ "$(echo "$THREADS_RESULT" | jq -r '.data.repository.pullRequest == null')" = "true" ]; then
+    echo "❌ PR #${PR_NUMBER} does not exist in ${OWNER}/${REPO}."
+    exit 1
+  fi
+
+  PAGE_THREADS=$(echo "$THREADS_RESULT" | jq '.data.repository.pullRequest.reviewThreads.nodes')
+  ALL_THREADS=$(jq -cn --argjson all "$ALL_THREADS" --argjson page "$PAGE_THREADS" '$all + $page')
+
+  HAS_NEXT=$(echo "$THREADS_RESULT" | jq -r '.data.repository.pullRequest.reviewThreads.pageInfo.hasNextPage')
+  if [ "$HAS_NEXT" != "true" ]; then
+    break
+  fi
+
+  THREADS_CURSOR=$(echo "$THREADS_RESULT" | jq -r '.data.repository.pullRequest.reviewThreads.pageInfo.endCursor')
+  if [ -z "$THREADS_CURSOR" ] || [ "$THREADS_CURSOR" = "null" ]; then
+    echo "❌ Assertion failed: review thread pagination cursor missing while hasNextPage=true"
+    exit 1
+  fi
+done
+
+# Filter regular comments from bot that aren't minimized, excluding:
+# - "Didn't find any major issues" (no issues found)
+# - "usage limits have been reached" (rate limit error, not a real review)
+REGULAR_COMMENTS=$(echo "$ALL_COMMENTS" | jq "[.[] | select(.author.login == \"${BOT_LOGIN_GRAPHQL}\" and .isMinimized == false and (.body | test(\"Didn't find any major issues|usage limits have been reached\") | not))]")
+REGULAR_COUNT=$(echo "$REGULAR_COMMENTS" | jq 'length')
+
+# Filter unresolved review threads from bot
+UNRESOLVED_THREADS=$(echo "$ALL_THREADS" | jq "[.[] | select(.isResolved == false and .comments.nodes[0].author.login == \"${BOT_LOGIN_GRAPHQL}\")]")
+UNRESOLVED_COUNT=$(echo "$UNRESOLVED_THREADS" | jq 'length')
+
+TOTAL_UNRESOLVED=$((REGULAR_COUNT + UNRESOLVED_COUNT))
+
+echo "Found ${REGULAR_COUNT} unminimized regular comment(s) from bot"
+echo "Found ${UNRESOLVED_COUNT} unresolved review thread(s) from bot"
+
+if [ "$TOTAL_UNRESOLVED" -gt 0 ]; then
+  echo ""
+  echo "❌ Found ${TOTAL_UNRESOLVED} unresolved comment(s) from Codex in PR #${PR_NUMBER}"
+  echo ""
+  echo "Codex comments:"
+
+  if [ "$REGULAR_COUNT" -gt 0 ]; then
+    echo "$REGULAR_COMMENTS" | jq -r '.[] | " - [\(.createdAt)]\n\(.body)\n"'
+  fi
+
+  if [ "$UNRESOLVED_COUNT" -gt 0 ]; then
+    echo "$UNRESOLVED_THREADS" | jq -r '.[] | " - [\(.comments.nodes[0].createdAt)] thread=\(.id) \(.comments.nodes[0].path // "comment"):\(.comments.nodes[0].line // "")\n\(.comments.nodes[0].body)\n"'
+    echo ""
+    echo "Resolve review threads with: ./scripts/resolve_pr_comment.sh <thread_id>"
+  fi
+
+  echo ""
+  echo "Please address or resolve all Codex comments before merging."
+  exit 1
+fi
+
+echo "✅ No unresolved Codex comments found"
+exit 0
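
The retry logic in `fetch_graphql_with_retry` is tied to `gh api graphql`, but the same pattern works for any command. Below is a minimal generic sketch, assuming only bash-compatible shell features; `retry_with_backoff` and its argument order are illustrative and not part of the commit. Diagnostics are written to stderr so that a caller capturing stdout with `$(...)` receives only the command's own output.

```bash
#!/usr/bin/env bash
# Hypothetical generic form of the retry loop (not part of this commit).
#   $1     maximum number of attempts
#   $2     initial delay in seconds, doubled after each failed attempt
#   $3...  the command to run, with its arguments
retry_with_backoff() {
  local max_attempts="$1"
  local backoff="$2"
  shift 2
  local attempt=1

  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi

    if [ "$attempt" -eq "$max_attempts" ]; then
      echo "command failed after ${max_attempts} attempts" >&2
      return 1
    fi

    echo "attempt ${attempt}/${max_attempts} failed; retrying in ${backoff}s..." >&2
    sleep "$backoff"
    backoff=$((backoff * 2))
    attempt=$((attempt + 1))
  done
}
```

With the script's settings (`MAX_ATTEMPTS=5`, `BACKOFF_SECS=2`) the delays between attempts would be 2, 4, 8, and 16 seconds, so a persistently failing call gives up after about 30 seconds of waiting plus the time spent in the calls themselves.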
