Skip to content

Commit 0a9e059

Browse files
igerberclaude
andcommitted
Revert codex preflight to notice-only; drop content/prompt/abort spiral
The codex secret-preflight grew across 6 review rounds (R3-R8) into a non-trivial security gate: filename scan + content secret-regex scan + prompt-artifact scan + rename/copy parser + skip-prefix exemptions + abort-by-default + --allow-secrets override. Each round closed the prior reviewer finding by adding scan surface. The cumulative result was security theater that: - Required --allow-secrets in any repo containing test fixtures of secret-detection regex (literal AKIA/ghp_/sk- patterns are how you test detection logic). The diff-diff repo itself triggered. - Trained users to always pass --allow-secrets, defeating the gate. - Did not actually add safety beyond what direct `codex` use already has (any `codex login` user already has the same surface). The R9 ⛔ Blocker explicitly demanded removing the only false-positive mitigation (skip prefixes), which would have made every codex invocation in this repo abort on its own test fixtures. Reverting to a sensible level. Codex's read-only-but-broad surface is intrinsic to using it as an agentic reviewer; real secret prevention belongs at gitignore + code review, not at codex invocation. Kept: - Recursive filename scan (`_scan_sensitive_files`) for obvious secret-bearing patterns: .env, id_rsa, *.pem, *.key, secrets.{yml,yaml,json}, .npmrc, .pypirc, .netrc, etc. (case-insensitive; safe template variants like .env.example excluded; heavy dirs skipped via _SCAN_SKIP_DIRS) - Notice-only stderr block (`_print_sensitive_notice`) printed before invoking codex if any matches exist; codex still runs - Auth.json check for explicit --backend codex (R5 fix retained — real bug if user isn't logged in) - BrokenPipe handling in call_codex (R6 fix retained — real bug) Removed: - Content secret-regex scan (false-positive spiral source) - Prompt-artifact scan (diff body / prev review / changed-files content scanning) - Skip-prefix exemptions (no longer needed without content scan) - Abort-by-default behavior + sys.exit(1) on findings - --allow-secrets argparse flag (nothing aborts → no override needed) - Recursive _list_files_for_scan helper (folded into simpler _scan_sensitive_files) - SECRET_CONTENT_PATTERN, _SCAN_MAX_FILE_BYTES, _SCAN_CONTENT_SUFFIXES, _SCAN_SKIP_CONTENT_PREFIXES constants Net change: -797 test lines, -200 script lines, doc simplified. Tests: 211 pass (kept the 11 informative-scan tests; dropped 5 test classes that exercised the removed enforcement layers). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 842e72f commit 0a9e059

3 files changed

Lines changed: 137 additions & 1141 deletions

File tree

.claude/commands/ai-review-local.md

Lines changed: 21 additions & 61 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
---
22
description: Run AI code review locally using Codex CLI or OpenAI API before opening a PR
3-
argument-hint: "[--backend auto|codex|api] [--allow-secrets] [--context minimal|standard|deep] [--include-files <files>] [--token-budget <n>] [--force-fresh] [--full-registry] [--model <model>] [--timeout <seconds>] [--dry-run]"
3+
argument-hint: "[--backend auto|codex|api] [--context minimal|standard|deep] [--include-files <files>] [--token-budget <n>] [--force-fresh] [--full-registry] [--model <model>] [--timeout <seconds>] [--dry-run]"
44
---
55

66
# Local AI Code Review
@@ -36,62 +36,23 @@ Notes:
3636
cleaned up automatically.
3737
- `--context` and `--token-budget` are ignored under the codex backend (Codex
3838
chooses what to load on its own); the script warns if you pass them.
39-
- **Surface area + abort-by-default**: under the codex backend, Codex can read
40-
any file under the repo root via `--cd`, AND consumes the compiled prompt
41-
(criteria + diff + previous review + changed-files list) via stdin. Before
42-
invoking codex, the script runs a preflight scan with TWO layers:
43-
44-
1. **Filename patterns**: `.env`, `.env.local`, `id_rsa`, `*.pem`, `*.key`,
45-
`secrets.{yml,yaml,json}`, `.netrc`, `.npmrc`, `.pypirc`, etc. Safe
46-
template variants (`.env.example`, `.env.sample`, `.env.template`)
47-
excluded.
48-
2. **Content secret-regex**: same canonical pattern used for the api
49-
backend's pre-upload scan (Step 3b) — catches AWS keys (`AKIA…`),
50-
GitHub tokens (`ghp_…`, `gho_…`), OpenAI keys (`sk-…`), `api_key=`,
51-
`secret_key=`, `password=`, `token=`, bearer tokens, and
52-
`PRIVATE_KEY` strings — applied recursively to repo files. Catches
53-
secrets stored under innocuous filenames like `notes.txt`. Limited
54-
to common text suffixes (`.py`, `.js`, `.yml`, `.json`, `.env`, `.md`,
55-
etc.); files >1MB skipped (likely binaries/generated assets). Common
56-
false-positive directories (`tests/`, `.github/`, `docs/`, `examples/`,
57-
`fixtures/`) are skipped for content scan only — those locations
58-
legitimately contain literal pattern matches as test fixtures, regex
59-
definitions, or doc examples. Filename scan still applies to those
60-
dirs (so a real `.env` checked into `tests/` would still be caught).
61-
62-
Heavy dirs (`.venv`, `node_modules`, `__pycache__`, etc.) are skipped from
63-
the walk to avoid vendored test fixtures. The `.claude/` subtree is NOT
64-
skipped wholesale — `.claude/settings.local.json` (gitignored, common
65-
place for OAuth tokens) is content-scanned. Only `.claude/reviews/` and
66-
`.claude/paper-review/` (agent-generated review artifacts that may quote
67-
literal pattern strings) are excluded from the content scan; their
68-
filenames are still scanned. Both scans include gitignored files —
69-
gitignored `.env` files are exactly what we want to catch.
70-
71-
Layer 2 — **prompt-artifact scan**: applies the same content secret-regex
72-
to the diff body, the optional delta-diff body, and the previous-review
73-
text (re-review mode), and applies the filename pattern set to entries
74-
in the changed-files list / delta-changed-files list. Catches secrets
75-
that go to Codex via stdin even when the repo on disk is clean — for
76-
example a reverted commit deleting a `.env` (the diff body still
77-
contains the secret), or a deleted/renamed sensitive file that no longer
78-
exists but appears in the file-status list.
79-
80-
All matching is case-insensitive (Linux + CI are case-sensitive
81-
filesystems where `.ENV` would otherwise slip past `.env`).
82-
83-
If EITHER layer finds matches, codex is **NOT invoked** and the script
84-
exits 1 unless you pass `--allow-secrets` to acknowledge the surface.
85-
This is enforcement, not just a warning.
39+
- **Surface area (informational, non-blocking)**: under the codex backend,
40+
Codex reads any file under the repo root via `--cd`. This is intrinsic to
41+
using `codex` as an agentic reviewer — the same surface anyone running
42+
`codex` directly already accepts via `codex login`. Before invoking codex
43+
the script does a quick recursive filename scan for obvious secret-bearing
44+
patterns (`.env`, `.env.local`, `id_rsa`, `*.pem`, `*.key`,
45+
`secrets.{yml,yaml,json}`, `.netrc`, `.npmrc`, `.pypirc`, etc.; safe
46+
template variants like `.env.example` excluded; case-insensitive). If any
47+
matches exist, a stderr notice lists them and codex still runs. CTRL-C and
48+
switch to `--backend api` (or sanitize the worktree) if you don't want
49+
Codex to see those files. This is a notice, not a gate — real secret
50+
prevention belongs at gitignore + code review, not at codex invocation.
8651

8752
## Arguments
8853

8954
`$ARGUMENTS` may contain optional flags:
9055
- `--backend {auto,codex,api}`: Reviewer backend (default: `auto`). See above.
91-
- `--allow-secrets`: Codex backend only. By default, the script aborts before
92-
invoking codex if it detects sensitive files anywhere under the repo root
93-
(recursive filename + content-regex scan including gitignored files). Pass
94-
this flag to acknowledge and proceed.
9556
- `--context {minimal,standard,deep}`: Context depth (default: `standard`).
9657
*Api backend only.*
9758
- `minimal`: Diff only (original behavior)
@@ -134,11 +95,11 @@ Step 5 invokes the chosen backend:
13495
- **api backend**: single external HTTP call to OpenAI Responses API. Step 3b/3c
13596
run the canonical pre-upload secret scan before any data is sent.
13697
- **codex backend**: spawns `codex exec` as a subprocess, which talks to
137-
OpenAI iteratively under a read-only sandbox. The script runs its own
138-
recursive sensitive-file + content-secret preflight before invoking codex
139-
(aborts unless `--allow-secrets` is passed); the api-backend's Step 3b/3c
140-
scans don't apply because Codex's read surface is the whole repo, not just
141-
the diff.
98+
OpenAI iteratively under a read-only sandbox. The script prints a stderr
99+
notice if obvious sensitive-filename patterns are present in the repo
100+
(informational; codex still runs). The api-backend's Step 3b/3c scans
101+
don't apply Codex's read surface is the whole repo, intrinsic to using
102+
it as an agentic reviewer.
142103

143104
## Instructions
144105

@@ -442,7 +403,6 @@ python3 .claude/scripts/openai_review.py \
442403
[--include-files "$include_files"] \
443404
[--token-budget "$token_budget"] \
444405
[--full-registry] \
445-
[--allow-secrets] \
446406
[--model <model>] \
447407
[--timeout <seconds>] \
448408
[--dry-run]
@@ -637,9 +597,9 @@ runs `--force-fresh` or when a rebase invalidates the tracked commit.
637597
context (if present) to OpenAI via the Responses API.
638598
- **codex backend**: the compiled prompt (criteria + diff + previous review)
639599
is piped to `codex exec`'s stdin, and Codex itself reads additional repo
640-
files agentically (read-only sandbox) and talks to OpenAI iteratively. The
641-
secret preflight (filename + content scan) gates this path; see "Surface
642-
area + abort-by-default" above.
600+
files agentically (read-only sandbox) and talks to OpenAI iteratively. A
601+
one-off stderr notice surfaces obvious sensitive-filename matches before
602+
invoking codex (see "Surface area" above) — informational only.
643603

644604
Use `--dry-run` to preview the compiled prompt without invoking either backend.
645605
- This skill pairs naturally with the iterative workflow:

0 commit comments

Comments
 (0)