Skip to content

feat: add single-task repeat mode and code cleanup#166

Open
KLIEBHAN wants to merge 20 commits intomichaelshimeles:mainfrom
KLIEBHAN:main
Open

feat: add single-task repeat mode and code cleanup#166
KLIEBHAN wants to merge 20 commits intomichaelshimeles:mainfrom
KLIEBHAN:main

Conversation

@KLIEBHAN
Copy link
Copy Markdown

@KLIEBHAN KLIEBHAN commented Mar 9, 2026

Summary

  • Single-task repeat mode: New --repeat N and --continue-on-failure flags allow repeating a single task multiple times with configurable failure behavior
  • Bash CLI parity: Extended ralphy.sh to support the same features as the npm CLI
  • Code cleanup: Deduplicated error handling in task runner, simplified args parsing, added resolveBrowserEnabled helper
  • Examples & docs: Added PRD.example.md, tasks.example.yaml, and updated README with repeat mode usage

Changes

  • cli/src/cli/commands/single-task-loop.ts — New repeat loop orchestrator with fail-fast and continue-on-failure modes
  • cli/src/cli/args.ts--repeat and --continue-on-failure options with validation
  • cli/src/cli/commands/task.ts — Extracted failWith() helper to deduplicate error handling
  • cli/src/config/types.ts — Added repeatCount and continueOnFailure to RuntimeOptions
  • cli/src/index.ts — Integrated single-task loop with summary notifications
  • cli/src/engines/codex.ts — Minor adjustment for repeat mode compatibility
  • ralphy.sh — Extended bash CLI with repeat mode and other missing features
  • Docs and example files updated

Test plan

  • All 127 existing tests pass (bun test)
  • New tests for runSingleTaskLoop (fail-fast, continue-on-failure, fatal error handling)
  • New tests for parseArgs repeat option validation
  • New test for buildActiveSettings repeat display
  • Biome lint clean on changed files

🤖 Generated with Claude Code

Note

Add single-task repeat mode with --repeat and --continue-on-failure flags

  • Adds --repeat <n> (1–10000) and --continue-on-failure CLI flags for single-task mode in both the Node CLI (args.ts) and shell script (ralphy.sh).
  • Introduces runSingleTaskLoop to execute a task N times, stopping on fatal errors or first failure unless --continue-on-failure is set, and logging a summary when repeatCount > 1.
  • runTask in task.ts now returns a structured result instead of calling process.exit, and skips desktop/webhook notifications when in repeat mode (delegating aggregated notifications to index.ts).
  • Removes 'auto' browser mode — browserEnabled is now strictly 'true' | 'false' with a default of 'false' across all execution paths.
  • Behavioral Change: browserEnabled default changes from 'auto' to 'false', so browser automation is disabled unless explicitly requested with --browser.

Macroscope summarized 66ceb9f.

KLIEBHAN and others added 5 commits March 9, 2026 07:39
…lure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract failWith() to eliminate duplicated error handling in task.ts
- Extract resolveBrowserEnabled() to replace nested ternary in args.ts
- Extract taskSourceFlags and hasRepeatOptions to reduce repetition
- Preserve --sonnet priority guard over other engine flags

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown

vercel bot commented Mar 9, 2026

@KLIEBHAN is attempting to deploy a commit to the Goshen Labs Team on Vercel.

A member of the Team first needs to authorize it.

@dosubot
Copy link
Copy Markdown

dosubot bot commented Mar 9, 2026

Related Documentation

1 document(s) may need updating based on files changed in this PR:

Goshen Labs's Space

Copilot AI Engine Integration
View Suggested Changes
@@ -155,6 +155,14 @@
 - Windows improvements: better error handling for .cmd wrappers
 
 # Changelog v4.5.1 Highlights
+### v4.7.2
+- **Single-task repeat mode**: Added `--repeat <n>` flag to repeat a single task N times, with `--continue-on-failure` flag for continuing after non-fatal task failures. Fatal errors (auth, config, CLI issues) abort immediately with fail-fast default behavior.
+  - Example usage: `ralphy --repeat 3 "find and fix bugs"`
+  - Example usage: `ralphy --repeat 5 --continue-on-failure "harden edge cases"`
+- **Bash CLI parity**: Extended `ralphy.sh` to support the same features as the npm CLI, including all new repeat mode flags.
+- **Code cleanup**: Deduplicated error handling in task runner, simplified args parsing, added `resolveBrowserEnabled` helper function.
+- **Examples & docs**: Added `PRD.example.md` and `tasks.example.yaml` template files, updated README with repeat mode usage examples.
+
 ### v4.7.1
 - **Copilot engine improvements**: non-interactive mode (`--yolo`), conservative error detection for authentication/rate-limit errors, token usage parsing, temp file-based prompts for markdown preservation
 - **Fixed infinite retry loop**: tasks now properly abort on fatal configuration/authentication errors

[Accept] [Decline]

Note: You must be authenticated to accept/decline updates.

How did I do? Any feedback?  Join Discord

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 9, 2026

Greptile Summary

This PR introduces a well-scoped --repeat N / --continue-on-failure feature for single-task (brownfield) mode, refactors runTask to return a structured TaskRunResult instead of calling process.exit, and extends ralphy.sh with matching functionality. The TypeScript side is architecturally clean: runSingleTaskLoop is properly dependency-injectable for testing, failWith() cleanly deduplicates error handling, and resolveBrowserEnabled correctly removes the ambiguous 'auto' state. The bash additions follow the same overall design, though several parity gaps relative to the TypeScript CLI remain.

Key observations:

  • The SingleTaskLoopResult interface is missing a skipped field despite the skipped count being computed inside the function and re-derived independently in two call sites in index.ts. Adding it to the interface would eliminate the duplication.
  • logInfo("Running task with …") and the browser-enabled banner fire unconditionally on every runTask call. In repeat mode this produces N identical log lines — the existing [i/total] counter already provides per-iteration context, making the inner banners redundant noise.
  • The ralphy.sh bash port has several accumulated parity gaps that have been flagged in prior review rounds: BROWSER_ENABLED still defaults to "auto" while the TS CLI now defaults to "false"; the is_fatal_error_output() helper scans the full AI output rather than a bounded error field; the run_engine inner function leaks into the global bash namespace; full AI output is tee'd to the terminal on every repeat iteration; and the bash summary omits the skipped-iteration count shown by the npm CLI.
  • The webhook sendNotifications call is now aggregated once after the loop rather than per-iteration — a behavioral change for repeatCount > 1 that should be documented for webhook consumers (e.g. CI dashboards, Slack integrations).
  • The --continue-on-failure warning fires before the subsequent error throw when no task argument is provided, giving users a misleading "this flag is silently accepted" signal when the real problem is the missing task.

Confidence Score: 3/5

  • Mergeable with caveats — the TypeScript core is correct and well-tested, but the bash CLI has multiple parity gaps and the webhook behavior change is undocumented
  • The repeat loop logic, fail-fast/continue-on-failure semantics, and the TaskRunResult refactor are sound and backed by tests. However, ralphy.sh accumulates several unresolved issues (browser default mismatch, false-positive fatal detection, global namespace leak, missing skipped count in summary, full output tee in repeat mode) that were flagged in prior review rounds and remain. The webhook-per-batch change is a silent breaking change for existing integrations. These are not blockers for correct single-task npm CLI usage, but the bash script is meaningfully less reliable in repeat mode than the TypeScript counterpart.
  • ralphy.sh requires the most attention due to accumulated parity gaps with the TypeScript CLI; cli/src/cli/commands/task.ts warrants a second look for per-iteration engine availability overhead and log noise in high-repeat scenarios

Important Files Changed

Filename Overview
cli/src/cli/commands/single-task-loop.ts New repeat orchestrator; dry-run returns total:N with completed:0 (inflated count), skipped field missing from interface (callers recompute manually), and dryRun still calls runTaskFn which triggers engine availability check
cli/src/cli/args.ts Added --repeat/--continue-on-failure with validation; warning fires before subsequent error throw when no task is given; --repeat/--prd guard order previously discussed; regex correctly rejects scientific notation
cli/src/cli/commands/task.ts Extracted failWith() helper; createEngine/isEngineAvailable called on every repeat iteration; notifySingleTask suppresses per-task notifications for any repeatCount > 1 including --repeat 2; logInfo emitted per-iteration creating noisy output in repeat mode
cli/src/index.ts Integrates single-task loop; webhook now fires once per batch (breaking change for repeat mode); skipped count correctly computed and shown in desktop notification
ralphy.sh Extended with repeat mode; BROWSER_ENABLED still defaults to 'auto' while TS CLI defaults to 'false'; is_fatal_error_output scans full AI output (false-positive risk); run_engine inner function leaks to global scope; bash summary omits skipped count; --model not forwarded in codex conflict-resolution path; full AI output streamed to terminal per iteration
cli/src/cli/commands/single-task-loop.test.ts Covers fail-fast, continue-on-failure, fatal error, and dry-run paths; missing test asserting dry-run calls runTaskFn exactly once and that console.warn fires for --continue-on-failure without --repeat

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["CLI: ralphy --repeat N task"] --> B["parseArgs()\nValidate --repeat, --continue-on-failure"]
    B --> C["runSingleTaskLoop(task, options)"]
    C --> D{dryRun?}
    D -->|yes| E["runTask once\n(show prompt only)"]
    E --> F["Return {total:N, completed:0, failed:0}"]
    D -->|no| G["for i=1..N"]
    G --> H{total > 1?}
    H -->|yes| I["logInfo [i/N] Executing: task"]
    H -->|no| J
    I --> J["runTask(task, options)"]
    J --> K["checkEngineAvailable\nbuildPrompt\nwithRetry(engine.execute)"]
    K --> L{result.success?}
    L -->|yes| M["completed++\ncontinue loop"]
    M --> G
    L -->|no| N["failed++\nfailWith(error)"]
    N --> O{fatal OR\n!continueOnFailure?}
    O -->|yes| P["break loop"]
    O -->|no| Q["continue loop"]
    Q --> G
    P --> R["Log summary if total>1"]
    G -->|loop done| R
    R --> S["Return {total, completed, failed}"]
    S --> T["index.ts: sendNotifications\n(webhook, once)"]
    T --> U{repeatCount > 1?}
    U -->|yes| V["notify() aggregate\ndesktop notification"]
    U -->|no| W["notifyTaskComplete/Failed\nalready called inside runTask"]
    V --> X{failed > 0?}
    W --> X
    X -->|yes| Y["process.exitCode = 1"]
    X -->|no| Z["exit 0"]
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: cli/src/cli/commands/single-task-loop.ts
Line: 8-11

Comment:
**`skipped` missing from `SingleTaskLoopResult` interface**

The `skipped` count is computed inside the function body (line 52: `const skipped = total - completed - failed`) and also re-computed by callers — `index.ts` repeats `result.total - result.completed - result.failed` twice. Storing the formula in two independent places is fragile: if the semantics of `skipped` ever change, both sites need updating.

Consider adding `skipped` directly to the interface so callers can use it without re-deriving it:

```suggestion
export interface SingleTaskLoopResult {
	total: number;
	completed: number;
	failed: number;
	skipped: number;
}
```

Then populate it in the return statement (`skipped: total - completed - failed`) and remove the duplicate computation in `index.ts`.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: cli/src/cli/commands/task.ts
Line: 25-27

Comment:
**Per-iteration log noise in repeat mode**

`logInfo("Running task with ...")` and the browser-enabled log fire unconditionally on every call to `runTask`. For `--repeat 100`, the user sees these lines 100 times, providing no additional value after the first iteration — especially since `single-task-loop.ts` already prints an `[i/total]` counter before each call.

Consider guarding these messages to only the first iteration:

```suggestion
	if (options.repeatCount === 1) {
		logInfo(`Running task with ${engine.name}...`);

		if (isBrowserAvailable(options.browserEnabled)) {
			logInfo("Browser automation enabled (agent-browser)");
		}
	}
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 66ceb9f

Comment on lines 26 to 27
const engine = createEngine(options.aiEngine as AIEngineName);
const available = await isEngineAvailable(options.aiEngine as AIEngineName);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createEngine() and isEngineAvailable() are both invoked at the top of runTask() every time it is called. When --repeat N is used with a large N, the repeat loop in runSingleTaskLoop() calls runTask() N times, incurring N redundant availability checks after the first one confirms the engine is present.

Consider hoisting the availability check to runSingleTaskLoop's caller (index.ts) or caching the result, so the engine is validated only once rather than on every iteration.

// Current (lines 26–27): re-checked on every call
const engine = createEngine(options.aiEngine as AIEngineName);
const available = await isEngineAvailable(options.aiEngine as AIEngineName);
Prompt To Fix With AI
This is a comment left during a code review.
Path: cli/src/cli/commands/task.ts
Line: 26-27

Comment:
`createEngine()` and `isEngineAvailable()` are both invoked at the top of `runTask()` every time it is called. When `--repeat N` is used with a large N, the repeat loop in `runSingleTaskLoop()` calls `runTask()` N times, incurring N redundant availability checks after the first one confirms the engine is present.

Consider hoisting the availability check to `runSingleTaskLoop`'s caller (`index.ts`) or caching the result, so the engine is validated only once rather than on every iteration.

```ts
// Current (lines 26–27): re-checked on every call
const engine = createEngine(options.aiEngine as AIEngineName);
const available = await isEngineAvailable(options.aiEngine as AIEngineName);
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +1058 to +1060
json)
if [[ ! -f "$PRD_FILE" ]]; then
log_error "$PRD_FILE not found in current directory"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium ralphy.sh:1058

In the json case of check_requirements(), jq is invoked at line 1065 to validate the JSON file structure before the global jq availability check at line 1156. If jq is not installed, the command fails silently and the user receives the misleading error "Invalid JSON task file: top-level 'tasks' array is required" instead of being told that jq is not installed. Consider adding a jq availability check in the json case before the validation, similar to how the yaml case checks for yq at line 1053.

    json)
+      if ! command -v jq &>/dev/null; then
+        log_error "jq is required for JSON parsing. On Linux, install with: apt-get install jq (Debian/Ubuntu) or yum install jq (RHEL/CentOS)"
+        exit 1
+      fi
       if [[ ! -f "$PRD_FILE" ]]; then
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file ralphy.sh around lines 1058-1060:

In the `json` case of `check_requirements()`, `jq` is invoked at line 1065 to validate the JSON file structure before the global `jq` availability check at line 1156. If `jq` is not installed, the command fails silently and the user receives the misleading error "Invalid JSON task file: top-level 'tasks' array is required" instead of being told that `jq` is not installed. Consider adding a `jq` availability check in the `json` case before the validation, similar to how the `yaml` case checks for `yq` at line 1053.

- codex.ts: document --dangerously-bypass-approvals-and-sandbox flag origin
- args.ts: warn when --continue-on-failure is used without --repeat
- ralphy.sh: limit fatal error scan to last 20 lines to prevent false positives
- ralphy.sh: check jq availability before JSON validation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
KLIEBHAN and others added 2 commits March 10, 2026 03:55
- args.ts: cap --repeat at 10000 to prevent near-infinite loops (e.g. --repeat 1e100)
- args.test.ts: update error message expectations for new bound
- ralphy.sh: warn when --continue-on-failure is used without --repeat (matches TS CLI)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ralphy.sh: reject --repeat > 10000 (matches TS CLI validation)
- args.test.ts: add test for --repeat 10001 boundary case

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- args.ts: only warn about --continue-on-failure no-op when task is present
- ralphy.sh: forward --model override to codex engine in brownfield mode
- ralphy.sh: suppress misleading warning when task argument is missing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment on lines 26 to 27
const engine = createEngine(options.aiEngine as AIEngineName);
const available = await isEngineAvailable(options.aiEngine as AIEngineName);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When --repeat N is used with large N, createEngine() and isEngineAvailable() are invoked N times (once per iteration), even though the engine availability only needs to be checked once.

Consider hoisting the availability check to the caller in index.ts or caching the result in runSingleTaskLoop, so the engine is validated only once rather than on every iteration.

Prompt To Fix With AI
This is a comment left during a code review.
Path: cli/src/cli/commands/task.ts
Line: 26-27

Comment:
When `--repeat N` is used with large N, `createEngine()` and `isEngineAvailable()` are invoked N times (once per iteration), even though the engine availability only needs to be checked once.

Consider hoisting the availability check to the caller in `index.ts` or caching the result in `runSingleTaskLoop`, so the engine is validated only once rather than on every iteration.

How can I resolve this? If you propose a fix, please make it concise.

When the loop exits early (fail-fast or fatal), the summary now shows
"Done: 0 succeeded, 1 failed, 9 skipped of 10" instead of hiding
skipped iterations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Aligns bash CLI output with TypeScript version when loop exits early.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- args.test.ts: test that --continue-on-failure warns without --repeat
- ralphy.sh: add --model override to parallel, worktree, and conflict-resolution codex calls

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment on lines 35 to 38
logInfo(`Running task with ${engine.name}...`);

// Check browser availability
if (isBrowserAvailable(options.browserEnabled)) {
logInfo("Browser automation enabled (agent-browser)");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repetitive per-iteration log noise in repeat mode

logInfo("Running task with...") (lines 35–38) is emitted inside runTask, which is called once per iteration in the repeat loop. For --repeat 100, the user sees these messages printed 100 times with no added value after the first iteration, especially since single-task-loop.ts already prints [i/total] iteration counters.

Consider guarding these informational messages to the first iteration only, e.g. by checking if this is the first run:

if (options.repeatCount === 1) {
  logInfo(`Running task with ${engine.name}...`);

  if (isBrowserAvailable(options.browserEnabled)) {
    logInfo("Browser automation enabled (agent-browser)");
  }
}

This keeps the initial feedback intact without flooding repeat-mode output.

Prompt To Fix With AI
This is a comment left during a code review.
Path: cli/src/cli/commands/task.ts
Line: 35-38

Comment:
Repetitive per-iteration log noise in repeat mode

`logInfo("Running task with...")` (lines 35–38) is emitted inside `runTask`, which is called once per iteration in the repeat loop. For `--repeat 100`, the user sees these messages printed 100 times with no added value after the first iteration, especially since `single-task-loop.ts` already prints `[i/total]` iteration counters.

Consider guarding these informational messages to the first iteration only, e.g. by checking if this is the first run:

```ts
if (options.repeatCount === 1) {
  logInfo(`Running task with ${engine.name}...`);

  if (isBrowserAvailable(options.browserEnabled)) {
    logInfo("Browser automation enabled (agent-browser)");
  }
}
```

This keeps the initial feedback intact without flooding repeat-mode output.

How can I resolve this? If you propose a fix, please make it concise.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Prevents printing the prompt N times and misleading "N succeeded" summary
when --dry-run is combined with --repeat.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Prevents run_brownfield_task from invoking the AI engine when --dry-run
is set, matching the TypeScript task.ts behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment on lines +31 to +34
if (options.dryRun) {
await runTaskFn(task, options);
return { total, completed: 0, failed: 0 };
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dry-run result returns total: N with completed: 0

When dryRun is true and repeatCount is e.g. 5, the returned object { total: 5, completed: 0, failed: 0 } accurately conveys that nothing was executed (good), but total: 5 implies 5 runs were attempted while only 1 prompt was shown. Callers in index.ts are guarded by !options.dryRun, so there's no runtime bug today, but a future caller that uses result.total for stats (e.g. a progress display or test assertion) would see an inflated count.

Consider returning total: 1 to represent the single dry-run invocation:

Suggested change
if (options.dryRun) {
await runTaskFn(task, options);
return { total, completed: 0, failed: 0 };
}
if (options.dryRun) {
await runTaskFn(task, options);
return { total: 1, completed: 0, failed: 0 };
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: cli/src/cli/commands/single-task-loop.ts
Line: 31-34

Comment:
**Dry-run result returns `total: N` with `completed: 0`**

When `dryRun` is true and `repeatCount` is e.g. 5, the returned object `{ total: 5, completed: 0, failed: 0 }` accurately conveys that nothing was executed (good), but `total: 5` implies 5 runs were attempted while only 1 prompt was shown. Callers in `index.ts` are guarded by `!options.dryRun`, so there's no runtime bug today, but a future caller that uses `result.total` for stats (e.g. a progress display or test assertion) would see an inflated count.

Consider returning `total: 1` to represent the single dry-run invocation:

```suggestion
	if (options.dryRun) {
		await runTaskFn(task, options);
		return { total: 1, completed: 0, failed: 0 };
	}
```

How can I resolve this? If you propose a fix, please make it concise.

- args.ts: use regex /^[1-9][0-9]*$/ to reject inputs like "1e4"
  (matches bash is_positive_integer behavior)
- single-task-loop.test.ts: add test for dry-run early-return path

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment on lines +38 to +52

for (let i = 1; i <= total; i++) {
if (total > 1) {
logInfoFn(`[${i}/${total}] Executing: ${task}`);
}

const result = await runTaskFn(task, options);
if (result.success) {
completed++;
continue;
}

failed++;
if (result.fatal || !options.continueOnFailure) {
break;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loop counter printed even on i = 1 when total = 1 is impossible but guard allows i = 1, total > 1

When total = 1, no counter is printed (guarded by total > 1) — this is correct. However, notice that the logInfoFn call at line 40 prints [i/total] before runTaskFn is called. If runTaskFn itself also logs a banner (e.g. "Running task with claude…" in task.ts line 35), the output for --repeat 3 looks like:

[1/3] Executing: find and fix bugs
Running task with claude...       ← emitted inside runTask
...
[2/3] Executing: find and fix bugs
Running task with claude...

The inner logInfo in task.ts fires unconditionally for every iteration. Combined with the iteration counter here, this creates duplicate/noisy output for repeat runs. The "Repetitive per-iteration log noise" concern is still unaddressed in task.ts — but the interplay with this counter makes it more visible.

Prompt To Fix With AI
This is a comment left during a code review.
Path: cli/src/cli/commands/single-task-loop.ts
Line: 38-52

Comment:
**Loop counter printed even on `i = 1` when `total = 1` is impossible but guard allows `i = 1, total > 1`**

When `total = 1`, no counter is printed (guarded by `total > 1`) — this is correct. However, notice that the `logInfoFn` call at line 40 prints `[i/total]` before `runTaskFn` is called. If `runTaskFn` itself also logs a banner (e.g. `"Running task with claude…"` in `task.ts` line 35), the output for `--repeat 3` looks like:

```
[1/3] Executing: find and fix bugs
Running task with claude...       ← emitted inside runTask
...
[2/3] Executing: find and fix bugs
Running task with claude...
```

The inner `logInfo` in `task.ts` fires unconditionally for every iteration. Combined with the iteration counter here, this creates duplicate/noisy output for repeat runs. The "Repetitive per-iteration log noise" concern is still unaddressed in `task.ts` — but the interplay with this counter makes it more visible.

How can I resolve this? If you propose a fix, please make it concise.

@KLIEBHAN
Copy link
Copy Markdown
Author

Review Feedback Summary

Thanks for the thorough automated review! Here's a summary of how each comment was addressed:

Implemented (11 fixes across 6 commits)

# Fix Commit
1 Inline comment on --dangerously-bypass-approvals-and-sandbox 45bff6d
2 Warn when --continue-on-failure is used without --repeat (TS + Bash) 45bff6d, a0bf467
4 is_fatal_error_output() scoped to tail -20 to prevent false positives 45bff6d
5 jq availability check before JSON validation 45bff6d
6 Upper bound (10,000) for --repeat in both CLIs a0bf467, e6a3e22
7 Bash parity: --continue-on-failure warning after arg parsing a0bf467
8 Warning guard: only warn when task is present (avoids misleading output before error) a5f2cdd
9 --model forwarded to all 4 codex invocations (brownfield, parallel, worktree, conflict-resolution) a5f2cdd, a20b9c5
11 "skipped" count shown in log summary, desktop notification, and bash output 71f5ca2, f387ef0
18 --dry-run --repeat N runs once, not N times (TS + Bash) 44f6a83, baf3f56
19 Dry-run guard added to run_brownfield_task in bash baf3f56
20 Test coverage for dry-run early-return path 4e05f85
22 Scientific notation (1e4) rejected via regex parity with bash 4e05f85

Intentionally skipped

# Comment Reason
3, 10 Engine hoisting (createEngine/isEngineAvailable called per iteration) The check is a which lookup (<1ms). Hoisting would require changing runTask's public interface or adding a caching layer — not worth the complexity for negligible perf gain.
12 Webhook fires once per batch sendNotifications was never called inside runTask — it was always in index.ts. Repeat mode is a new feature, not a breaking change.
16, 24 Per-iteration log noise ("Running task with...") 1 line per iteration confirms the engine is starting. Combined with the [i/total] counter, this gives users clear progress feedback. Suppressing it would make repeat runs feel unresponsive.
21 total: N vs total: 1 in dry-run result total represents the configured repeat count, not executed iterations. Callers are guarded by !options.dryRun. Consistent with skip semantics.
23 notifySingleTask threshold The suggestion (repeatCount === 1) is exactly what's already implemented. Per-task notifications at repeatCount > 1 would collide with the summary notification.

The auto-detection injected a large prompt block whenever agent-browser
was installed, even when not needed. Browser is now opt-in only.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- BROWSER_ENABLED defaults to "false" instead of "auto"
- load_browser_setting returns "false" when no config exists
- is_browser_available treats unknown values as disabled
- Matches types.ts change removing "auto" from browserEnabled type

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment on lines +1349 to +1353
get_next_task_json() {
jq -r '.tasks[] | select(.completed != true) | .title' "$PRD_FILE" 2>/dev/null | head -1 | cut -c1-50 || echo ""
}

count_remaining_json() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_next_task_json truncates titles at 50 chars — sequential mode passes truncated descriptions to the AI

get_next_task_json applies | cut -c1-50, so for any task whose title exceeds 50 characters the AI receives a truncated prompt. In PRD-sequential mode the AI is expected to operate on the full task description; a truncated title can lead to incomplete or misaligned execution.

The get_next_task_markdown and get_next_task_yaml variants carry the same cut -c1-50 (pre-existing), so this mirrors existing behaviour — but it's worth flagging since JSON is freshly added and the 50-char cap is not documented or validated at input time.

In parallel mode this is not an issue because get_tasks_in_group_json (which is not truncated) is used instead. Consider removing | cut -c1-50 from all three get_next_task_* functions, or at least from the JSON one where the exact title is critical for the jq-based mark_task_complete_json exact-match filter if the sequential flow ever calls it.

Suggested change
get_next_task_json() {
jq -r '.tasks[] | select(.completed != true) | .title' "$PRD_FILE" 2>/dev/null | head -1 | cut -c1-50 || echo ""
}
count_remaining_json() {
get_next_task_json() {
jq -r '.tasks[] | select(.completed != true) | .title' "$PRD_FILE" 2>/dev/null | head -1 || echo ""
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: ralphy.sh
Line: 1349-1353

Comment:
**`get_next_task_json` truncates titles at 50 chars — sequential mode passes truncated descriptions to the AI**

`get_next_task_json` applies `| cut -c1-50`, so for any task whose title exceeds 50 characters the AI receives a truncated prompt. In PRD-sequential mode the AI is expected to operate on the full task description; a truncated title can lead to incomplete or misaligned execution.

The `get_next_task_markdown` and `get_next_task_yaml` variants carry the same `cut -c1-50` (pre-existing), so this mirrors existing behaviour — but it's worth flagging since JSON is freshly added and the 50-char cap is not documented or validated at input time.

In parallel mode this is not an issue because `get_tasks_in_group_json` (which is not truncated) is used instead. Consider removing `| cut -c1-50` from all three `get_next_task_*` functions, or at least from the JSON one where the exact title is critical for the jq-based `mark_task_complete_json` exact-match filter if the sequential flow ever calls it.

```suggestion
get_next_task_json() {
  jq -r '.tasks[] | select(.completed != true) | .title' "$PRD_FILE" 2>/dev/null | head -1 || echo ""
}
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +31 to +34
if (options.dryRun) {
await runTaskFn(task, options);
return { total, completed: 0, failed: 0 };
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dryRun path still calls runTaskFn, which checks engine availability

When dryRun is true, runSingleTaskLoop calls runTaskFn(task, options) once before returning. Inside runTask, engine availability is checked before the dry-run guard:

const available = await isEngineAvailable(...);
if (!available) { return { success: false, fatal: true }; } // ← fires first
// ...
if (options.dryRun) { /* never reached if engine missing */ }

So ralphy --dry-run "task" fails with a fatal error if the configured engine binary is not installed, even though no actual execution was requested. This makes --dry-run less useful as a preview/CI tool when the engine isn't present.

Consider checking dryRun before the availability check in runTask, or having runSingleTaskLoop skip the runTaskFn call entirely in dry-run mode and just build/print the prompt directly.

Prompt To Fix With AI
This is a comment left during a code review.
Path: cli/src/cli/commands/single-task-loop.ts
Line: 31-34

Comment:
**`dryRun` path still calls `runTaskFn`, which checks engine availability**

When `dryRun` is `true`, `runSingleTaskLoop` calls `runTaskFn(task, options)` once before returning. Inside `runTask`, engine availability is checked _before_ the dry-run guard:

```ts
const available = await isEngineAvailable(...);
if (!available) { return { success: false, fatal: true }; } // ← fires first
// ...
if (options.dryRun) { /* never reached if engine missing */ }
```

So `ralphy --dry-run "task"` fails with a fatal error if the configured engine binary is not installed, even though no actual execution was requested. This makes `--dry-run` less useful as a preview/CI tool when the engine isn't present.

Consider checking `dryRun` before the availability check in `runTask`, or having `runSingleTaskLoop` skip the `runTaskFn` call entirely in dry-run mode and just build/print the prompt directly.

How can I resolve this? If you propose a fix, please make it concise.

Extract run_engine() to avoid duplicating engine commands. In repeat
mode (--repeat N > 1), output goes only to file instead of streaming
through tee, preventing a wall of AI text.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment on lines 644 to 703
log_info "Browser automation enabled (agent-browser)"
fi

# Run the AI engine (tee to show output while saving for parsing)
case "$AI_ENGINE" in
claude)
claude --dangerously-skip-permissions \
${MODEL_OVERRIDE:+--model "$MODEL_OVERRIDE"} \
-p "$prompt" 2>&1 | tee "$output_file"
;;
opencode)
opencode --output-format stream-json \
--approval-mode full-auto \
${MODEL_OVERRIDE:+--model "$MODEL_OVERRIDE"} \
"$prompt" 2>&1 | tee "$output_file"
;;
cursor)
agent --dangerously-skip-permissions \
-p "$prompt" 2>&1 | tee "$output_file"
;;
qwen)
qwen --output-format stream-json \
--approval-mode yolo \
-p "$prompt" 2>&1 | tee "$output_file"
;;
droid)
droid exec --output-format stream-json \
--auto medium \
"$prompt" 2>&1 | tee "$output_file"
;;
copilot)
copilot -p "$prompt" \
${MODEL_OVERRIDE:+--model "$MODEL_OVERRIDE"} \
2>&1 | tee "$output_file"
;;
codex)
codex exec --full-auto \
--json \
"$prompt" 2>&1 | tee "$output_file"
;;
esac
# Run the AI engine
run_engine() {
case "$AI_ENGINE" in
claude)
claude --dangerously-skip-permissions \
${MODEL_OVERRIDE:+--model "$MODEL_OVERRIDE"} \
-p "$prompt" 2>&1
;;
opencode)
opencode --output-format stream-json \
--approval-mode full-auto \
${MODEL_OVERRIDE:+--model "$MODEL_OVERRIDE"} \
"$prompt" 2>&1
;;
cursor)
agent --dangerously-skip-permissions \
-p "$prompt" 2>&1
;;
qwen)
qwen --output-format stream-json \
--approval-mode yolo \
-p "$prompt" 2>&1
;;
droid)
droid exec --output-format stream-json \
--auto medium \
"$prompt" 2>&1
;;
copilot)
copilot -p "$prompt" \
${MODEL_OVERRIDE:+--model "$MODEL_OVERRIDE"} \
2>&1
;;
gemini)
gemini --output-format stream-json \
--yolo \
${MODEL_OVERRIDE:+--model "$MODEL_OVERRIDE"} \
-p "$prompt" 2>&1
;;
codex)
codex exec --dangerously-bypass-approvals-and-sandbox \
--json \
${MODEL_OVERRIDE:+--model "$MODEL_OVERRIDE"} \
"$prompt" 2>&1
;;
esac
}

# In repeat mode, suppress streaming output to avoid a wall of text
if [[ "$REPEAT_COUNT" -gt 1 ]]; then
run_engine > "$output_file"
else
run_engine | tee "$output_file"
fi

local exit_code=$?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run_engine inner function becomes a persistent global

In bash, functions declared inside another function are not lexically scoped — they become global after the outer function is first called. In repeat mode run_brownfield_task is invoked N times, and each call re-declares run_engine, overwriting the previous definition with the new value of prompt captured from the enclosing scope. While this works correctly in the current single-threaded flow, it:

  • Leaks run_engine into the global namespace, where it could clash with any future utility function of the same name.
  • Makes the dependency on the local $prompt variable implicit and fragile — any future refactor that renames or relocates the prompt local will silently break run_engine.

Consider inlining the engine dispatch back into run_brownfield_task or renaming to something clearly internal like _ralphy_run_engine_for_task.

Prompt To Fix With AI
This is a comment left during a code review.
Path: ralphy.sh
Line: 644-703

Comment:
**`run_engine` inner function becomes a persistent global**

In bash, functions declared inside another function are not lexically scoped — they become global after the outer function is first called. In repeat mode `run_brownfield_task` is invoked `N` times, and each call re-declares `run_engine`, overwriting the previous definition with the new value of `prompt` captured from the enclosing scope. While this works correctly in the current single-threaded flow, it:

- Leaks `run_engine` into the global namespace, where it could clash with any future utility function of the same name.
- Makes the dependency on the local `$prompt` variable implicit and fragile — any future refactor that renames or relocates the `prompt` local will silently break `run_engine`.

Consider inlining the engine dispatch back into `run_brownfield_task` or renaming to something clearly internal like `_ralphy_run_engine_for_task`.

How can I resolve this? If you propose a fix, please make it concise.

- task.ts: call logTaskProgress on engine-not-found path for consistent
  task history tracking
- args.ts: check task source flag conflict before missing task argument
  so users get the more specific error message first

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment on lines +8 to +11
export interface SingleTaskLoopResult {
total: number;
completed: number;
failed: number;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

skipped missing from SingleTaskLoopResult interface

The skipped count is computed inside the function body (line 52: const skipped = total - completed - failed) and also re-computed by callers — index.ts repeats result.total - result.completed - result.failed twice. Storing the formula in two independent places is fragile: if the semantics of skipped ever change, both sites need updating.

Consider adding skipped directly to the interface so callers can use it without re-deriving it:

Suggested change
export interface SingleTaskLoopResult {
total: number;
completed: number;
failed: number;
export interface SingleTaskLoopResult {
total: number;
completed: number;
failed: number;
skipped: number;
}

Then populate it in the return statement (skipped: total - completed - failed) and remove the duplicate computation in index.ts.

Prompt To Fix With AI
This is a comment left during a code review.
Path: cli/src/cli/commands/single-task-loop.ts
Line: 8-11

Comment:
**`skipped` missing from `SingleTaskLoopResult` interface**

The `skipped` count is computed inside the function body (line 52: `const skipped = total - completed - failed`) and also re-computed by callers — `index.ts` repeats `result.total - result.completed - result.failed` twice. Storing the formula in two independent places is fragile: if the semantics of `skipped` ever change, both sites need updating.

Consider adding `skipped` directly to the interface so callers can use it without re-deriving it:

```suggestion
export interface SingleTaskLoopResult {
	total: number;
	completed: number;
	failed: number;
	skipped: number;
}
```

Then populate it in the return statement (`skipped: total - completed - failed`) and remove the duplicate computation in `index.ts`.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines 25 to 27

// Check engine availability
const engine = createEngine(options.aiEngine as AIEngineName);
const available = await isEngineAvailable(options.aiEngine as AIEngineName);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per-iteration log noise in repeat mode

logInfo("Running task with ...") and the browser-enabled log fire unconditionally on every call to runTask. For --repeat 100, the user sees these lines 100 times, providing no additional value after the first iteration — especially since single-task-loop.ts already prints an [i/total] counter before each call.

Consider guarding these messages to only the first iteration:

Suggested change
// Check engine availability
const engine = createEngine(options.aiEngine as AIEngineName);
const available = await isEngineAvailable(options.aiEngine as AIEngineName);
if (options.repeatCount === 1) {
logInfo(`Running task with ${engine.name}...`);
if (isBrowserAvailable(options.browserEnabled)) {
logInfo("Browser automation enabled (agent-browser)");
}
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: cli/src/cli/commands/task.ts
Line: 25-27

Comment:
**Per-iteration log noise in repeat mode**

`logInfo("Running task with ...")` and the browser-enabled log fire unconditionally on every call to `runTask`. For `--repeat 100`, the user sees these lines 100 times, providing no additional value after the first iteration — especially since `single-task-loop.ts` already prints an `[i/total]` counter before each call.

Consider guarding these messages to only the first iteration:

```suggestion
	if (options.repeatCount === 1) {
		logInfo(`Running task with ${engine.name}...`);

		if (isBrowserAvailable(options.browserEnabled)) {
			logInfo("Browser automation enabled (agent-browser)");
		}
	}
```

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant