Problem
The test suite has 276 unit tests covering individual components (circuit breaker state transitions, rate limit counters, response parsing) but zero end-to-end integration tests that validate safeguards working together with real Claude Code execution.
From TESTING.md, E2E tests are listed as "0 of 10+ target".
Impact
Safeguards could individually pass all unit tests while failing in production scenarios:
- Circuit breaker may not trigger correctly when Claude produces realistic output
- The interaction between the rate limiter and the circuit breaker is untested
- Exit conditions not validated against real Claude response formats
- Permission denial handling not tested with actual Claude CLI
Suggested Fix
Add a mock Claude CLI mode for CI testing:
```bash
# In ralph_loop.sh, support RALPH_MOCK_CLAUDE for testing
if [[ "${RALPH_MOCK_CLAUDE:-false}" == "true" ]]; then
    CLAUDE_CODE_CMD="$RALPH_DIR/../tests/mock_claude.sh"
fi
```
`tests/mock_claude.sh` can simulate realistic scenarios:
- Normal completion (outputs `EXIT_SIGNAL: true` after N loops)
- Stuck loop (never outputs progress)
- Permission denial
- High token usage response
- False-positive completion keywords in documentation
This allows full E2E tests in CI without requiring a real Claude API key or incurring costs.
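A minimal sketch of what `tests/mock_claude.sh` might look like, to make the scenarios above concrete. The variable names `MOCK_SCENARIO`, `MOCK_STATE_FILE`, and `MOCK_COMPLETE_AFTER` are assumptions made for illustration, not existing Ralph settings. The output lines are simplified placeholders rather than the real Claude CLI format, and the high-token-usage scenario is left out.

```bash
#!/usr/bin/env bash
# tests/mock_claude.sh (hypothetical sketch): stands in for the Claude CLI in CI.
# MOCK_SCENARIO, MOCK_STATE_FILE and MOCK_COMPLETE_AFTER are assumed names.
set -euo pipefail

mock_claude() {
  local scenario="${MOCK_SCENARIO:-normal}"
  local state_file="${MOCK_STATE_FILE:-${TMPDIR:-/tmp}/mock_claude_calls}"
  local complete_after="${MOCK_COMPLETE_AFTER:-3}"

  # Each loop iteration starts a new process, so the call count lives in a file.
  local calls=0
  if [[ -f "$state_file" ]]; then
    calls="$(<"$state_file")"
  fi
  calls=$((calls + 1))
  printf '%s\n' "$calls" > "$state_file"

  case "$scenario" in
    normal)
      # Report progress until the configured loop, then signal completion.
      if (( calls >= complete_after )); then
        echo "All tasks finished."
        echo "EXIT_SIGNAL: true"
      else
        echo "Implemented step $calls; more work remains."
        echo "EXIT_SIGNAL: false"
      fi
      ;;
    stuck)
      # Identical output on every call, with no sign of progress.
      echo "Still investigating the same error."
      echo "EXIT_SIGNAL: false"
      ;;
    permission_denied)
      # Placeholder message; the real CLI wording may differ.
      echo "Permission denied: tool use was not approved." >&2
      return 1
      ;;
    false_positive)
      # Mentions completion only inside documentation text.
      echo "Updated README section titled 'When the project is complete'."
      echo "EXIT_SIGNAL: false"
      ;;
    *)
      echo "mock_claude: unknown scenario '$scenario'" >&2
      return 2
      ;;
  esac
}

# Run only when executed directly, so tests can also source this file.
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
  mock_claude "$@"
fi
```

With `RALPH_MOCK_CLAUDE=true` and, for example, `MOCK_SCENARIO=stuck`, a CI job could run the loop for a few iterations and assert that the circuit breaker opens. Keeping the call counter in a file is one way to let separate invocations share state; a test would need to reset that file between cases.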