This repository was archived by the owner on Mar 15, 2026. It is now read-only.

feat: retry planning and implementation agent runs on error (up to 2 times) #4

Draft
JOORVIS wants to merge 1 commit into main from feature/retry-on-agent-errors


JOORVIS commented Feb 24, 2026

Summary

  • Automatically retry planning and implementation agent runs up to 2 times (3 total attempts) when they fail with an error outcome or unhandled exception
  • Retries are transparent on GitHub: the issue keeps its planning-ongoing / implementation-ongoing label throughout, and only the final outcome (success or error) surfaces
  • Each retry creates a new run claim with a fresh runId and dispatches a new message with retryCount + 1; if tryCreateRunClaim fails during retry setup, the executor falls back to immediate final-failure handling
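
For reviewers, the retry decision described above can be sketched roughly as follows. This is an illustrative Python model, not the code in this PR: `handle_failure`, `FakeMessageBus`, the message fields other than the retry count, and the callback signatures are all hypothetical stand-ins for the real executor, message bus, and `tryCreateRunClaim`.

```python
import uuid
from dataclasses import dataclass, field, replace
from typing import Callable, List

MAX_RETRIES = 2  # 2 retries, i.e. 3 total attempts


@dataclass(frozen=True)
class PlanIssueMessage:
    # Hypothetical shape; the PR only states that retryCount defaults to 0.
    issue_number: int
    run_id: str
    retry_count: int = 0


@dataclass
class FakeMessageBus:
    # Stand-in for the injected message bus: just records dispatched messages.
    dispatched: List[PlanIssueMessage] = field(default_factory=list)

    def dispatch(self, message: PlanIssueMessage) -> None:
        self.dispatched.append(message)


def handle_failure(
    message: PlanIssueMessage,
    bus: FakeMessageBus,
    try_create_run_claim: Callable[[int, str], bool],
    apply_final_error: Callable[[PlanIssueMessage], None],
) -> str:
    """Return "retried" if a retry was dispatched, otherwise "failed"."""
    if message.retry_count < MAX_RETRIES:
        new_run_id = uuid.uuid4().hex  # fresh run id for the new claim
        if try_create_run_claim(message.issue_number, new_run_id):
            bus.dispatch(
                replace(
                    message,
                    run_id=new_run_id,
                    retry_count=message.retry_count + 1,
                )
            )
            return "retried"
        # Claim creation failed: fall through to final-failure handling.
    apply_final_error(message)  # error label + error comment in the real flow
    return "failed"
```

In this model the error label and comment are applied only through `apply_final_error`, which is reached either when the retry budget is exhausted or when the new claim cannot be created.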

Test plan

  • All 30 unit tests in PlanningRunExecutorTest and ImplementationRunExecutorTest pass (verified locally)
  • Existing ImplementIssueHandlerTest tests still pass
  • Verify that a planning/implementation agent run that fails on attempt 1 or 2 does not post an error comment or apply error labels; the issue keeps its ongoing label
  • Verify that after 3 consecutive failures the error comment is posted and planning-errored / implementation-errored label is applied
  • Verify that a run that succeeds on a retry attempt produces the normal success outcome (plan comment, PR opened, etc.)

Closes #1

…times)

When a planning or implementation agent run fails (either an Error outcome
from the agent or an unhandled Throwable), automatically retry up to 2 times
(3 total attempts) before applying error labels and posting an error comment.

During retries the issue stays in its ongoing label state, so retries are
transparent on GitHub — only the final outcome surfaces. On each retry a new
run claim is created and a new message is dispatched with retryCount+1.

Changes:
- Add retryCount field (default 0) to PlanIssueMessage and ImplementIssueMessage
- Restructure PlanningRunExecutor.doExecute to capture errors after finally
  blocks complete, then retry or apply final error handling
- Restructure ImplementationRunExecutor.doExecute the same way; also change
  handlePrCreated to throw RuntimeException on missing PR metadata so the
  error propagates to the retry path
- Inject MessageBusInterface into both executors for retry dispatch
- Add MAX_RETRIES = 2 constant to both executors
- Update PlanningRunExecutorTest: add fake message bus, configurable
  tryCreateRunClaim result, and full retry scenario coverage
- Add ImplementationRunExecutorTest with equivalent retry test coverage
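
The restructuring of doExecute listed above (capture the error, let the finally blocks complete, then decide) might look roughly like this. It is a hedged Python sketch of the control flow only: `do_execute`, `run_agent`, `cleanup`, `retry`, and `fail` are hypothetical names, and what the real finally blocks do is not stated in this PR.

```python
from typing import Callable, Optional

MAX_RETRIES = 2  # mirrors the constant added to both executors


def do_execute(
    run_agent: Callable[[], str],       # returns "ok" or "error" (assumed)
    cleanup: Callable[[], None],        # whatever the finally block does
    retry: Callable[[int], None],       # dispatch a new run with this count
    fail: Callable[[Exception], None],  # final error label + comment
    retry_count: int = 0,
) -> str:
    error: Optional[Exception] = None
    try:
        if run_agent() == "error":
            error = RuntimeError("agent reported an Error outcome")
    except Exception as exc:
        # Stand-in for the unhandled exception case, e.g. the exception
        # raised on missing PR metadata: capture it instead of handling here.
        error = exc
    finally:
        cleanup()  # always completes before any retry decision is made

    # The retry-or-fail decision happens only after cleanup has run.
    if error is None:
        return "success"
    if retry_count < MAX_RETRIES:
        retry(retry_count + 1)
        return "retried"
    fail(error)
    return "failed"
```

Deferring the decision until after cleanup means a retry is never dispatched while the failed run is still being torn down.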

Closes #1

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Linked issue (may be closed when this pull request is merged): Recover from planning or implementation errors