Skip to content

fix(auto-drive): retry on empty stream output#363

Closed
zemaj wants to merge 2 commits intojust-every:mainfrom
zemaj:fix/agent-stream-retry
Closed

fix(auto-drive): retry on empty stream output#363
zemaj wants to merge 2 commits intojust-every:mainfrom
zemaj:fix/agent-stream-retry

Conversation

@zemaj
Copy link

@zemaj zemaj commented Oct 28, 2025

Summary

  • guard coordinator stream results: if both output_text and response_items are empty, treat it as a transient stream failure
  • returning this error routes through existing retry/backoff logic so Auto Drive no longer aborts turns when SSE finishes early

Fixes #299

Testing

  • ./build-fast.sh (1.90.0)

Risks

  • minimal: only affects empty-response edge case, reuses existing retry classification

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1459 to +1461
if output_text.trim().is_empty() && response_items.is_empty() {
return Err(anyhow!("coordinator stream ended without producing output (possible transient error)"));
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Empty coordinator output never enters retry path

The new guard returns Err(anyhow!("coordinator stream ended without producing output …")) before parsing. However, the retry logic only kicks in when classify_recoverable_decision_error recognizes the error text as recoverable, and the new message does not match any classifier patterns. Previously, empty output caused parse_decision to fail with "model response was not valid JSON", which is classified as recoverable and retried. With this change an empty SSE response now emits a non‑recoverable error and the coordinator immediately reports Failed instead of retrying, so the commit does not achieve its stated goal and regresses the prior retry behavior.

Useful? React with 👍 / 👎.

@zemaj
Copy link
Author

zemaj commented Oct 28, 2025

CI note: musl/darwin legs are failing with , same root cause as #356. We’ll rerun preview build once that toolchain fix merges.

@zemaj
Copy link
Author

zemaj commented Oct 28, 2025

Preview Build rerun triggered after #364 merged; watching for green checks.

@zemaj
Copy link
Author

zemaj commented Oct 28, 2025

Fresh Preview Build queued with updated toolchain after #364 merged; watching for green checks.

@zemaj
Copy link
Author

zemaj commented Oct 28, 2025

Preview Build rerun queued with Rust 1.90.0 plus MUSL targets after #370 merged. Tracking run: https://github.com/just-every/code/actions/runs/18870463673

@zemaj zemaj force-pushed the main branch 2 times, most recently from 804b5ad to b7927a2 Compare October 28, 2025 23:57
@zemaj zemaj closed this in #364 Oct 29, 2025
@zemaj
Copy link
Author

zemaj commented Oct 29, 2025

Note: PR merged manually via commit 37d208d (Merge branch pr363 into main). Leaving PR closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Running Claude manually works, but executing via agents throws an error.

1 participant

Comments