Open
Description
Summary
The agent gets stuck with an active goal in the executing phase when a task remains assigned to a dead local worker (local://...).
The parent loop then repeatedly calls create_goal, receives BLOCKED, and goes into backoff sleep, so the goal makes no progress.
Observed behavior
- Active goal remains active (example: 01KJXDGJ2212GXQDGVGE4H9WD8)
- orchestrator.state.phase = executing
- task_graph for active goal:
  - assigned: 1 (assigned to dead local://...)
  - blocked: 9
- children table still has the same local:// worker as running
- Logs repeat patterns like:
  - Recovering stale task from dead worker
  - then parent does create_goal -> BLOCKED
  - then sleeps with exponential backoff
Expected behavior
- Dead local:// worker assignment is recovered once and does not re-enter the same stale loop.
- Stale local worker rows are marked dead/unhealthy and excluded from reassignment.
- Parent loop should not attempt create_goal while active goal is in progress.
- Goal execution should resume (reassign to live worker or self-assignment fallback), not repeatedly sleep.
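The expected behavior above could be sketched roughly as follows. This is a minimal in-memory illustration, not the project's actual code: the `Worker` and `Task` types, the `is_alive` probe, the `SELF_WORKER` address, and both function names are hypothetical stand-ins for whatever the runtime really uses.

```python
from dataclasses import dataclass
from typing import Optional

SELF_WORKER = "local://self"  # hypothetical address used for self-assignment


@dataclass
class Worker:
    address: str
    status: str = "running"  # "running" or "dead"


@dataclass
class Task:
    task_id: str
    status: str = "pending"  # "pending", "assigned", or "blocked"
    assigned_to: Optional[str] = None


def recover_stale_assignments(tasks, workers, is_alive):
    """Mark dead workers, then move their tasks to a live worker or to self.

    `is_alive` is a caller-supplied liveness probe (hypothetical).
    Returns the ids of the tasks that were reassigned in this pass.
    """
    # Record the liveness result on the worker row so it is not reused later.
    for worker in workers.values():
        if worker.status == "running" and not is_alive(worker.address):
            worker.status = "dead"

    live = [w.address for w in workers.values() if w.status == "running"]

    recovered = []
    for task in tasks:
        if task.status != "assigned":
            continue
        if task.assigned_to == SELF_WORKER:
            continue  # already on the self-assignment fallback
        owner = workers.get(task.assigned_to)
        if owner is not None and owner.status == "running":
            continue  # assignment is healthy
        # Reassign to a live worker, else fall back to self-assignment.
        task.assigned_to = live[0] if live else SELF_WORKER
        recovered.append(task.task_id)
    return recovered


def should_create_goal(active_goal_phase):
    """Parent-loop guard: only create a goal when no goal is active."""
    return active_goal_phase is None
```

Because the dead status is written back to the worker row, a second pass over the same state finds nothing to recover, which is the "recovered once" property requested above.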
Reproduction (high-level)
- Start with active goal in executing.
- Ensure one task is assigned to a local:// worker that no longer exists.
- Restart runtime.
- Observe repeated stale-recovery + create_goal BLOCKED + backoff sleep loop.
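For readers without a runtime at hand, the cycle can be mimicked with a toy model. This is only a sketch under assumptions: it supposes the loop persists because the worker row is never flipped from running, and the function name, base delay, and delay cap are invented for illustration. Only the task counts and log strings come from the report.

```python
def simulate_stuck_loop(ticks, base_delay=1.0, max_delay=60.0):
    """Toy model of the reported cycle; no real sleeping is done."""
    graph = {"assigned": 1, "blocked": 9}  # counts from the report
    worker_row_status = "running"          # stale row for the dead worker
    log, delays = [], []
    delay = base_delay
    for _ in range(ticks):
        # Assumption: the row still says "running", so recovery fires again
        # for the same task on every tick.
        if worker_row_status == "running" and graph["assigned"] > 0:
            log.append("Recovering stale task from dead worker")
        # An active goal already exists, so create_goal is refused.
        log.append("create_goal -> BLOCKED")
        # Exponential backoff: double the delay each tick, up to a cap.
        delays.append(delay)
        delay = min(delay * 2, max_delay)
    return graph, log, delays
```

Calling `simulate_stuck_loop(4)` leaves the task graph unchanged while the delays grow as 1.0, 2.0, 4.0, 8.0, matching the "sleeps longer, progresses never" pattern described in this issue.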
Notes
This appears separate from planning/classifying fallback logic; here tasks exist but assignment targets stale local worker state.