Summary
Persist agent/workflow run state, telemetry, child lineage, and resumable checkpoints so OpenHuman can reliably render and resume parallel work after navigation, app restart, or interrupted runs.
This is the cross-cutting storage foundation for the other #3370 child issues.
Parent: #3370
Problem
OpenHuman currently has process-local orchestration state and live-only subagent transcript deltas. Some worker transcripts are persisted as conversation threads, and some tool timeline entries are persisted, but there is no single durable run index for background agents, teams, workflows, and child lineage.
Claude Code's workflow/agent surfaces work because every run/session has inspectable state: status, task/phase, child agent results, prompt/script, logs, and management commands. OpenHuman needs the same kind of durable run ledger, adapted to our existing memory/thread/task-board stores.
Implementation plan
- Define a durable run ledger.
  - AgentRun: id, kind (subagent, worker_thread, background_agent, team_member, workflow_child), parent run/thread, agent id, status, prompt ref, worker thread id, task board/card refs, started/updated/completed timestamps.
  - WorkflowRun: id, definition id, parent thread, input, phase states, child run ids, status, summary.
  - RunEvent: run id, sequence, event type, payload, timestamp.
  - RunTelemetry: token counts, cost estimate, elapsed ms, tool counts, model/provider, error.
- Store the ledger under workspace state.
  - Use existing workspace storage patterns instead of introducing a separate DB, unless current thread storage is insufficient.
  - Keep large transcript text in conversation/thread storage; the run ledger should only reference it.
- Rehydrate app state.
  - Add API endpoints to list runs, get a single run, and fetch recent run events.
  - chatRuntimeSlice should rehydrate historical subagent/tool rows from persisted metadata, accepting that live streamed prose is not replayed unless it exists in a worker thread.
- Add resume semantics.
  - Persist checkpoints for awaiting-user and paused workers, extending the existing continue_subagent checkpoint path.
  - Workflows resume by reusing completed child results and launching missing/failed phases according to policy.
- Add tests.
  - Rust tests for append/list/get ordering, schema compatibility, and restart-style rehydrate.
  - Vitest for historical run rendering.
  - E2E for starting a run, navigating away, and reopening it.
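The ledger shape and the append/list ordering that the plan asks to test could look roughly like the sketch below. It is an in-memory TypeScript illustration only, not the Rust implementation the plan calls for. The field names, the `RunStatus` values, and the `InMemoryRunLedger` class are assumptions derived from the field lists above; none of them exist in the codebase.

```typescript
// Hypothetical shapes derived from the field lists in the plan; all names are assumptions.
type RunKind =
  | "subagent"
  | "worker_thread"
  | "background_agent"
  | "team_member"
  | "workflow_child";

// Assumed status set; the issue does not enumerate run statuses.
type RunStatus = "running" | "awaiting_user" | "paused" | "completed" | "failed";

interface AgentRun {
  id: string;
  kind: RunKind;
  parentRunId?: string;
  parentThreadId?: string;
  agentId: string;
  status: RunStatus;
  promptRef?: string;
  workerThreadId?: string; // transcript text stays in thread storage; the ledger only references it
  startedAt: number;
  updatedAt: number;
  completedAt?: number;
}

interface RunEvent {
  runId: string;
  sequence: number; // increases by one per event within a run
  eventType: string;
  payload: unknown;
  timestamp: number;
}

class InMemoryRunLedger {
  private runs = new Map<string, AgentRun>();
  private events = new Map<string, RunEvent[]>();

  // Inserts or replaces a run record (stored as a shallow copy).
  upsertRun(run: AgentRun): void {
    this.runs.set(run.id, { ...run });
  }

  getRun(id: string): AgentRun | undefined {
    return this.runs.get(id);
  }

  // Appends an event and assigns the next per-run sequence number, starting at 1.
  appendEvent(runId: string, eventType: string, payload: unknown, timestamp: number): RunEvent {
    if (!this.runs.has(runId)) {
      throw new Error(`unknown run: ${runId}`);
    }
    const list = this.events.get(runId) ?? [];
    const event: RunEvent = { runId, sequence: list.length + 1, eventType, payload, timestamp };
    list.push(event);
    this.events.set(runId, list);
    return event;
  }

  // Returns up to `limit` of the most recent events, in ascending sequence order.
  recentEvents(runId: string, limit: number): RunEvent[] {
    const list = this.events.get(runId) ?? [];
    return list.slice(Math.max(0, list.length - limit));
  }

  // Lists the direct children of a run (one level of child lineage).
  childrenOf(parentRunId: string): AgentRun[] {
    return Array.from(this.runs.values()).filter((r) => r.parentRunId === parentRunId);
  }
}
```

Assigning the sequence number inside `appendEvent` keeps event ordering independent of wall-clock timestamps, which is the property the append/list/get ordering tests would check.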
Reference code
Current orchestration state is explicitly process-local:
// src/openhuman/agent_orchestration/README.md
// The first implementation is process-local. The state shape is serializable so a
// later PR can persist orchestration sessions across app restart, cron resumes, and
// thread continuation without changing callers.
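Since the state shape is described as serializable, one way to persist a checkpoint is a versioned JSON envelope, which also gives the schema-compatibility tests something concrete to assert on. The sketch below is illustrative only: `RunCheckpoint`, its fields, and the envelope format are assumptions, not the existing continue_subagent checkpoint format.

```typescript
// Hypothetical checkpoint shape; field names are assumptions for illustration.
interface RunCheckpoint {
  runId: string;
  status: "awaiting_user" | "paused";
  workerThreadId?: string;
  pendingQuestion?: string;
  savedAt: number;
}

// Assumed version constant; bump it when the checkpoint shape changes.
const CHECKPOINT_SCHEMA_VERSION = 1;

interface CheckpointEnvelope {
  schemaVersion: number;
  checkpoint: RunCheckpoint;
}

// Wraps the checkpoint with a schema version before serializing.
function encodeCheckpoint(checkpoint: RunCheckpoint): string {
  const envelope: CheckpointEnvelope = { schemaVersion: CHECKPOINT_SCHEMA_VERSION, checkpoint };
  return JSON.stringify(envelope);
}

// Returns null for malformed payloads or payloads written by a newer schema,
// so a restart can skip unreadable checkpoints instead of failing.
function decodeCheckpoint(raw: string): RunCheckpoint | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const envelope = parsed as Partial<CheckpointEnvelope>;
  if (typeof envelope.schemaVersion !== "number") return null;
  if (envelope.schemaVersion > CHECKPOINT_SCHEMA_VERSION) return null;
  const cp = envelope.checkpoint;
  if (!cp || typeof cp.runId !== "string") return null;
  if (cp.status !== "awaiting_user" && cp.status !== "paused") return null;
  if (typeof cp.savedAt !== "number") return null;
  return cp;
}
```

Returning `null` rather than throwing is a design choice for this sketch; a real implementation might prefer to surface the decode error in the run's `error` telemetry field.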
Current live/persisted split in frontend runtime:
// app/src/store/chatRuntimeSlice.ts
export interface SubagentActivity {
  taskId: string;
  agentId: string;
  workerThreadId?: string;
  status: ToolTimelineEntryStatus;
  toolCalls: SubagentToolCallEntry[];
  transcript?: SubagentTranscriptItem[];
}
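Rehydrating these rows from persisted run metadata might look like the sketch below. `PersistedRun`, `toTimelineStatus`, and `rehydrateSubagents` are hypothetical names, and the local `ToolTimelineEntryStatus` and `SubagentToolCallEntry` definitions are stand-ins so the example is self-contained; the real types in the codebase may differ.

```typescript
// Stand-ins for types referenced by SubagentActivity; the real definitions may differ.
type ToolTimelineEntryStatus = "running" | "success" | "error";

interface SubagentToolCallEntry {
  id: string;
  name: string;
  status: ToolTimelineEntryStatus;
}

interface SubagentActivity {
  taskId: string;
  agentId: string;
  workerThreadId?: string;
  status: ToolTimelineEntryStatus;
  toolCalls: SubagentToolCallEntry[];
  transcript?: unknown[];
}

// Hypothetical record shape returned by a list-runs endpoint.
interface PersistedRun {
  id: string;
  agentId: string;
  workerThreadId?: string;
  status: "running" | "awaiting_user" | "paused" | "completed" | "failed";
  toolCalls: SubagentToolCallEntry[];
}

// Assumed mapping from ledger status to the timeline status used by the UI.
// Note: a run still marked "running" after a restart may really be interrupted;
// this sketch does not try to detect that.
function toTimelineStatus(status: PersistedRun["status"]): ToolTimelineEntryStatus {
  switch (status) {
    case "completed":
      return "success";
    case "failed":
      return "error";
    default:
      return "running";
  }
}

// Rebuilds historical rows keyed by task id. The transcript is left unset:
// live streamed prose is not replayed, and the UI can load the worker thread
// via workerThreadId when one exists.
function rehydrateSubagents(runs: PersistedRun[]): Record<string, SubagentActivity> {
  const byTask: Record<string, SubagentActivity> = {};
  for (const run of runs) {
    byTask[run.id] = {
      taskId: run.id,
      agentId: run.agentId,
      ...(run.workerThreadId !== undefined ? { workerThreadId: run.workerThreadId } : {}),
      status: toTimelineStatus(run.status),
      toolCalls: run.toolCalls.map((call) => ({ ...call })),
    };
  }
  return byTask;
}
```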
Files to build from:
src/openhuman/agent_orchestration/README.md
src/openhuman/agent_orchestration/ops.rs
src/openhuman/agent_orchestration/types.rs
src/openhuman/agent_orchestration/tools/continue_subagent.rs
src/openhuman/agent_orchestration/tools/worker_thread.rs
src/openhuman/memory_conversations/
app/src/store/chatRuntimeSlice.ts
app/src/types/turnState.ts
Claude references:
Relevant Claude ideas to adapt:
- Background work remains visible after detaching from the interactive session.
- Workflow runtime tracks child results separately from the main conversation context.
- Progress views need phase/agent state, elapsed time, token/cost totals, and stop/resume controls.
- Resume should avoid rerunning completed child work where possible.
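The last idea, resuming without rerunning completed child work, can be sketched as a pure planning function over persisted phase states. `PhaseState`, `ResumePolicy`, and `planResume` are hypothetical names, and the single `retryFailed` flag is an assumed stand-in for whatever resume policy is eventually defined.

```typescript
// Hypothetical phase state; names and status values are assumptions for illustration.
type PhaseStatus = "pending" | "running" | "completed" | "failed";

interface PhaseState {
  phaseId: string;
  status: PhaseStatus;
  childRunId?: string;
  result?: string;
}

// Assumed policy knob: whether failed phases are relaunched on resume.
interface ResumePolicy {
  retryFailed: boolean;
}

interface ResumePlan {
  reuse: PhaseState[]; // completed phases whose results are reused as-is
  launch: string[]; // phase ids to launch or relaunch
  skipped: string[]; // failed phases left untouched by policy
}

// Splits persisted phases into work to reuse, work to launch, and work to skip.
// A phase still marked "running" is treated as interrupted and relaunched.
function planResume(phases: PhaseState[], policy: ResumePolicy): ResumePlan {
  const plan: ResumePlan = { reuse: [], launch: [], skipped: [] };
  for (const phase of phases) {
    if (phase.status === "completed") {
      plan.reuse.push(phase);
    } else if (phase.status === "failed" && !policy.retryFailed) {
      plan.skipped.push(phase.phaseId);
    } else {
      plan.launch.push(phase.phaseId);
    }
  }
  return plan;
}
```

Keeping the planner free of side effects means the restart path can be tested without launching any agents: the caller inspects the plan and only then starts the phases listed in `launch`.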
Acceptance criteria
Related