Why
OpenChrome currently exposes several long-running tools: crawl (src/tools/crawl.ts, registered at src/tools/index.ts:85), crawl_sitemap (src/tools/crawl-sitemap.ts, registered line 86), recording (src/tools/recording.ts, registered line 193), oc_evidence_bundle (registered line 215), and oc_session_snapshot (registered line 179). Every handler returns only once the work is finished, so the MCP client must hold its single in-flight request open for the entire duration. If the host LLM times out, drops the connection, or simply wants to interleave another tool call, there is no recovery: the work is lost and there is no record of partial progress.
mcp-browser-use solves this by treating long-running operations as background tasks with a persistent ledger and explicit start / list / get / cancel / wait verbs. The host LLM fires-and-forgets, polls when convenient, and can cancel without tearing down the MCP session. This pattern also unlocks #D (progress notifications) and #F (dashboard task view).
OpenChrome already has the right primitives to implement this without new native deps: the JSONL+meta.json+proper-lockfile pattern in src/core/trace/storage.ts (lines 1-493) is directly reusable, and src/utils/atomic-file.ts provides acquireLock / writeFileAtomicSafe. The oc_reap_orphans PID-lock pattern (recently extended in commit 334fa4d) already handles crash-resilient reaping.
What
Add a persistent task ledger and five new MCP tools — oc_task_start, oc_task_list, oc_task_get, oc_task_cancel, oc_task_wait — that wrap the five long-running tools listed above so they may run as background jobs.
Boundary:
src/core/task-ledger/{storage.ts,types.ts,runner.ts,index.ts} (new directory). Storage layer mirrors src/core/trace/storage.ts line-for-line on disk semantics.
src/tools/oc-task-{start,list,get,cancel,wait}.ts (new) registered in src/tools/index.ts.
Modify the five existing tool handlers to expose a thin "wrappable inner runner" so oc_task_start can invoke them without code duplication. No behavioural change when the tools are called directly.
Default ledger root: ~/.openchrome/tasks/ (sibling to ~/.openchrome/traces/).
Core tier (P1 host-pluggable, P2 OS-portable, P5 no native deps).
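The "wrappable inner runner" refactor in the boundary list above could look roughly like the following minimal sketch. Every name in it (CrawlArgs, CrawlResult, runCrawl, crawlHandler, TASK_RUNNERS, and the injected fetchPage parameter) is hypothetical and not taken from the OpenChrome source; the only point is that the direct handler and oc_task_start would call the same function, so direct-call behaviour stays unchanged.

```typescript
// Hypothetical shapes; the real crawl tool's arguments and result may differ.
interface CrawlArgs { url: string; maxPages: number }
interface CrawlResult { pages: string[] }
type FetchPage = (url: string, index: number) => Promise<string>;

// Inner runner: does the work only, with no MCP response formatting.
// fetchPage is injected so the sketch runs without a browser.
async function runCrawl(args: CrawlArgs, fetchPage: FetchPage): Promise<CrawlResult> {
  const pages: string[] = [];
  for (let i = 0; i < args.maxPages; i++) {
    pages.push(await fetchPage(args.url, i));
  }
  return { pages };
}

// Direct tool handler: awaits the inner runner and wraps the result,
// exactly as a synchronous-returning tool call would.
async function crawlHandler(
  args: CrawlArgs,
  fetchPage: FetchPage,
): Promise<{ content: { type: "text"; text: string }[] }> {
  const result = await runCrawl(args, fetchPage);
  return { content: [{ type: "text", text: JSON.stringify(result) }] };
}

// Registry that an oc_task_start implementation could consult by `kind`.
const TASK_RUNNERS = { crawl: runCrawl } as const;
```

Because the handler is a thin wrapper, a parity test only needs to compare the handler's payload with the inner runner's return value for the same arguments.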
Contract
Invariants:
State machine: PENDING → RUNNING → (COMPLETED | FAILED | CANCELLED). Terminal states are immutable.
On startup, any task whose meta.json says RUNNING and whose pid is no longer alive (per process.kill(pid, 0)) is reaped to FAILED with error.code = "orphaned". This must run before any new task is accepted.
Disk layout: <root>/<task_id>/{meta.json, events.jsonl, result.json?, lock}. meta.json is updated via writeFileAtomicSafe; events are appended; the lock is held only during meta.json mutation, never for the long body of work.
oc_task_cancel is best-effort: it sets cancel_requested_at in meta.json and emits a cancel_requested event; the runner cooperates by checking meta.json between work units. Cancellation latency ≤ one work unit (≤ one crawled page for crawl).
oc_task_wait(task_id, timeout_ms?) returns the final TaskMeta once status is terminal, or throws a typed timeout error. Default timeout_ms = 60_000. Wait does NOT block other tool calls (uses fs.watch + bounded poll, not a CPU spin).
Calling crawl etc. directly still works and returns synchronously; oc_task_start({ kind: "crawl", args }) is the opt-in path.
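Result payload (result.json) has the same shape as the synchronous tool's response. oc_task_get(task_id, { include_result: true }) reads and returns that file; without the flag, only meta.json is returned, which bounds response size.
oc_task_list supports { status?, kind?, limit?, since? }; default limit = 50, default sort by created_at descending.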
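The state-machine and orphan-reaping invariants above can be sketched as follows. This is an illustration under stated assumptions, not the ledger's actual code: the TaskMeta interface shows only fields named in this issue, and transition, isPidAlive, and reapIfOrphaned are hypothetical helper names. The transition table follows the arrows exactly as written, so it assumes PENDING can only move to RUNNING.

```typescript
type TaskStatus = "PENDING" | "RUNNING" | "COMPLETED" | "FAILED" | "CANCELLED";

// Assumed subset of TaskMeta; the real type may carry more fields.
interface TaskMeta {
  task_id: string;
  kind: string;
  status: TaskStatus;
  pid: number;
  error?: { code: string };
}

// Allowed transitions; terminal states have no outgoing edges.
const ALLOWED: Record<TaskStatus, readonly TaskStatus[]> = {
  PENDING: ["RUNNING"],
  RUNNING: ["COMPLETED", "FAILED", "CANCELLED"],
  COMPLETED: [],
  FAILED: [],
  CANCELLED: [],
};

// Returns a new meta object or throws on an illegal transition.
function transition(meta: TaskMeta, next: TaskStatus): TaskMeta {
  if (ALLOWED[meta.status].indexOf(next) === -1) {
    throw new Error(`illegal transition ${meta.status} -> ${next}`);
  }
  return { ...meta, status: next };
}

// process.kill(pid, 0) delivers no signal; it only checks whether the
// process can be signalled and throws if it cannot.
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but is owned by another user.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// Startup reaping: a RUNNING task whose pid is gone becomes FAILED/orphaned.
// `alive` is injectable so the rule can be tested without real processes.
function reapIfOrphaned(
  meta: TaskMeta,
  alive: (pid: number) => boolean = isPidAlive,
): TaskMeta {
  if (meta.status !== "RUNNING" || alive(meta.pid)) return meta;
  return { ...transition(meta, "FAILED"), error: { code: "orphaned" } };
}
```

Routing the reaper through the same transition function means a task that is already terminal can never be rewritten by a later reap pass.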
Acceptance criteria
All five new tools are registered in src/tools/index.ts and surfaced in the tool catalog.
Five long-running tools refactored to share an inner runner with oc_task_start; existing direct-call behaviour byte-identical (regression fixture in tests/tools/crawl.parity.test.ts).
Crash-resilience test in tests/core/task-ledger/orphan-reap.test.ts: spawn child process that creates a RUNNING task and process.exit(1)s mid-run; on next ledger open, task reaps to FAILED with error.code = "orphaned".
Cancel test in tests/core/task-ledger/cancel.test.ts: oc_task_start({kind: "crawl", args: {url: "file://<fixture>", maxPages: 5}}), then oc_task_cancel after ≥1 page completes; final status CANCELLED, result.json contains the partial pages already crawled (no data loss).
oc_task_wait returns within 200 ms of terminal transition (timing test on a fast fixture).
Lock file is released even when the worker throws — verified by a fault-injection test (failOnPage = 2 in crawl args).
Default ledger root resolves via os.homedir() (Windows-compatible per CLAUDE.md).
npm run build && npm test green.
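The cancel and fault-injection criteria above rely on two behaviours: the runner checks for a cancel request between work units and keeps whatever it has already produced, and a lock taken for a meta.json mutation is released even if the guarded code throws. A minimal sketch of both follows, with hypothetical names (RunOutcome, runUnits, withLock) and injected callbacks standing in for the real ledger and lock helpers.

```typescript
interface RunOutcome<T> {
  status: "COMPLETED" | "FAILED" | "CANCELLED";
  partial: T[]; // results of the work units that finished
  error?: string;
}

// Runs work units in order. isCancelRequested stands in for re-reading
// cancel_requested_at from meta.json; it is checked between units, so
// cancellation latency is at most one unit.
async function runUnits<T>(
  units: Array<() => Promise<T>>,
  isCancelRequested: () => Promise<boolean>,
): Promise<RunOutcome<T>> {
  const partial: T[] = [];
  try {
    for (const unit of units) {
      if (await isCancelRequested()) return { status: "CANCELLED", partial };
      partial.push(await unit());
    }
    return { status: "COMPLETED", partial };
  } catch (err) {
    // A throwing unit fails the task but keeps earlier results.
    const message = err instanceof Error ? err.message : String(err);
    return { status: "FAILED", partial, error: message };
  }
}

// Holds a lock only for the duration of `body` and always releases it.
// `acquire` is assumed to resolve to a release function.
async function withLock<R>(
  acquire: () => Promise<() => Promise<void>>,
  body: () => Promise<R>,
): Promise<R> {
  const release = await acquire();
  try {
    return await body();
  } finally {
    await release();
  }
}
```

In this shape the long crawl body never runs inside withLock; only the short meta.json writes would, which matches the invariant that the lock is never held for the long body of work.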
Real verification (post-merge, via openchrome MCP)
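mcp__openchrome__navigate to https://example.com to confirm the daemon is healthy.
mcp__openchrome__oc_task_start with { kind: "crawl", args: { url: "https://example.com", maxPages: 5, sameOrigin: true } } → assert response shape { task_id: <16-hex>, status: "PENDING" | "RUNNING" }.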
mcp__openchrome__oc_task_list → assert the new task_id appears with status in {PENDING, RUNNING} and kind = "crawl".
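After ≥1 page completes (poll with oc_task_get every 250 ms), call mcp__openchrome__oc_task_cancel({ task_id }) → assert returned status CANCELLED within ≤2 s.
mcp__openchrome__oc_task_get({ task_id, include_result: true }) → assert result.json contains a non-empty pages[] array (partial output retained).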
Restart the openchrome daemon (mcp__openchrome__oc_stop then re-launch). Re-issue oc_task_list → previous CANCELLED task is still listed (persistence across restart).
Negative path: launch a second crawl task, then kill -9 the daemon mid-run, restart, and assert that task's status is FAILED with error.code = "orphaned" (orphan reaper).
mcp__openchrome__oc_task_wait({ task_id: <a fast fixture task id>, timeout_ms: 30000 }) → returns terminal meta within bound.
A reproducer script lives at scripts/verify/oc-task-ledger.mjs.
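The polling steps in the checklist above can be expressed with a small bounded-poll helper. The sketch below is not the contents of scripts/verify/oc-task-ledger.mjs; pollUntil, cancelAndAssert, and the CallTool type are hypothetical, and callTool stands in for whatever MCP client the verifier uses. It also assumes one reading of the cancel step: that CANCELLED is observed through oc_task_get within 2 s of the cancel call.

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `probe` every `intervalMs` until `done` accepts the value, or
// throws once another interval would overrun `timeoutMs`.
async function pollUntil<T>(
  probe: () => Promise<T>,
  done: (value: T) => boolean,
  intervalMs: number,
  timeoutMs: number,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await probe();
    if (done(value)) return value;
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`pollUntil timed out after ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}

// Hypothetical client signature; only the `status` field is assumed.
type CallTool = (
  name: string,
  args: Record<string, unknown>,
) => Promise<{ status: string }>;

// Requests cancellation, then polls every 250 ms for up to 2 s.
async function cancelAndAssert(callTool: CallTool, taskId: string): Promise<void> {
  await callTool("oc_task_cancel", { task_id: taskId });
  await pollUntil(
    () => callTool("oc_task_get", { task_id: taskId }),
    (meta) => meta.status === "CANCELLED",
    250,
    2000,
  );
}
```

The same helper covers the "poll with oc_task_get every 250 ms" step by swapping in a predicate that detects the first completed page.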
Out of scope
Cross-process priority queueing or fair scheduling.
Resuming RUNNING → RUNNING after restart (orphans always reap to FAILED; resume is a separate concern).
Web-UI / dashboard wiring (covered by issue F).
MCP notifications/progress emission (covered by issue D).
Dependencies
None blocking. Soft dep on oc_reap_orphans patterns (already merged in 334fa4d).
Hard prerequisite for issues D and F.
Effort
L (~6+ dev days). New storage subsystem, five new tools, refactor of five existing handlers, crash-resilience tests.
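References
src/core/trace/storage.ts (reusable JSONL+meta.json+lock pattern).
src/utils/atomic-file.ts (acquireLock, writeFileAtomicSafe).
Recent reap-orphans PID-lock work (commit 334fa4d).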
Revision history
2026-05-12 r1: Original draft.
2026-05-12 r2: Critic-driven revision. Tightened cancel semantics to "≤ one work unit" with concrete fixture. Added explicit byte-parity regression test for direct-call path. Required oc_task_wait to use fs.watch+bounded poll (not CPU spin). Constrained args_summary to ≤2 KiB redacted. Removed earlier ambiguous "tasks SHOULD survive restart" wording; replaced with explicit "persisted; orphans reap to FAILED" invariant.
Curated scope, overlap handling, and verification checklist
Scope classification
Canonical lane: persistent async task ledger.
Primary deliverable: oc_task_ledger persistent async task table with start/list/status/wait/cancel semantics for long-running tools.
feat/855-task-ledger (merged): verify that the merged implementation still matches this issue before closing. Continue there; avoid duplicate work.
Overlap and conflict resolution
Implementation checklist
Success criteria
Post-merge OpenChrome live verification checklist