feat(handoff): add secret-safe resume checkpoints (#1040) #1110

Merged: shaun0927 merged 5 commits into feat/1039-task-run-lifecycle from feat/1040-handoff-resume on May 13, 2026

@shaun0927 (Owner) commented May 12, 2026

Progress / Review status

Auto-refreshed 2026-05-13 — owner comments cleaned up to reduce review noise.

| Field | Value |
| --- | --- |
| Branch | feat/1040-handoff-resume → feat/1039-task-run-lifecycle |
| Draft | no |
| CI | |
| Mergeable | ❌ CONFLICTING |
| Review decision | |
| Codex (latest) | |
| Other reviewers (latest) | |
| Head | 5efe0bc: Enable bounded human takeover recovery |
| Commits | 2 |

Owner comment cleanup: 0 issue + 0 inline review comments deleted. Outstanding feedback from automated/external reviewers above is unchanged.


Summary

  • Adds opt-in oc_handoff_start/status/finish/cancel tools for secret-safe human takeover checkpoints.
  • Persists only caller-supplied safe snapshot fields (url, title, origin, counts/keys, fingerprint, screenshot ref), with redaction before metadata/events are written.
  • On oc_handoff_finish, computes a bounded before/after delta and appends exactly one kind: "handoff" evidence pointer to the linked TaskRun when run_id is supplied.
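
The "persist only safe fields" rule above could be sketched roughly as an allowlist copy. This is an illustration, not the code in this PR: `SafeSnapshot`, `sanitizeSnapshot`, and `stripQuery` are hypothetical names, the field names beyond those quoted in this description (`url`, `title`, `cookie_count`, `local_storage_keys`) are guesses, and dropping the URL query string and fragment is an added assumption about what redaction might cover.

```typescript
// Hypothetical shape of a caller-supplied snapshot; not the PR's actual type.
interface SafeSnapshot {
  url?: string;
  title?: string;
  origin?: string;
  cookie_count?: number;
  local_storage_keys?: string[];
  fingerprint?: string;
  screenshot_ref?: string;
}

const STRING_FIELDS = ["url", "title", "origin", "fingerprint", "screenshot_ref"] as const;

// Assumption: query strings and fragments may carry tokens, so keep origin + path only.
function stripQuery(raw: string): string {
  try {
    const u = new URL(raw);
    return u.origin + u.pathname;
  } catch {
    return "[unparseable-url]";
  }
}

// Copy only allowlisted fields of the expected primitive type; drop everything else.
function sanitizeSnapshot(input: Record<string, unknown>): SafeSnapshot {
  const out: SafeSnapshot = {};
  for (const key of STRING_FIELDS) {
    const value = input[key];
    if (typeof value === "string") out[key] = value;
  }
  const count = input["cookie_count"];
  if (typeof count === "number" && Number.isFinite(count)) {
    out.cookie_count = count;
  }
  const keys = input["local_storage_keys"];
  if (Array.isArray(keys)) {
    // Key names only; storage values are never read.
    out.local_storage_keys = keys.filter((k): k is string => typeof k === "string");
  }
  if (out.url !== undefined) out.url = stripQuery(out.url);
  return out;
}
```

With an allowlist, an unexpected field such as a raw `cookies` array is dropped by default and never needs to be recognized as sensitive.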

Closes #1040.
Stacked on #1083 / feat/1039-task-run-lifecycle; merge #1083 first.

Direction / duplicate check before implementation

Success criteria

  • A handoff can be started for a TaskRun, queried, finished, cancelled, or timed out deterministically.
  • Handoff artifacts never persist raw cookies, storage values, DOM text, passwords, tokens, or inline screenshots.
  • Finishing a linked handoff updates the TaskRun with one handoff evidence pointer that can be used for resume context.
  • Terminal handoffs cannot be modified.
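
One way to read the lifecycle criteria above is as a small state machine with a lazily evaluated timeout. The sketch below assumes that reading. Only `TIMED_OUT` appears in this PR description; the other state names, the `Handoff` shape, and the `observe`/`transition` helpers are hypothetical. Passing `now` explicitly is what keeps the timeout deterministic in tests.

```typescript
// Hypothetical status set; only TIMED_OUT is named in the PR description.
type HandoffStatus = "PENDING" | "FINISHED" | "CANCELLED" | "TIMED_OUT";

interface Handoff {
  id: string;
  status: HandoffStatus;
  startedAt: number; // epoch milliseconds
  ttlMs?: number;
}

const TERMINAL: ReadonlySet<HandoffStatus> = new Set<HandoffStatus>([
  "FINISHED",
  "CANCELLED",
  "TIMED_OUT",
]);

// Resolve the timeout on read: an expired pending handoff is reported as TIMED_OUT.
function observe(h: Handoff, now: number): Handoff {
  if (h.status === "PENDING" && h.ttlMs !== undefined && now - h.startedAt >= h.ttlMs) {
    return { ...h, status: "TIMED_OUT" };
  }
  return h;
}

// Finish or cancel a handoff; any terminal state (including a lapsed TTL) rejects the change.
function transition(h: Handoff, next: "FINISHED" | "CANCELLED", now: number): Handoff {
  const current = observe(h, now);
  if (TERMINAL.has(current.status)) {
    throw new Error(`handoff ${current.id} is ${current.status} and cannot be modified`);
  }
  return { ...current, status: next };
}
```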

Verification performed

  • npm test -- --runTestsByPath tests/core/handoff/storage.test.ts tests/tools/handoff-tools.test.ts tests/core/task-run/storage.test.ts tests/tools/task-run-tools.test.ts --runInBand
  • npm run build
  • npm run lint:changed

Real OpenChrome verification after merge

  1. Start OpenChrome from the merged branch and call oc_task_run_start with goal: "manual login resume smoke".
  2. Call oc_handoff_start with the returned run_id, reason: "manual login required", and a safe before snapshot such as { "url": "https://example.test/login", "title": "Login", "cookie_count": 0 }.
  3. Simulate human completion, then call oc_handoff_finish with an after snapshot such as { "url": "https://example.test/account", "title": "Account", "cookie_count": 1, "local_storage_keys": ["session_state"] }.
  4. Call oc_task_run_get for the original run and verify last_evidence[0].kind === "handoff", ref equals the handoff_id, and summary contains the URL/title/cookie-count delta.
  5. Inspect persisted files under $OPENCHROME_HOME/handoffs/<handoff_id>/ and confirm no raw token/password/cookie/storage values are present.
  6. Start another handoff with ttl_ms: 1000, wait over one second, call oc_handoff_status, and verify it transitions to TIMED_OUT and cannot be finished.
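
For step 4, the kind of summary to look for might resemble the output of the sketch below, which diffs the example snapshots from steps 2 and 3. It is an illustration under assumptions: `summarizeDelta`, the `Snapshot` type, the 120-character bound, and the summary wording are all hypothetical, and the real tool's format may differ.

```typescript
// Hypothetical snapshot shape using the field names from the steps above.
type Snapshot = {
  url?: string;
  title?: string;
  cookie_count?: number;
  local_storage_keys?: string[];
};

const MAX_FIELD_LEN = 120; // assumed bound on each reported value

function clip(value: string): string {
  return value.length > MAX_FIELD_LEN ? value.slice(0, MAX_FIELD_LEN) + "..." : value;
}

// Report only what changed, with counts and clipped strings rather than raw contents.
function summarizeDelta(before: Snapshot, after: Snapshot): string {
  const parts: string[] = [];
  if (before.url !== after.url) {
    parts.push(`url: ${clip(before.url ?? "(none)")} -> ${clip(after.url ?? "(none)")}`);
  }
  if (before.title !== after.title) {
    parts.push(`title: ${clip(before.title ?? "(none)")} -> ${clip(after.title ?? "(none)")}`);
  }
  const beforeCount = before.cookie_count ?? 0;
  const afterCount = after.cookie_count ?? 0;
  if (beforeCount !== afterCount) {
    parts.push(`cookie_count: ${beforeCount} -> ${afterCount}`);
  }
  const beforeKeys = new Set(before.local_storage_keys ?? []);
  const added = (after.local_storage_keys ?? []).filter((k) => !beforeKeys.has(k));
  if (added.length > 0) {
    parts.push(`local_storage_keys added: ${added.length}`);
  }
  return parts.length > 0 ? parts.join("; ") : "no change";
}

const summary = summarizeDelta(
  { url: "https://example.test/login", title: "Login", cookie_count: 0 },
  {
    url: "https://example.test/account",
    title: "Account",
    cookie_count: 1,
    local_storage_keys: ["session_state"],
  },
);
console.log(summary);
```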

Out of scope

shaun0927 and others added 2 commits May 13, 2026 00:14
Introduce an opt-in TaskRun store and MCP tool surface so long-running user goals can persist progress, help state, checkpoints, and evidence without changing existing browser tool behavior.

Constraint: Builds on issue #1039 while avoiding duplication of open task-ledger PR #911.

Rejected: Requiring Postgres or a Bytebot-style desktop task service | conflicts with OpenChrome's local MCP/CDP-first design.

Confidence: high

Scope-risk: moderate

Directive: Keep TaskRun as goal-level metadata; do not move browser execution or #855 async task scheduling into this layer.

Tested: npm test -- --runTestsByPath tests/core/task-run/storage.test.ts tests/tools/task-run-tools.test.ts --runInBand; npm run build; npm run lint:changed

Not-tested: Live MCP round-trip against a running Chrome daemon.

Co-authored-by: OmX <omx@oh-my-codex.dev>

Add a storage-only handoff lifecycle that records secret-safe before/after state, timeout/cancel transitions, and a single TaskRun evidence pointer when linked. This keeps manual intervention resumable without adding desktop/noVNC infrastructure or changing existing browser tool behavior.

Constraint: Stack on #1039 TaskRun evidence rather than duplicate async task ledger/dashboard work from #855/#865/#863.
Rejected: Persist raw DOM, cookies, storage values, or screenshots inline | secret exposure and payload bloat would harm long-running harness safety.
Confidence: high
Scope-risk: narrow
Directive: Keep handoff artifacts caller-supplied and redacted unless a later browser-capture PR can prove secret-safe extraction.
Tested: npm test -- --runTestsByPath tests/core/handoff/storage.test.ts tests/tools/handoff-tools.test.ts tests/core/task-run/storage.test.ts tests/tools/task-run-tools.test.ts --runInBand
Tested: npm run build
Tested: npm run lint:changed
Co-authored-by: OmX <omx@oh-my-codex.dev>

@shaun0927 (Owner, Author)

Merge rationale (stack consolidation)

Intent. Closes #1040 — adds opt-in oc_handoff_start / oc_handoff_status / oc_handoff_finish / oc_handoff_cancel tools for secret-safe human-takeover checkpoints.

Why this is correct.

  • Secret safety by design: persists only caller-supplied safe snapshot fields (url, title, origin, counts/keys, fingerprint, screenshot ref). Redaction runs before any metadata/event is written.
  • oc_handoff_finish computes a bounded before/after delta and appends exactly one kind: "handoff" evidence pointer to the linked TaskRun when run_id is supplied — observable, auditable, idempotent.
  • Opt-in tool surface; no behavior change for callers that don't invoke handoff.
  • Prerequisite #1083 (feat(task-run): persist goal-level lifecycle state, #1039) is merged.
  • Scope contained: 11 files, +732/-223. No Codex P0/P1/P2 outstanding.
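
The "exactly one evidence pointer" and "idempotent" claims above could be enforced along the lines of the sketch below, which assumes a repeated finish for the same handoff must not add a second entry. `EvidencePointer`, `TaskRun`, and `appendHandoffEvidence` are hypothetical names. The newest-first ordering is inferred from the verification step that checks `last_evidence[0]`, and the actual implementation may differ.

```typescript
// Hypothetical shapes; field names follow those quoted in the verification steps.
interface EvidencePointer {
  kind: string;
  ref: string;
  summary: string;
}

interface TaskRun {
  run_id: string;
  last_evidence: EvidencePointer[];
}

// Add a handoff pointer newest-first, unless one for this handoff already exists.
function appendHandoffEvidence(run: TaskRun, handoffId: string, summary: string): TaskRun {
  const alreadyLinked = run.last_evidence.some(
    (e) => e.kind === "handoff" && e.ref === handoffId,
  );
  if (alreadyLinked) return run;
  const pointer: EvidencePointer = { kind: "handoff", ref: handoffId, summary };
  return { ...run, last_evidence: [pointer, ...run.last_evidence] };
}
```

Storing a pointer (`ref` plus a short summary) rather than the snapshot itself keeps the TaskRun record small and leaves the redacted artifacts in one place.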

CI. This PR targets the task-run lifecycle feature branch, and the CI workflow runs only on PRs into main/develop, so no CI checks ran here.
