Why
Bytebot's most useful recovery pattern is not its virtual desktop itself; it is the ability to pause automation, let a human intervene, capture enough context, and resume the task. OpenChrome already supports real Chrome profiles, headed fallback, CAPTCHA/2FA handoff guidance, Ralph HITL escalation, and session persistence. What is missing is a structured, secret-safe human handoff checkpoint that records browser state deltas after manual intervention so the LLM does not wander or retry stale login flows.
What
Add an opt-in human handoff/resume checkpoint toolset:
oc_handoff_start
oc_handoff_status
oc_handoff_finish
oc_handoff_cancel
The handoff records state deltas, not raw user input. It supports these bounded cases through the reason enum: login, 2FA, CAPTCHA, permission, manual recovery, and other.
Proposed contract
export interface HandoffStartArgs {
  sessionId?: string;
  tabId?: string;
  run_id?: string; // optional TaskRun link from the TaskRun issue
  reason: 'login' | '2fa' | 'captcha' | 'permission' | 'manual_recovery' | 'other';
  instruction?: string; // redacted, <= 1 KiB
  timeout_ms?: number; // default 10 min, max 60 min
}

export interface HandoffSnapshot {
  url: string;
  title?: string;
  origin?: string;
  timestamp: number;
  screenshot_ref?: string; // optional, stored only if capture_screenshot=true
  cookie_count?: number;
  local_storage_keys?: string[]; // key names only, values never stored
  dom_fingerprint?: string;
}

export interface HandoffResult {
  handoff_id: string;
  status: 'RUNNING' | 'FINISHED' | 'CANCELLED' | 'TIMED_OUT';
  before: HandoffSnapshot;
  after?: HandoffSnapshot;
  delta_summary?: string;
  resume_hint?: string;
}
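The status union and the timeout_ms bounds in the contract imply a small lifecycle. The following is a minimal sketch of that lifecycle, not the OpenChrome implementation. HandoffRecord, resolveTimeout, startHandoff, and finishHandoff are hypothetical names, and clamping out-of-range timeouts (rather than rejecting them) is an assumption.

```typescript
type HandoffStatus = 'RUNNING' | 'FINISHED' | 'CANCELLED' | 'TIMED_OUT';

const DEFAULT_TIMEOUT_MS = 10 * 60 * 1000; // "default 10 min" from the contract comment
const MAX_TIMEOUT_MS = 60 * 60 * 1000; // "max 60 min" from the contract comment

// Hypothetical in-memory record; the real storage layout is not specified here.
interface HandoffRecord {
  handoff_id: string;
  status: HandoffStatus;
  deadline: number; // epoch milliseconds
}

// Assumption: missing or invalid values fall back to the default,
// and values above the maximum are clamped rather than rejected.
function resolveTimeout(timeout_ms?: number): number {
  if (timeout_ms === undefined || !Number.isFinite(timeout_ms) || timeout_ms <= 0) {
    return DEFAULT_TIMEOUT_MS;
  }
  return Math.min(timeout_ms, MAX_TIMEOUT_MS);
}

function startHandoff(handoff_id: string, now: number, timeout_ms?: number): HandoffRecord {
  return { handoff_id, status: 'RUNNING', deadline: now + resolveTimeout(timeout_ms) };
}

// Only a RUNNING handoff can change state, and a passed deadline
// takes precedence over a finish request.
function finishHandoff(record: HandoffRecord, now: number): HandoffRecord {
  if (record.status !== 'RUNNING') return record;
  if (now >= record.deadline) return { ...record, status: 'TIMED_OUT' };
  return { ...record, status: 'FINISHED' };
}
```

Passing the clock in as `now` keeps the transition logic pure, which makes a timeout test possible without waiting in real time.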
Implementation notes
Do not log key presses, typed text, password values, cookie values, localStorage values, or request bodies.
Capture only this safe state: URL/title/origin, counts, key names, DOM fingerprint/hash, optional screenshot reference.
If run_id is supplied, oc_handoff_finish appends exactly one handoff evidence pointer to that TaskRun.
Integrate Ralph HITL output so S7_HITL can recommend starting a handoff instead of leaving the agent with unstructured text only.
Handoff state is stored under ~/.openchrome/handoffs/<handoff_id>/ with atomic metadata writes.
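The "safe state only" rule above can be illustrated with a small reducer that turns raw page state into a snapshot holding counts, key names, and a hash. This is a sketch under stated assumptions: RawPageState and toSafeSnapshot are hypothetical, and FNV-1a is used only as a dependency-free stand-in for whatever DOM fingerprint the implementation chooses (a cryptographic hash such as SHA-256 may be preferable).

```typescript
// Hypothetical raw page state as a capture layer might see it.
interface RawPageState {
  url: string;
  title: string;
  cookies: { name: string; value: string }[];
  localStorage: Record<string, string>;
  html: string;
}

interface SafeSnapshot {
  url: string;
  title: string;
  origin: string;
  timestamp: number;
  cookie_count: number;
  local_storage_keys: string[];
  dom_fingerprint: string;
}

// FNV-1a (32-bit), a non-cryptographic stand-in for the DOM fingerprint.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Keeps counts, key names, and a hash; cookie and storage values are never copied.
function toSafeSnapshot(raw: RawPageState, timestamp: number): SafeSnapshot {
  return {
    url: raw.url,
    title: raw.title,
    origin: new URL(raw.url).origin,
    timestamp,
    cookie_count: raw.cookies.length,
    local_storage_keys: Object.keys(raw.localStorage).sort(),
    dom_fingerprint: fnv1a(raw.html),
  };
}
```

Sorting the key names makes the snapshot independent of storage iteration order, so two snapshots of the same state compare equal.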
Acceptance criteria
Four MCP tools registered and documented.
oc_handoff_start records a before snapshot and returns a handoff id plus a one-line instruction to the host that includes the reason and timeout deadline.
oc_handoff_finish records an after snapshot and produces a deterministic delta summary: URL changed, title changed, origin changed, cookie count changed, storage key set changed, DOM fingerprint changed.
Secret-safety tests prove cookie values, localStorage values, typed text, and password-like strings never appear in meta.json, events.jsonl, MCP response text, logs, or screenshot metadata.
Timeout test: handoff transitions to TIMED_OUT after configured timeout and can no longer be finished.
TaskRun integration test: if run_id is supplied, finishing handoff appends an evidence pointer to the TaskRun.
Ralph integration test: a forced S7_HITL result includes a machine-readable suggestion to call oc_handoff_start with reason manual_recovery.
Existing headed fallback/login behavior remains unchanged when no handoff is active.
npm run build && npm test green.
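The deterministic delta summary named in the criteria above could be produced by comparing the two snapshots field by field in a fixed order. The sketch below assumes a hypothetical summarizeDelta helper; the "; " separator and the "no changes detected" fallback are illustrative choices, not part of the proposal.

```typescript
interface Snapshot {
  url: string;
  title?: string;
  origin?: string;
  cookie_count?: number;
  local_storage_keys?: string[];
  dom_fingerprint?: string;
}

// Compares key names as sets, so ordering and duplicates do not matter.
function sameKeySet(a: string[] = [], b: string[] = []): boolean {
  const left = Array.from(new Set(a)).sort();
  const right = Array.from(new Set(b)).sort();
  return left.length === right.length && left.every((key, i) => key === right[i]);
}

// Checks run in a fixed order so equal inputs always yield the same string.
function summarizeDelta(before: Snapshot, after: Snapshot): string {
  const changes: string[] = [];
  if (before.url !== after.url) changes.push('URL changed');
  if (before.title !== after.title) changes.push('title changed');
  if (before.origin !== after.origin) changes.push('origin changed');
  if (before.cookie_count !== after.cookie_count) changes.push('cookie count changed');
  if (!sameKeySet(before.local_storage_keys, after.local_storage_keys)) {
    changes.push('storage key set changed');
  }
  if (before.dom_fingerprint !== after.dom_fingerprint) changes.push('DOM fingerprint changed');
  return changes.length > 0 ? changes.join('; ') : 'no changes detected';
}
```

Because the summary is built only from fields already present in the snapshots, it cannot introduce values that the snapshots themselves do not contain.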
Real verification after merge using OpenChrome
Launch OpenChrome headed with a persistent temporary profile.
Navigate to https://the-internet.herokuapp.com/login.
Call oc_task_run_start for a login/manual recovery verification run, or omit run_id if TaskRun is not merged yet.
Call oc_handoff_start({reason:'login', instruction:'Manually complete the demo login', capture_screenshot:true}).
Manually enter the demo credentials in Chrome and submit.
Call oc_handoff_finish({handoff_id}).
Verify the result reports URL/title/DOM fingerprint changed and contains no typed credential values.
Call read_page and verify the page is authenticated/advanced enough for the next automated step.
Restart OpenChrome and call oc_handoff_status({handoff_id}); verify the finished handoff metadata persists.
If TaskRun is available, call oc_task_run_get and verify the handoff evidence pointer is attached.
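The "contains no typed credential values" check in the steps above can be automated by scanning a serialized result for known secrets. This is a rough sketch: findLeakedSecrets is a hypothetical helper, the sample result object is invented for illustration, and 'demo-user' and 'demo-password' are placeholders rather than the real demo credentials.

```typescript
// Hypothetical helper: returns every secret found in the serialized payload.
// An empty array means no listed secret was detected.
function findLeakedSecrets(payload: unknown, secrets: string[]): string[] {
  const serialized =
    typeof payload === 'string' ? payload : (JSON.stringify(payload) ?? '');
  return secrets.filter((secret) => {
    if (secret.length === 0) return false;
    // JSON.stringify escapes quotes and backslashes, so check the escaped form too.
    const escaped = JSON.stringify(secret).slice(1, -1);
    return serialized.includes(secret) || serialized.includes(escaped);
  });
}

// Illustrative result shape; the field values are made up for this example.
const sampleResult = {
  handoff_id: 'h-123',
  status: 'FINISHED',
  delta_summary: 'URL changed; DOM fingerprint changed',
};
const leaks = findLeakedSecrets(sampleResult, ['demo-user', 'demo-password']);
```

A substring scan only catches secrets that appear verbatim; it would miss encoded or hashed forms, so it complements the rule of never capturing values rather than replacing it.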
Success definition
Merge is successful when a manual login or recovery step can be represented as a safe, durable checkpoint that helps the agent resume without exposing secrets or changing normal OpenChrome tool behavior.
Curated scope, overlap handling, and verification checklist
Scope classification
Canonical lane: human handoff checkpointing / secret-safe resume evidence.
Primary deliverable: opt-in oc_handoff_start/status/finish/cancel flow that records state deltas and resume context after human intervention.
Out of scope
Dependencies
run_id linkage.
feat/1040-handoff-resume). Continue there; do not split a competing handoff toolset.
Overlap and conflict resolution
Implementation checklist
Success criteria
Post-merge OpenChrome live verification checklist
Call oc_handoff_start with a non-secret reason, manually change a fixture page state, then call oc_handoff_finish.
oc_handoff_status reports reason, state delta/resume hint, and linked run/session/tab without raw typed input.
Call oc_handoff_cancel and verify no resume checkpoint is marked successful.