Bridge risky browser actions to pilot irreversible-action confirmation #1003

@shaun0927

Background

Open Interpreter emits a confirmation chunk before executing generated code. OpenChrome already has a pilot beforeIrreversibleAction hook, but real browser tools and risky browser actions do not yet have a clear path through it to human/operator confirmation.

Scope

Bridge risky browser actions into the existing pilot irreversible-action contract.

Candidate action classes:

  • Authenticated form submit when the page appears to perform purchase, transfer, delete, publish, send, or account/security change.
  • file_upload to non-localhost authenticated domains.
  • javascript_tool or plan steps that intentionally mutate app state.
  • Batch operations that close/delete many tabs or submit many forms.
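The candidate classes above could be turned into a deterministic classifier along the following lines. This is a minimal sketch, not the OpenChrome implementation: `ToolCallInfo`, `classifyToolCall`, the keyword list, and `BATCH_THRESHOLD` are all hypothetical names and values chosen for illustration. It also assumes the caller already knows whether the session is authenticated, and it leaves out `javascript_tool` mutation detection, which is harder to decide from metadata alone. Submit text is checked regardless of authentication so that a local "Delete account" fixture still classifies as critical; that is a simplification of the first bullet.

```typescript
// All names below are hypothetical; none are taken from the OpenChrome codebase.
type RiskClass = "critical" | "normal";

interface ToolCallInfo {
  tool: string;            // e.g. "interact" or "file_upload"
  url: string;             // page URL at the time of the call
  authenticated: boolean;  // assumed to be supplied by the caller
  submitText?: string;     // visible text of the clicked or submitted control
  batchSize?: number;      // number of tabs or forms a batch operation touches
}

// Conservative keyword list mirroring the action classes named in this issue.
const RISKY_TEXT = /\b(purchase|payment|pay|transfer|delete|publish|send)\b/i;

// Assumed threshold for "many" tabs or forms; the issue does not define one.
const BATCH_THRESHOLD = 5;

function isLocalhost(url: string): boolean {
  try {
    const host = new URL(url).hostname;
    return host === "localhost" || host === "127.0.0.1" || host === "[::1]";
  } catch {
    return false; // unparsable URL: treat as non-local
  }
}

function classifyToolCall(call: ToolCallInfo): RiskClass {
  if (call.tool === "file_upload") {
    // Uploads are critical only on authenticated, non-localhost pages.
    return call.authenticated && !isLocalhost(call.url) ? "critical" : "normal";
  }
  if ((call.batchSize ?? 0) >= BATCH_THRESHOLD) {
    return "critical";
  }
  if (call.submitText !== undefined && RISKY_TEXT.test(call.submitText)) {
    return "critical";
  }
  return "normal";
}
```

Word-boundary matching keeps labels such as "Sender details" from matching "send", which is the kind of false positive the unit tests in this issue are meant to catch.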

Expected behavior when enabled:

  • Before the risky action executes, evaluate precondition evidence.
  • If a registered hook denies or defers, return a clear aborted_by_hook / await-human result with an external token when available.
  • If no hook is registered, default behavior remains pass-through for backward compatibility.
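The expected behavior above amounts to a small decision table, sketched below as a pure function. The shapes are assumptions: only `proceed: false` appears in this issue, so `awaitHuman`, `token`, `reason`, and the `GateOutcome` type are hypothetical and the real beforeIrreversibleAction contract may differ. Passing `undefined` as the decision stands for "no hook registered".

```typescript
// Hypothetical decision shape; only `proceed` is mentioned in this issue.
interface HookDecision {
  proceed: boolean;
  awaitHuman?: boolean; // assumed flag for a deferred (await-human) response
  token?: string;       // assumed external token for resuming later
  reason?: string;
}

type GateOutcome =
  | { status: "proceed" }
  | { status: "aborted_by_hook"; reason?: string }
  | { status: "await_human"; token?: string };

// Maps (risk class, hook decision) to what the tool should do next.
// `decision` is undefined when no hook is registered.
function resolveOutcome(
  critical: boolean,
  decision: HookDecision | undefined
): GateOutcome {
  // Non-critical actions and the no-hook case pass through unchanged.
  if (!critical || decision === undefined) {
    return { status: "proceed" };
  }
  if (decision.proceed) {
    return { status: "proceed" };
  }
  if (decision.awaitHuman) {
    return { status: "await_human", token: decision.token };
  }
  return { status: "aborted_by_hook", reason: decision.reason };
}
```

A caller would run the browser action only when the outcome status is `"proceed"` and otherwise return the outcome to the client as the machine-readable result.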

Non-goals

  • Do not block ordinary navigation, reading, screenshots, or harmless form filling by default.
  • Do not add a hard-coded product-specific policy list beyond conservative keyword/action classification.
  • Do not require pilot users to configure credentials or external approval systems.
  • Do not change core behavior when --pilot / contract runtime is not enabled.

Implementation checkpoints

  • Define deterministic risk classification for candidate tool calls.
  • Route classified critical actions through runWithContract / beforeIrreversibleAction only when pilot contract runtime is enabled.
  • Include pre-action evidence: URL, domain, visible submit text, form/action metadata, and relevant DOM snippet.
  • Return machine-readable denial/defer status and visible fallback text.
  • Add unit tests for classification false positives/false negatives.
  • Add integration tests proving disabled/default path is unchanged.
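For the pre-action evidence checkpoint, one possible shape is sketched below, together with a simple redaction pass so that the payload can be logged without obvious secrets. The field names, `buildEvidence`, and the redaction rules are assumptions for illustration only; the actual hook payload is still to be documented.

```typescript
// Hypothetical evidence shape based on the fields listed in this issue.
interface PreActionEvidence {
  url: string;
  domain: string;
  submitText: string;
  formAction: string;
  formMethod: string;
  domSnippet: string;
}

interface EvidenceInput {
  url: string;
  submitText: string;
  formAction: string;
  formMethod: string;
  domSnippet: string;
}

// Query parameter names that are assumed to carry secrets.
const SENSITIVE_PARAM = /(token|secret|password|passwd|key|session|auth)/i;

function redactUrl(raw: string): string {
  const u = new URL(raw);
  // Copy the keys first so the parameters can be rewritten safely.
  for (const name of Array.from(u.searchParams.keys())) {
    if (SENSITIVE_PARAM.test(name)) {
      u.searchParams.set(name, "REDACTED");
    }
  }
  return u.toString();
}

function redactSnippet(html: string, maxLen: number = 500): string {
  // Mask every value attribute; this is coarse but avoids leaking typed input.
  const masked = html.replace(
    /\bvalue\s*=\s*("[^"]*"|'[^']*')/gi,
    'value="REDACTED"'
  );
  return masked.length > maxLen ? masked.slice(0, maxLen) + "..." : masked;
}

function buildEvidence(input: EvidenceInput): PreActionEvidence {
  return {
    url: redactUrl(input.url),
    domain: new URL(input.url).hostname,
    submitText: input.submitText,
    formAction: input.formAction,
    formMethod: input.formMethod,
    domSnippet: redactSnippet(input.domSnippet),
  };
}
```

Masking all `value` attributes is deliberately blunt. A real implementation might keep non-sensitive values so the operator has more context, at the cost of a more careful allow-list.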

Acceptance criteria

  • With pilot runtime disabled, existing tool behavior is unchanged.
  • With pilot runtime enabled and a deny hook registered, a classified critical action does not execute.
  • With an await-human hook response, the result includes a token or equivalent resumable reference.
  • The audit trail records the decision and evidence without leaking obvious secrets.
  • Non-critical actions are not routed through the hook.

Required verification before merge

  • npm run build
  • Unit tests for risk classification and hook decisions.
  • Integration test with a local HTML page containing a “Delete account” or “Submit payment” button that verifies the click is blocked under a deny hook.
  • npm run lint:tier

Post-merge real OpenChrome verification

  1. Serve a local test page with a button labeled “Delete account” that increments a visible counter only if clicked.
  2. Start OpenChrome with pilot contract runtime enabled:
    npm run build
    node dist/cli/index.js serve --http 3100 --auto-launch --server-mode --pilot
  3. Register or configure a test hook that returns proceed: false.
  4. Use OpenChrome to navigate to the local page and attempt the critical click.
  5. Verify the counter did not change, the response reports hook abortion, and audit/log output contains the decision.
  6. Repeat with the hook reset/defaulted and verify the click can proceed.
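The local test page in step 1 could be served with a small Node script such as the sketch below. It uses only the standard `node:http` module; `fixtureHtml`, `startFixture`, the element ids, and the port in the usage note are hypothetical and not part of OpenChrome.

```typescript
import { createServer, Server } from "node:http";

// Page with a "Delete account" button that bumps a visible counter on click.
function fixtureHtml(): string {
  return `<!doctype html>
<html>
  <head><meta charset="utf-8"><title>Irreversible action fixture</title></head>
  <body>
    <p>Clicks: <span id="counter">0</span></p>
    <button id="delete-account" type="button">Delete account</button>
    <script>
      var counter = document.getElementById("counter");
      document.getElementById("delete-account").addEventListener("click", function () {
        counter.textContent = String(Number(counter.textContent) + 1);
      });
    </script>
  </body>
</html>`;
}

// Serves the fixture on 127.0.0.1 for every request path.
function startFixture(port: number): Server {
  const server = createServer((_req, res) => {
    res.writeHead(200, { "content-type": "text/html; charset=utf-8" });
    res.end(fixtureHtml());
  });
  return server.listen(port, "127.0.0.1");
}
```

Calling `startFixture(8099)` (an arbitrary port) would make the page available at http://127.0.0.1:8099/. If the counter still reads 0 after the attempted click in step 4, the click side effect did not happen.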

Direction-fit review

This uses an existing pilot safety boundary rather than adding a new permission model. It is appropriate only as opt-in pilot behavior because it can affect side-effecting browser actions.

Curated scope, overlap handling, and verification checklist

Scope classification

  • Canonical lane: pilot irreversible-action safety bridge.
  • Primary deliverable: routing risky browser actions through the existing pilot beforeIrreversibleAction confirmation hook before execution.
  • Open PR: feat(pilot): gate risky interact clicks with irreversible-action hook #1096 (feat/1003-irreversible-confirmation). Continue there; do not create duplicate implementation work.
  • Non-goal: new policy engine, blanket blocking, CAPTCHA/login automation, or requiring confirmation for low-risk actions.

Overlap and conflict resolution

  • Use the existing pilot hook; do not invent a second irreversible-action API.
  • Coordinate with action tools and plan execution but keep this issue limited to pre-execution confirmation wiring.
  • Keep default behavior unchanged when no hook or opt-in policy is enabled, unless the pilot contract already specifies otherwise.

Implementation checklist

  • Identify risky action classes: purchase/transfer/delete/publish/send/account changes, file uploads to authenticated non-localhost domains, mutating javascript_tool/plan steps, and batch destructive operations.
  • Evaluate precondition evidence before execution and call beforeIrreversibleAction when risk criteria match.
  • Return clear aborted_by_hook or await-human results with external token/handle when denied or deferred.
  • Add tests for allowed action, denied action, deferred action, no-hook behavior, false-positive low-risk click, and redaction of sensitive context.
  • Document hook payload shape and live verification procedure.

Success criteria

  • Risky actions are intercepted before side effects when the hook is enabled.
  • Hook denial/defer returns a clear result without performing the action.
  • Low-risk actions and no-hook default behavior remain compatible.
  • Risk evidence is sufficient for operator review without leaking secrets.

Post-merge OpenChrome live verification checklist

  • Configure a test hook, open a local fixture with a destructive-looking submit, and verify execution is deferred/denied before click side effect.
  • Run a safe fixture click and verify it proceeds normally.
  • Run with no hook and verify documented default behavior.
  • Capture hook payload/result and fixture side-effect evidence in merge notes.

Metadata

Assignees

No one assigned

    Labels

    harness (Execution harness, run lifecycle, recovery, and verification), task (Implementation task), verification (Requires explicit verification evidence)
