Skip to content

feat(perf): cache-friendly compact responses for high-frequency state tools (browser-use adoption G1) #973

@shaun0927

Description

@shaun0927

Background

browser-use gets much of its speed from keeping repeated history/cacheable context stable and putting fresh browser state at the end of the prompt. OpenChrome cannot control every MCP host prompt, but it can make high-frequency tool outputs smaller, more stable, and easier for host-side prompt caching to reuse.

Related existing work: #893 (unified state header), #887 (large-output handles), #841 (tool descriptions), #829 (capability-gated tool surface). This issue is narrower: response-shape optimization for high-frequency browser-state tools.

Goal

Add a compact, cache-friendly response profile for high-frequency state tools without changing the default semantics of existing tool calls.

Scope

Implement a shared response envelope for at least:

  • read_page
  • inspect
  • query_dom
  • find
  • tabs_context

The compact envelope should separate stable identifiers from volatile state:

{
  "ok": true,
  "tool": "read_page",
  "tabId": "...",
  "url": "...",
  "stateVersion": "...",
  "domFingerprint": "...",
  "mode": "compact",
  "summary": "...",
  "refs": ["@e1", "@e2"],
  "counts": { "nodesVisited": 1234, "nodesEmitted": 120 },
  "next": { "full": "call read_page with mode=full", "delta": true }
}

Required behavior:

  • Add mode or verbosity option with values compact | normal | debug.
  • Preserve backward-compatible default behavior unless a tool already has a compact default.
  • Do not remove data required for recovery; expose omitted details through debug mode or an artifact/handle if feat(core): 2-stage fetch with output handles for large-output tools (apify-mcp adoption A) #887 is available.
  • Avoid repeating long static guidance text in each response.
  • Keep errors actionable but short; detailed stack/debug data should be gated behind debug.

Non-goals

  • Do not implement a new agent loop.
  • Do not depend on browser-use hosted models.
  • Do not change ref semantics or canonical interaction entrypoints.

Implementation notes

Success criteria

  • compact mode returns materially fewer characters than normal on the same page while preserving enough refs for the next interaction.
  • Existing tests for normal/default mode continue to pass.
  • At least one unit test asserts that compact mode includes tabId, url, stateVersion or equivalent fingerprint, mode, and actionable next-step hints.
  • At least one regression test asserts that debug mode still exposes diagnostic detail omitted from compact mode.

Real OpenChrome validation after implementation

Use a built local OpenChrome server connected to real Chrome.

  1. Build and start OpenChrome:
    npm run build
    node dist/cli/index.js serve
  2. In an MCP client connected to this server, run:
    • navigate to https://github.com/browser-use/browser-use
    • read_page with mode=normal
    • read_page with mode=compact
    • inspect with query="repository stars, latest commit, and main navigation" and mode=compact
  3. Capture evidence:
    • compact output character count
    • normal output character count
    • at least 3 usable element refs or explicit reason refs are absent
    • successful follow-up interaction using a compact-mode ref or selector
  4. Pass condition:
    • compact output is at least 40% smaller than normal on this page, or the PR explains why the page is already below threshold.
    • follow-up tool call succeeds without requiring a full read first.

Review checklist

Self-review clarifications (added before implementation)

  • Use a single argument name across tools: verbosity: "compact" | "normal" | "debug". If a tool already has a conflicting mode, do not overload it.
  • stateVersion is required in compact responses. It may be a documented hash/counter composed from URL + document/update counter + DOM fingerprint, but it must change when refs may be stale.
  • domFingerprint is required for DOM-derived responses; non-DOM tools may set it to null with a reason.
  • The first PR should implement compact mode for read_page and one focused tool (inspect or query_dom) before broad rollout. Additional tools may follow once the shape is stable.
  • The merge gate is not “40% smaller everywhere”; the required gate is: benchmark evidence is recorded, and if the page is already small, the PR explains why the threshold is not meaningful.

Curated scope, overlap handling, and verification checklist

Scope classification

  • Canonical lane: interaction/action reliability.
  • Primary deliverable: cache-friendly compact responses for high-frequency state tools (browser-use adoption G1).
  • Open PR: none currently linked in the active priority map; verify GitHub again before implementation.
  • Detected labels: enhancement, P1, performance, host-integration.
  • Affected OpenChrome surfaces from issue text: read_page, act, interact, find, query_dom, navigate.
  • Non-goal: breaking existing tool response compatibility, changing defaults without opt-in, or adding server-side autonomous planning.

Overlap and conflict resolution

Implementation checklist

  • Restate the exact contract for cache-friendly compact responses for high-frequency state tools (browser-use adoption G1) in code/docs before changing behavior.
  • Implement the narrow surface named by this issue before broadening to adjacent systems.
  • Preserve existing behavior by default and gate new behavior with explicit config/tool arguments where appropriate.
  • Add targeted unit/integration tests for success, failure, compatibility, and bounded output.
  • Add regression coverage for the issue-specific happy path, failure path, default/disabled path, and artifact/output bounds.
  • Update user-facing docs or inline tool descriptions when hosts must choose a new flag, mode, policy, or workflow.

Success criteria

  • The implementation satisfies the primary deliverable without broadening into non-goals.
  • Existing default behavior remains backward-compatible or the issue explicitly documents the compatibility break.
  • Failure cases return bounded, actionable diagnostics rather than silent fallback or unbounded dumps.
  • Tests/benchmarks cover the concrete surface named in this issue, not only helper utilities.
  • Any produced artifact is deterministic, redacted, and small enough for merge review or stored behind handles.

Post-merge OpenChrome live verification checklist

  • Run the documented local OpenChrome fixture or smoke path for cache-friendly compact responses for high-frequency state tools (browser-use adoption G1) and capture the exact command/tool calls.
  • Verify read_page behavior matches the issue goal in both the enabled path and the default/disabled compatibility path.
  • Inspect generated artifacts/logs/responses for bounded size, redaction, source links, and clear failure diagnostics.
  • Record sanitized output excerpts, artifact paths, and any benchmark/latency/payload numbers in merge verification notes.

Metadata

Metadata

Assignees

No one assigned

    Labels

    P1P1 highenhancementNew feature or requesthost-integrationWires module cores into host (CDP, MCP, tools, transports, OS APIs)performancePerformance, latency, throughput, or resource-use improvement

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions