docs(agent): auto-generated capability map for LLM preamble (#826) #927
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| name: capability-map | ||
| on: | ||
| pull_request: | ||
| branches: [develop, main] | ||
| paths: | ||
| - 'src/tools/**' | ||
| - 'src/types/mcp.ts' | ||
| - 'src/pilot/handoff/tool.ts' | ||
| - 'src/pilot/handoff/definitions.ts' | ||
| - 'scripts/gen-capability-map.ts' | ||
| Comment on lines +6 to +10: Broaden the … | ||
| - 'docs/agent/capability-map.md' | ||
| Comment on lines +6 to +11: The … | ||
| jobs: | ||
| drift-check: | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| - uses: actions/setup-node@v4 | ||
| with: | ||
| node-version: 20 | ||
| cache: npm | ||
| - run: npm ci --prefer-offline --no-audit | ||
| - run: npm run gen:capability-map | ||
| - name: Verify no drift | ||
| run: | | ||
| if ! git diff --exit-code docs/agent/capability-map.md; then | ||
| echo "::error::capability-map drift — run 'npm run gen:capability-map' and commit the result" | ||
| exit 1 | ||
| fi | ||
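
The `Verify no drift` step amounts to comparing the committed map with freshly generated output. To run the same comparison from a script or a unit test, a minimal sketch could look like the following. `firstDriftLine` is a hypothetical helper, not part of the repository, and it reports only the first differing line rather than a full diff.

```typescript
// Hypothetical helper: returns the 1-based number of the first line that
// differs between the committed map and a regenerated one, or null if the
// two texts are identical line by line.
function firstDriftLine(committed: string, regenerated: string): number | null {
  const a = committed.split('\n');
  const b = regenerated.split('\n');
  const max = Math.max(a.length, b.length);
  for (let i = 0; i < max; i++) {
    // A missing line on either side (undefined) also counts as drift.
    if (a[i] !== b[i]) return i + 1;
  }
  return null;
}

// Example: the regenerated text gained a second line.
const drift = firstDriftLine('## tabs', '## tabs\n- `tabs_create`: Create a new tab with URL.');
console.log(drift === null ? 'no drift' : `drift at line ${drift}`);
```

A caller would read `docs/agent/capability-map.md` and the generator's output into the two arguments. The CI job gets the same answer from `git diff --exit-code`.
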
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| # openchrome agent preamble | ||
| | ||
| `capability-map.md` is an auto-generated, drift-guarded summary of every MCP | ||
| tool exposed by openchrome. It is designed to be prepended to an agent's system | ||
| prompt so the agent knows what tools are available without calling `tools/list`. | ||
| | ||
| ## Loading the capability map with the Anthropic SDK | ||
| | ||
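| ```typescript | ||
| import Anthropic from '@anthropic-ai/sdk'; | ||
| import * as fs from 'fs'; | ||
| import * as path from 'path'; | ||
| | ||
| // Read the generated map from the directory containing this script (`__dirname` is CommonJS-only). | ||
| const capabilityMap = fs.readFileSync( | ||
| path.join(__dirname, 'capability-map.md'), | ||
| 'utf8' | ||
| ); | ||
| | ||
| // The client reads ANTHROPIC_API_KEY from the environment by default. | ||
| const client = new Anthropic(); | ||
| | ||
| // Top-level `await` is not available in CommonJS, so the call is wrapped in an async function. | ||
| async function main(): Promise<void> { | ||
| const response = await client.messages.create({ | ||
| model: 'claude-sonnet-4-5', | ||
| max_tokens: 4096, | ||
| system: `You are a browser-automation agent with access to the openchrome MCP server.\n\n${capabilityMap}`, | ||
| messages: [{ role: 'user', content: 'Navigate to https://example.com and return the page title.' }], | ||
| }); | ||
| console.log(response.content); | ||
| } | ||
| | ||
| main().catch(console.error); | ||
| ``` | ||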
| | ||
| ## Keeping the map up to date | ||
| | ||
| The map is regenerated from the live tool registry: | ||
| | ||
| ```bash | ||
| npm run gen:capability-map | ||
| ``` | ||
| | ||
| A CI workflow (`.github/workflows/capability-map.yml`) fails any PR that | ||
| modifies tool files without regenerating the map, preventing drift between | ||
| source and documentation. | ||
| | ||
| ## File constraints | ||
| | ||
| - Maximum size: **6 144 bytes** (fits comfortably in a system-prompt slot). | ||
| - If params lines push the file over the limit, the generator automatically | ||
| drops them and retains tool names + descriptions only. | ||
| - `expand_tools` is intentionally excluded — it is a server-injected | ||
| progressive-disclosure hint, not a stable registered tool. |
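
The size fallback described under "File constraints" can be sketched as follows. This is a minimal illustration, not the PR's generator. `ToolEntry`, `render`, and `renderWithinBudget` are hypothetical names, and the `params:` line format is invented for the example. Throwing when even the compact form is too large is also an assumption, since the README does not say what happens in that case.

```typescript
// Hypothetical shape for one tool entry; the real generator's types are not shown in the PR.
interface ToolEntry {
  name: string;
  description: string;
  params?: string; // optional one-line parameter summary (assumed format)
}

const MAX_BYTES = 6144; // limit stated in the README

// The limit is expressed in bytes, so measure the UTF-8 encoding, not string length.
function byteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}

function render(tools: ToolEntry[], withParams: boolean): string {
  return tools
    .map((t) => {
      const line = `- \`${t.name}\`: ${t.description}`;
      return withParams && t.params ? `${line}\n  params: ${t.params}` : line;
    })
    .join('\n');
}

// Try the detailed form first; fall back to names + descriptions when over budget.
function renderWithinBudget(tools: ToolEntry[], maxBytes: number = MAX_BYTES): string {
  const detailed = render(tools, true);
  if (byteLength(detailed) <= maxBytes) return detailed;
  const compact = render(tools, false);
  if (byteLength(compact) > maxBytes) {
    // Assumption: fail loudly rather than emit an oversized file.
    throw new Error(`capability map is ${byteLength(compact)} bytes even without params`);
  }
  return compact;
}
```

Measuring bytes rather than characters matters here because the generated descriptions contain non-ASCII characters such as `…`, which take more than one byte in UTF-8.
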
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,134 @@ | ||
| <!-- generated by scripts/gen-capability-map.ts from src/tools/index.ts — do not edit --> | ||
| # openchrome MCP tools (auto-generated) | ||
| | ||
| ## dom | ||
| - `find`: Find elements by query. Retur… | ||
| - `page_content`: Get HTML content from page or… | ||
| - `query_dom`: Query DOM elements via CSS se… | ||
| - `vision_find`: Find elements using vision-ba… | ||
| | ||
| ## evidence | ||
| - `oc_assert`: Evaluate a single Outcome Con… | ||
| - `oc_evidence_bundle`: Capture a snapshot of the cur… | ||
| - `oc_skill_recall`: Retrieve skills from the JSON… | ||
| - `oc_skill_record`: Record a skill (domain, name,… | ||
| | ||
| ## forms | ||
| - `file_upload`: Upload files to a file input … | ||
| - `fill_form`: Fill form fields and optional… | ||
| - `form_input`: Set one form element value by… | ||
| | ||
| ## interact | ||
| - `act`: Execute multi-step browser ac… | ||
| - `computer`: Mouse, keyboard, and screensh… | ||
| - `drag_drop`: Drag and drop by selector or … | ||
| - `interact`: Find an element by natural la… | ||
| - `lightweight_scroll`: Scroll page via JS. Returns n… | ||
| | ||
| ## js | ||
| - `javascript_tool`: Execute JavaScript in page co… | ||
|
| | ||
| - `oc_checkpoint`: Save or load an automation ch… | ||
| - `oc_connection_health`: Get CDP connection health met… | ||
| - `oc_journal`: Query the tool call journal. … | ||
| - `oc_reap_orphans`: Manually sweep and terminate … | ||
| - `oc_session_resume`: Restore working context after… | ||
| - `oc_session_snapshot`: Save browser state snapshot f… | ||
| - `oc_stop`: Shut down OpenChrome and clos… | ||
| - `page_reload`: Reload the current page. | ||
| - `wait_for`: Wait for a condition. Strongl… | ||
| | ||
| ## misc | ||
| - `batch_execute`: Execute JS across multiple ta… | ||
| - `batch_paginate`: Extract content from paginate… | ||
| - `crawl`: Recursively crawl a website v… | ||
| - `crawl_cancel`: Mark a crawl job as cancelled… | ||
| - `crawl_sitemap`: Crawl a website using its sit… | ||
| - `crawl_start`: Initialise a resumable crawl … | ||
| - `crawl_status`: Advance a crawl job by up to … | ||
| - `execute_plan`: Execute a cached plan by ID, … | ||
| - `extract_data`: Extract structured data from … | ||
| - `network_capture_full`: Capture network requests with… | ||
| - `network_capture_lite`: Capture network request metad… | ||
| - `oc_context_export`: Export the active tab's auth-… | ||
| - `oc_context_import`: Strict-replace import of a `C… | ||
| - `oc_copy_to_clipboard`: Copy text to the system clipb… | ||
| - `oc_devtools_url`: Get the Chrome DevTools inspe… | ||
| - `oc_doctor_report`: Read the most recent openchro… | ||
| - `oc_get_connection_info`: Get connection configuration … | ||
| - `oc_normalize_action`: Validate and normalize a near… | ||
| - `oc_observe`: Deterministic, numbered list … | ||
| - `oc_open_host_settings`: Open the MCP connector settin… | ||
| - `oc_performance_analyze`: Drill into one named insight … | ||
| - `oc_performance_insights`: Capture a CDP performance tra… | ||
| - `oc_progress_status`: Read-only diagnostics for whe… | ||
| - `oc_recording_status`: Report whether session record… | ||
| - `oc_reflect`: Create, get, or list structur… | ||
| - `oc_run_events`: Return recent events for an o… | ||
| - `oc_run_finish`: Finish an opt-in OpenChrome r… | ||
| - `oc_run_start`: Start an opt-in OpenChrome ru… | ||
| - `oc_run_status`: Return the current status and… | ||
| - `oc_task_cancel`: Request cancellation of a bac… | ||
| - `oc_task_get`: Fetch a single task by task_i… | ||
| - `oc_task_list`: List background tasks in the … | ||
| - `oc_task_run_checkpoint`: Write a compact caller-provid… | ||
| - `oc_task_run_complete`: Enter a terminal TaskRun stat… | ||
| - `oc_task_run_get`: Read a TaskRun meta record an… | ||
| - `oc_task_run_list`: List recent TaskRuns sorted b… | ||
| - `oc_task_run_needs_help`: Move a non-terminal TaskRun t… | ||
| - `oc_task_run_start`: Start an opt-in goal-level Ta… | ||
| - `oc_task_run_update`: Update a non-terminal TaskRun… | ||
| - `oc_task_start`: Launch a long-running tool as… | ||
| - `oc_task_wait`: Block until the task reaches … | ||
| - `oc_totp_generate`: Generate a current TOTP 2FA c… | ||
| - `read_page`: Get page as DOM, accessibilit… | ||
| - `worker`: Manage workers. Actions: "cre… | ||
| - `worker_complete`: Mark a worker as complete wit… | ||
| - `worker_update`: Report worker progress to the… | ||
| - `workflow_cleanup`: Clean up workflow resources (… | ||
| - `workflow_collect`: Collect and aggregate results… | ||
| - `workflow_collect_partial`: Collect results from complete… | ||
| - `workflow_init`: Initialize a workflow with mu… | ||
| - `workflow_status`: Get current workflow status a… | ||
| | ||
| ## navigation | ||
| - `navigate`: Navigate to URL or go forward… | ||
| | ||
| ## observability | ||
| - `console_capture`: Capture browser console outpu… | ||
| - `inspect`: Extract focused page state by… | ||
| - `network`: Simulate network conditions. | ||
| - `page_pdf`: Generate PDF from page. Saves… | ||
| - `page_screenshot`: Save page screenshot to file … | ||
| - `performance_metrics`: Get page performance metrics. | ||
| - `request_intercept`: Intercept network requests (l… | ||
| - `validate_page`: Composite health check: navig… | ||
| | ||
| ## pilot | ||
| - `oc_pilot_handoff_create` — pilot: Pilot-tier: mint a single-use… | ||
| - `oc_pilot_handoff_redeem` — pilot: Pilot-tier: redeem a single-u… | ||
| | ||
| ## profile | ||
| - `emulate_device`: Emulate device viewport and U… | ||
| - `geolocation`: Set or clear geolocation over… | ||
| - `http_auth`: Set or clear HTTP auth creden… | ||
| - `list_profiles`: List available Chrome profile… | ||
| - `oc_profile_status`: Check browser profile type an… | ||
| - `user_agent`: Set or reset browser user age… | ||
| | ||
| ## recording | ||
| - `oc_recording_export`: Export a recording as JSON or… | ||
| - `oc_recording_list`: List available session record… | ||
| - `oc_recording_start`: Start a new session recording… | ||
| - `oc_recording_stop`: Stop the active session recor… | ||
| | ||
| ## storage | ||
| - `cookies`: Manage browser cookies (get, … | ||
| - `memory`: Manage domain knowledge. Acti… | ||
| - `storage`: Manage browser localStorage a… | ||
| | ||
| ## tabs | ||
| - `tabs_close`: Close one or more tabs by tab… | ||
| - `tabs_context`: Get session tab IDs grouped b… | ||
| - `tabs_create`: Create a new tab with URL. |
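
An agent host that loads this file may want the tool names grouped by category, for example to check that a tool exists before mentioning it in a prompt. The sketch below parses the layout shown above: `## category` headings followed by bullets whose first backticked token is the tool name. `parseCapabilityMap` and `CapabilityIndex` are hypothetical names and are not part of openchrome.

```typescript
// Parsed form of the generated map: category heading -> tool names in file order.
type CapabilityIndex = Map<string, string[]>;

function parseCapabilityMap(markdown: string): CapabilityIndex {
  const index: CapabilityIndex = new Map();
  let current: string | null = null;
  for (const rawLine of markdown.split('\n')) {
    const line = rawLine.trim();
    // Category headings use exactly two '#'; the single-'#' title line is skipped.
    const heading = /^## (.+)$/.exec(line);
    if (heading) {
      current = heading[1];
      index.set(current, []);
      continue;
    }
    // Bullets look like "- `name`: description…"; only the name is kept,
    // which also covers the pilot entries that use a different separator.
    const bullet = /^- `([^`]+)`/.exec(line);
    if (bullet && current !== null) {
      index.get(current)!.push(bullet[1]);
    }
  }
  return index;
}

// Example input copied from the generated file above.
const sampleMap = [
  '# openchrome MCP tools (auto-generated)',
  '',
  '## navigation',
  '- `navigate`: Navigate to URL or go forward…',
  '',
  '## tabs',
  '- `tabs_close`: Close one or more tabs by tab…',
  '- `tabs_create`: Create a new tab with URL.',
].join('\n');

const parsed = parseCapabilityMap(sampleMap);
console.log(parsed.get('tabs'));
```

Only names are extracted because the descriptions in the map are truncated and are not reliable for anything beyond display.
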

Expand this `pull_request.paths` list to cover the run-harness tool sources (at least `src/run-harness/tools.ts`). The capability map generator calls `registerAllTools()` (`src/tools/index.ts`), which conditionally registers run-harness tools via `isRunHarnessEnabled()`, and that flag defaults to enabled when `OPENCHROME_RUN_HARNESS` is unset (`src/run-harness/flags.ts`). As written, a PR that changes run-harness tool definitions can change `docs/agent/capability-map.md` without triggering this drift-check workflow, so stale capability-map output can merge undetected.
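
The gap described in this comment can be reproduced with a simplified model of the path filter. The patterns are copied from the workflow. `matchesPattern` and `triggersWorkflow` are hypothetical helpers that handle only exact paths and a trailing `/**`, which is all the workflow's list uses. GitHub's real filter syntax is richer, so treat this as an illustration, not a reimplementation.

```typescript
// Path filters copied from the workflow's pull_request.paths list.
const CURRENT_PATHS: string[] = [
  'src/tools/**',
  'src/types/mcp.ts',
  'src/pilot/handoff/tool.ts',
  'src/pilot/handoff/definitions.ts',
  'scripts/gen-capability-map.ts',
  'docs/agent/capability-map.md',
];

// Simplified matcher: exact path, or prefix match for patterns ending in '/**'.
function matchesPattern(file: string, pattern: string): boolean {
  if (pattern.endsWith('/**')) {
    // 'src/tools/**' -> prefix 'src/tools/'
    return file.startsWith(pattern.slice(0, -2));
  }
  return file === pattern;
}

// The workflow runs when at least one changed file matches at least one pattern.
function triggersWorkflow(changed: string[], patterns: string[]): boolean {
  return changed.some((file) => patterns.some((p) => matchesPattern(file, p)));
}

// A PR that touches only the run-harness tool definitions.
const runHarnessChange = ['src/run-harness/tools.ts'];

const before = triggersWorkflow(runHarnessChange, CURRENT_PATHS);
const after = triggersWorkflow(runHarnessChange, [...CURRENT_PATHS, 'src/run-harness/tools.ts']);
console.log({ before, after });
```

With the current list the drift check is skipped for such a PR. Adding the file the reviewer names is enough for the workflow to run.
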