Tier: core (additive metadata; default tools/list unchanged)
PR target: develop
Background
src/tools/index.ts registers ~70 tools at server start (exact count depends on progressive-disclosure state via expand_tools). The tools/list MCP response is large, and agents frequently mis-select low-value tools (workflow_*, oc_recording_*, crawl_sitemap); today the Hint Engine corrects this at runtime, which costs tokens.
playwright-mcp solves the same problem with a per-tool capability field and --caps=... opt-in (microsoft/playwright-mcp v0.0.75, filteredTools() in playwright-core/src/tools/utils/mcp/server.ts).
P2 (zero-impact extension) compliant: tool definitions and behavior unchanged, only exposure is gated. Default npm start produces a byte-identical tools/list to v1.11.0.
Interaction with expand_tools (progressive disclosure)
expand_tools is preserved. Filter order:
- Capability filter (--tools-only / --disable-tools) defines the maximum set the agent can ever see.
- expand_tools operates within that set: it can reveal capability-allowed tools that are hidden by default, but never tools the capability filter excluded.
Conflict rule: capability filter wins. --tools-only=core + expand_tools(workflow_init) → tool stays hidden, structured error CAPABILITY_DISABLED.
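The conflict rule can be sketched as a gate that expand_tools consults before revealing anything. This is a minimal illustration, not the proposed implementation: the ToolDef shape, the expandTool function, the visible set, and the thrown error for unknown names are all hypothetical. Only the capability names and the CAPABILITY_DISABLED error shape come from this issue.

```typescript
// Hypothetical shapes for illustration; the real ToolDefinition lives in src/tools.
type Capability =
  | 'core' | 'crawl' | 'recording' | 'workflow'
  | 'storage' | 'profile' | 'totp' | 'pilot';

interface ToolDef {
  name: string;
  capability?: Capability; // absent is treated as 'core'
}

type ExpandOutcome =
  | { revealed: string }
  | { code: 'CAPABILITY_DISABLED'; capability: Capability };

// Reveal a hidden tool only if its capability survived the CLI filter.
function expandTool(
  name: string,
  registry: ToolDef[],
  allowed: ReadonlySet<Capability>,
  visible: Set<string>,
): ExpandOutcome {
  const tool = registry.find((t) => t.name === name);
  if (!tool) {
    // Assumption: unknown names are a caller error, outside this issue's scope.
    throw new Error(`Unknown tool: ${name}`);
  }
  const capability = tool.capability ?? 'core';
  // Capability filter wins: check it before touching the visible set.
  if (!allowed.has(capability)) {
    return { code: 'CAPABILITY_DISABLED', capability };
  }
  visible.add(name);
  return { revealed: name };
}
```

Checking the capability before mutating the visible set keeps a rejected call free of side effects, so a later tools/list cannot include the excluded tool.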
Proposed Implementation
- Extend ToolDefinition with capability: 'core' | 'crawl' | 'recording' | 'workflow' | 'storage' | 'profile' | 'totp' | 'pilot'. Absent → 'core' for backward compat (P1).
- Tag every registered tool in src/tools/*.ts. Initial grouping (the PR must validate against the actual registry):
  - core: navigate, read_page, query_dom, find, inspect, interact, computer, act, fill_form, form_input, javascript_tool, page_*, screenshot, journal, tabs_*, oc_connection_health, oc_session_resume/snapshot, oc_assert, oc_evidence_bundle, oc_checkpoint, wait_for, console_capture, validate_page, request_intercept, network*, emulate_device, file_upload, drag_drop, http_auth, user_agent, geolocation, lightweight_scroll, memory, extract_data, performance_metrics, page_reload, expand_tools
  - storage: cookies, storage
  - profile: list_profiles, oc_profile_status
  - crawl: crawl, crawl_sitemap, batch_execute, batch_paginate, worker_update, worker_complete
  - recording: oc_recording_start, oc_recording_stop, oc_recording_list, oc_recording_export
  - workflow: workflow_init, workflow_status, workflow_collect, workflow_collect_partial, workflow_cleanup, execute_plan
  - totp: oc_totp_generate
- CLI flags (src/index.ts, commander chain; current flags at ~line 72–98):
  - --tools-only <csv>: exposes only listed capabilities
  - --disable-tools <csv>: removes listed capabilities
- New lint: npm run lint:tools-capabilities asserts every registered tool has a capability tag (CI-enforced).
- Filtering at registerTools(): apply capability filter; expand_tools enforces the gate per rule above.
- Backward compat: when neither flag is set, tools/list is byte-identical to v1.11.0. Snapshot committed at src/tools/__tests__/__snapshots__/tools-list.v1.11.snap.json.
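A rough sketch of how the two flags might resolve into an allowed-capability set and then filter the registry. The helper names (parseCsv, resolveAllowed, filterTools, untaggedTools) are hypothetical. Two behaviors here are assumptions the issue does not specify: when both flags are given, --disable-tools is applied after --tools-only, and unknown capability names are rejected.

```typescript
const ALL_CAPABILITIES = [
  'core', 'crawl', 'recording', 'workflow',
  'storage', 'profile', 'totp', 'pilot',
] as const;
type Capability = (typeof ALL_CAPABILITIES)[number];

// Hypothetical minimal shape; the real ToolDefinition has more fields.
interface ToolDefinition {
  name: string;
  capability?: Capability; // absent is treated as 'core' (P1)
}

// Parse a CSV flag value; assumption: unknown names are an error.
function parseCsv(csv?: string): Capability[] {
  if (!csv) return [];
  return csv
    .split(',')
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .map((s) => {
      if (!ALL_CAPABILITIES.some((c) => c === s)) {
        throw new Error(`Unknown capability: ${s}`);
      }
      return s as Capability;
    });
}

// Assumption: --tools-only narrows first, then --disable-tools removes.
function resolveAllowed(toolsOnly?: string, disableTools?: string): Set<Capability> {
  const only = parseCsv(toolsOnly);
  const allowed = new Set<Capability>(only.length > 0 ? only : ALL_CAPABILITIES);
  for (const cap of parseCsv(disableTools)) allowed.delete(cap);
  return allowed;
}

// With neither flag set, every capability is allowed and the list is unchanged.
function filterTools(
  tools: ToolDefinition[],
  allowed: ReadonlySet<Capability>,
): ToolDefinition[] {
  return tools.filter((t) => allowed.has(t.capability ?? 'core'));
}

// What a lint:tools-capabilities check could report: tools with no explicit tag.
function untaggedTools(tools: ToolDefinition[]): string[] {
  return tools.filter((t) => t.capability === undefined).map((t) => t.name);
}
```

Because filterTools only drops entries and never reorders or rewrites them, the no-flag path returns the registry as-is, which is what the byte-identical snapshot requirement needs.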
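Acceptance Criteria
- Every registered tool has a capability field (lint:tools-capabilities CI-green)
- --tools-only, --disable-tools flags implemented in src/index.ts
- expand_tools rejects capability-excluded tools with CAPABILITY_DISABLED error
- Tests in src/tools/__tests__/capability-filter.spec.ts cover the 6 verification scenarios below
- Default tools/list byte-identical to v1.11.0 baseline (snapshot diff is empty)
- npm run lint:tier passes
- PR targets develop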
Verification (post-merge, executable scripts)
Setup
git checkout develop && git pull && npm ci && npm run build
node dist/index.js --http 9876 & PID=$!
mcp() {
  curl -s -H "content-type: application/json" \
    -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' \
    http://localhost:9876/mcp
}
Scenario 1 — default surface unchanged (regression)
mcp | jq -S '.result.tools | map(.name) | sort' > /tmp/tools-default.json
diff /tmp/tools-default.json \
<(jq -S '.tools | map(.name) | sort' \
src/tools/__tests__/__snapshots__/tools-list.v1.11.snap.json)
Pass: diff is empty.
Scenario 2 — --tools-only core
Restart server with --tools-only core, capture tools/list.
Pass: mcp | jq '.result.tools[].name' | grep -E '^"(workflow_|oc_recording_|crawl)' returns no matches.
Scenario 3 — --disable-tools workflow,recording
Restart server with --disable-tools workflow,recording.
Pass: no tool starts with workflow_ or oc_recording_; all core tools still present (count matches default minus the two groups exactly).
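The "default minus the two groups exactly" check can be read as a set comparison rather than a bare count. A small sketch under assumptions: expectedAfterDisable and sameNames are hypothetical helpers, and the name-to-capability map would come from the tagged registry.

```typescript
// Expected surface = default names minus every tool in a disabled capability.
// Untagged names fall back to 'core', matching the backward-compat rule.
function expectedAfterDisable(
  defaultNames: string[],
  capabilityOf: Record<string, string | undefined>,
  disabled: string[],
): string[] {
  return defaultNames
    .filter((n) => disabled.indexOf(capabilityOf[n] ?? 'core') === -1)
    .sort();
}

// Order-insensitive comparison of two name lists.
function sameNames(a: string[], b: string[]): boolean {
  const x = a.slice().sort();
  const y = b.slice().sort();
  return x.length === y.length && x.every((n, i) => n === y[i]);
}
```

Comparing names instead of counts also catches the case where one tool is wrongly removed and another wrongly kept, which a count alone would miss.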
Scenario 4 — expand_tools respects capability gate
With --tools-only core active, call expand_tools({name: "workflow_init"}).
Pass: response is structured error { code: "CAPABILITY_DISABLED", capability: "workflow" }; subsequent tools/list does not include workflow_init.
Scenario 5 — tools/list byte reduction (explicit methodology)
# Default
DEFAULT_BYTES=$(mcp | wc -c)
# Core only (restart with --tools-only core)
CORE_BYTES=$(mcp | wc -c)
echo "default=$DEFAULT_BYTES core=$CORE_BYTES reduction=$((100 - 100*CORE_BYTES/DEFAULT_BYTES))%"
Pass: reduction ≥ 25%. Both numbers documented in the PR description.
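For reference, the shell arithmetic above uses integer division, which can be mirrored as follows. The function name and the byte counts in the usage note are hypothetical illustrations, not measurements.

```typescript
// Mirrors $((100 - 100*CORE_BYTES/DEFAULT_BYTES)): shell integer division
// truncates the quotient before it is subtracted from 100.
function reductionPercent(defaultBytes: number, coreBytes: number): number {
  if (!Number.isInteger(defaultBytes) || defaultBytes <= 0) {
    throw new RangeError('defaultBytes must be a positive integer');
  }
  if (!Number.isInteger(coreBytes) || coreBytes < 0) {
    throw new RangeError('coreBytes must be a non-negative integer');
  }
  return 100 - Math.trunc((100 * coreBytes) / defaultBytes);
}
```

With hypothetical sizes of 120000 and 84000 bytes this yields 30. Because the quotient is truncated, the reported reduction can overstate the true one by up to one point: 120000 and 90001 bytes report 25 although the actual reduction is just under 25%, so documenting both raw numbers in the PR description matters.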
Scenario 6 — synthetic skill replay
Setup script (committed to repo at scripts/verify/cap-filter-skill.ts): record a 3-step skill using core-only tools (navigate https://example.com → read_page → interact "More information…"). Then replay it under --tools-only core.
Pass: skill completes; zero MISSING_TOOL / CAPABILITY_DISABLED errors; outcome contract verdict matches the recording.
Issue closure criteria
All 6 scenarios pass + CI green + snapshot + setup script committed.
Out of scope (deferred)
- --tool-allowlist / --tool-blocklist per-tool overrides: capability grouping is sufficient for v1; file follow-up if needed.
- Removing/renaming any tool (P1 violation)
- Auto-detecting capability needs (agent-side concern)
References
- playwright-mcp filter: microsoft/playwright/packages/playwright-core/src/tools/utils/mcp/server.ts
- Repo: src/tools/index.ts, src/index.ts:72-98
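OpenChrome live verification checklist
Re-verification completed 2026-05-14. Only items that could be confirmed directly from the latest origin/develop code, targeted Jest/lint runs, live OpenChrome CLI calls, and localhost fixture artifacts were used as grounds for closing.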
Verification targets
Verification evidence
- npm run build passed.
- npm run lint:tier passed: 521 modules / 1239 dependencies, no dependency violations.
- npm run lint:tool-schemas passed: 82 baselined violations, 0 new.
- oc_connection_health connected; localhost fixture navigate succeeded.
Per-issue code/test evidence
Artifacts
- .omx/reverify-evidence/targeted-jest.log
- .omx/reverify-evidence/lint-tier.log
- .omx/reverify-evidence/lint-tool-schemas.log
- .omx/reverify-evidence/openchrome-live-smoke.log