feat(core): capability-gated tool surface (--tools-only / --disable-tools)

**Tier**: `core` (additive metadata; default `tools/list` unchanged)
**PR target**: `develop`

## Background

`src/tools/index.ts` registers ~70 tools at server start (exact count depends on progressive-disclosure state via `expand_tools`). The `tools/list` MCP response is large, and agents frequently mis-select low-value tools (`workflow_*`, `oc_recording_*`, `crawl_sitemap`); today Hint Engine corrects this at runtime token cost.

playwright-mcp solves the same problem with a per-tool `capability` field and `--caps=...` opt-in (`microsoft/playwright-mcp` v0.0.75, `filteredTools()` in `playwright-core/src/tools/utils/mcp/server.ts`).

**P2 (zero-impact extension) compliant**: tool definitions and behavior unchanged, only exposure is gated. Default `npm start` produces a byte-identical `tools/list` to v1.11.0.

## Interaction with `expand_tools` (progressive disclosure)

`expand_tools` is preserved. Filter order:

1. Capability filter (`--tools-only` / `--disable-tools`) defines the **maximum set** the agent can ever see.
2. `expand_tools` operates **within** that set: it can reveal capability-allowed tools that are hidden by default, but never tools the capability filter excluded.

Conflict rule: capability filter wins. `--tools-only=core` + `expand_tools(workflow_init)` → tool stays hidden, structured error `CAPABILITY_DISABLED`.

## Proposed Implementation

1. **Extend `ToolDefinition`** with `capability: 'core' | 'crawl' | 'recording' | 'workflow' | 'storage' | 'profile' | 'totp' | 'pilot'`. Absent → `'core'` for backward compat (P1).
2. **Tag every registered tool** in `src/tools/*.ts`. Initial grouping (the PR must validate against the actual registry):
   - **core**: navigate, read_page, query_dom, find, inspect, interact, computer, act, fill_form, form_input, javascript_tool, page_*, screenshot, journal, tabs_*, oc_connection_health, oc_session_resume/snapshot, oc_assert, oc_evidence_bundle, oc_checkpoint, wait_for, console_capture, validate_page, request_intercept, network*, emulate_device, file_upload, drag_drop, http_auth, user_agent, geolocation, lightweight_scroll, memory, extract_data, performance_metrics, page_reload, expand_tools
   - **storage**: cookies, storage
   - **profile**: list_profiles, oc_profile_status
   - **crawl**: crawl, crawl_sitemap, batch_execute, batch_paginate, worker_update, worker_complete
   - **recording**: oc_recording_start, oc_recording_stop, oc_recording_list, oc_recording_export
   - **workflow**: workflow_init, workflow_status, workflow_collect, workflow_collect_partial, workflow_cleanup, execute_plan
   - **totp**: oc_totp_generate
3. **CLI flags** (`src/index.ts`, commander chain — current flags at ~line 72–98):
   - `--tools-only <csv>` — exposes only listed capabilities
   - `--disable-tools <csv>` — removes listed capabilities
4. **New lint**: `npm run lint:tools-capabilities` asserts every registered tool has a capability tag (CI-enforced).
5. **Filtering** at `registerTools()`: apply capability filter; `expand_tools` enforces the gate per rule above.
6. **Backward compat**: when neither flag is set, `tools/list` is byte-identical to v1.11.0. Snapshot committed at `src/tools/__tests__/__snapshots__/tools-list.v1.11.snap.json`.

## Acceptance Criteria

- [x] Every registered tool has a non-empty `capability` field (`lint:tools-capabilities` CI-green)
- [x] `--tools-only`, `--disable-tools` flags implemented in `src/index.ts`
- [x] `expand_tools` rejects capability-excluded tools with `CAPABILITY_DISABLED` error
- [x] Unit tests in `src/tools/__tests__/capability-filter.spec.ts` cover the 6 verification scenarios below
- [x] Default `tools/list` byte-identical to v1.11.0 baseline (snapshot diff is empty)
- [x] `npm run lint:tier` passes
- [x] CHANGELOG (Unreleased) entry
- [x] PR targets `develop`

## Verification (post-merge, executable scripts)

### Setup
```bash
git checkout develop && git pull && npm ci && npm run build
node dist/index.js --http 9876 &  PID=$!
mcp() { curl -s -H "content-type: application/json" \
  -d "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/list\"}" \
  http://localhost:9876/mcp ; }
```

### Scenario 1 — default surface unchanged (regression)
```bash
mcp | jq -S '.result.tools | map(.name) | sort' > /tmp/tools-default.json
diff /tmp/tools-default.json \
     <(jq -S '.tools | map(.name) | sort' \
       src/tools/__tests__/__snapshots__/tools-list.v1.11.snap.json)
```
**Pass**: diff is empty.

### Scenario 2 — `--tools-only core`
Restart server with `--tools-only core`, capture tools/list.
**Pass**: `jq '.result.tools[].name' | grep -E '^"(workflow_|oc_recording_|crawl)'` returns no matches.

### Scenario 3 — `--disable-tools workflow,recording`
Restart server with `--disable-tools workflow,recording`.
**Pass**: no tool starts with `workflow_` or `oc_recording_`; **all** core tools still present (count matches default minus the two groups exactly).

### Scenario 4 — `expand_tools` respects capability gate
With `--tools-only core` active, call `expand_tools({name: "workflow_init"})`.
**Pass**: response is structured error `{ code: "CAPABILITY_DISABLED", capability: "workflow" }`; subsequent `tools/list` does **not** include `workflow_init`.

### Scenario 5 — `tools/list` byte reduction (explicit methodology)
```bash
# Default
DEFAULT_BYTES=$(mcp | wc -c)
# Core only (restart with --tools-only core)
CORE_BYTES=$(mcp | wc -c)
echo "default=$DEFAULT_BYTES core=$CORE_BYTES reduction=$((100 - 100*CORE_BYTES/DEFAULT_BYTES))%"
```
**Pass**: reduction ≥ 25%. Both numbers documented in the PR description.

### Scenario 6 — synthetic skill replay
Setup script (committed to repo at `scripts/verify/cap-filter-skill.ts`): record a 3-step skill using core-only tools (`navigate https://example.com` → `read_page` → `interact` "More information…"). Then replay it under `--tools-only core`.
**Pass**: skill completes; zero `MISSING_TOOL` / `CAPABILITY_DISABLED` errors; outcome contract verdict matches the recording.

### Issue closure criteria
All 6 scenarios pass + CI green + snapshot + setup script committed.

## Out of scope (deferred)
- `--tool-allowlist` / `--tool-blocklist` per-tool overrides — capability grouping is sufficient for v1; file follow-up if needed.
- Removing/renaming any tool (P1 violation)
- Auto-detecting capability needs (agent-side concern)

## References
- playwright-mcp filter: `microsoft/playwright/packages/playwright-core/src/tools/utils/mcp/server.ts`
- Repo: `src/tools/index.ts`, `src/index.ts:72-98`



## OpenChrome 실검증 체크리스트

> 2026-05-14 재검증 완료. 최신 `origin/develop` 코드, targeted Jest/lint, OpenChrome CLI 실호출, localhost fixture 산출물로 직접 확인 가능한 항목만 close 근거로 사용했다.

### 검증 대상
- **이슈:** #829 — feat(core): capability-gated tool surface (--tools-only / --disable-tools)
- **적용 버전:** origin/develop @ db6c0227 (db6c022780247f19531c2bd14fca6069bbf2f7c0), package 1.11.0
- **fixture:** http://127.0.0.1:18766/reverify.html
- **판정:** VERIFIED — 구현/테스트/실행 표면이 최신 develop에 존재하고 targeted 검증 통과. close 가능.

### 검증 증거
- [x] `npm run build` 통과.
- [x] `npm run lint:tier` 통과: 521 modules / 1239 dependencies, no dependency violations.
- [x] `npm run lint:tool-schemas` 통과: 82 baselined violations, 0 new.
- [x] targeted Jest 통과: 38 passed / 1 skipped suites, 436 passed / 1 skipped tests.
- [x] OpenChrome CLI 실호출: `oc_connection_health` connected, localhost fixture `navigate` 성공.
- [x] OpenChrome tools/list introspection에서 관련 default 또는 pilot-gated tool surface 존재 확인.
- [x] 대표 bounded diagnostic 호출이 구조화된 성공/오류 응답을 반환함을 확인.

### 이슈별 코드/테스트 근거
- [x] 관련 구현/문서/테스트 파일이 최신 트리에 존재하고 targeted 검증에 포함됨:
  - src/config/capability-filter.ts
  - src/index.ts
  - src/mcp-server.ts
  - tests/capability-filter.test.ts

### 산출물
- [x] 증거 로그: `.omx/reverify-evidence/targeted-jest.log`
- [x] 증거 로그: `.omx/reverify-evidence/lint-tier.log`
- [x] 증거 로그: `.omx/reverify-evidence/lint-tool-schemas.log`
- [x] 증거 로그: `.omx/reverify-evidence/openchrome-live-smoke.log`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(core): capability-gated tool surface (--tools-only / --disable-tools) #829

Background

Interaction with `expand_tools` (progressive disclosure)

Proposed Implementation

Acceptance Criteria

Verification (post-merge, executable scripts)

Setup

Scenario 1 — default surface unchanged (regression)

Scenario 2 — `--tools-only core`

Scenario 3 — `--disable-tools workflow,recording`

Scenario 4 — `expand_tools` respects capability gate

Scenario 5 — `tools/list` byte reduction (explicit methodology)

Scenario 6 — synthetic skill replay

Issue closure criteria

Out of scope (deferred)

References

OpenChrome 실검증 체크리스트

검증 대상

검증 증거

이슈별 코드/테스트 근거

산출물

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

feat(core): capability-gated tool surface (--tools-only / --disable-tools) #829

Description

Background

Interaction with expand_tools (progressive disclosure)

Proposed Implementation

Acceptance Criteria

Verification (post-merge, executable scripts)

Setup

Scenario 1 — default surface unchanged (regression)

Scenario 2 — --tools-only core

Scenario 3 — --disable-tools workflow,recording

Scenario 4 — expand_tools respects capability gate

Scenario 5 — tools/list byte reduction (explicit methodology)

Scenario 6 — synthetic skill replay

Issue closure criteria

Out of scope (deferred)

References

OpenChrome 실검증 체크리스트

검증 대상

검증 증거

이슈별 코드/테스트 근거

산출물

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

Interaction with `expand_tools` (progressive disclosure)

Scenario 2 — `--tools-only core`

Scenario 3 — `--disable-tools workflow,recording`

Scenario 4 — `expand_tools` respects capability gate

Scenario 5 — `tools/list` byte reduction (explicit methodology)