feat(core): capability-gated tool surface (#829)#924
Conversation
Adds an additive `capability` field to MCPToolDefinition and two CLI flags (--tools-only, --disable-tools) that gate tool visibility without changing the default tools/list output. expand_tools respects the capability gate. - New ToolCapability union: core, crawl, recording, workflow, storage, profile, totp, pilot (P2: absent → defaults to core) - TOOL_CAPABILITY_MAP authoritative mapping in src/tools/index.ts - CapabilityInjectingServer proxy decorates every registerTool() call so individual tool files do not need to know about capabilities - --tools-only <csv> and --disable-tools <csv> (mutually exclusive) - expand_tools rejects capability-excluded tools with CAPABILITY_DISABLED - New lint:tools-capabilities script enforces every tool has a tag - v1.11.0 baseline snapshot at src/tools/__tests__/__snapshots__/ tools-list.v1.11.snap.json prevents accidental default-surface drift - Default behavior (no flags) is P2-compliant: byte-identical tools/list Closes #829 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ⓘ You've reached your Qodo monthly free-tier limit. Reviews pause until next month — upgrade your plan to continue now, or link your paid account if you already have one. |
There was a problem hiding this comment.
Code Review
This pull request introduces a capability-based filtering system for tools, allowing users to restrict tool exposure via new CLI flags. Key changes include a centralized capability mapping, a lint script to enforce tool categorization, and server-side logic to gate tool access. Review feedback suggests using a standard Proxy for tool registration to avoid runtime errors, centralizing hardcoded capability strings to prevent desynchronization, and refactoring duplicated access-control logic into a shared helper.
| class CapabilityInjectingServer { | ||
| constructor(private readonly server: MCPServer) {} | ||
|
|
||
| registerTool( | ||
| name: string, | ||
| handler: ToolHandler, | ||
| definition: MCPToolDefinition, | ||
| options?: { timeoutRecoverable?: boolean }, | ||
| ): void { | ||
| const capability: ToolCapability = TOOL_CAPABILITY_MAP[name] ?? 'core'; | ||
| this.server.registerTool(name, handler, { ...definition, capability }, options); | ||
| } | ||
|
|
||
| // Proxy all other MCPServer methods used by individual register* functions | ||
| getToolNames(): string[] { | ||
| return this.server.getToolNames(); | ||
| } | ||
| } | ||
|
|
||
| export function registerAllTools(server: MCPServer): void { | ||
| // Wrap the real server so every registerTool() call gets a capability tag | ||
| const proxy = new CapabilityInjectingServer(server); |
There was a problem hiding this comment.
The CapabilityInjectingServer class is an incomplete proxy of MCPServer. By only implementing registerTool and getToolNames and then casting it to MCPServer using as unknown as (lines 250-357), you risk runtime TypeError crashes if any of the register* functions attempt to call other methods on the server (e.g., registerResource, expandToolTier). A more robust approach is to use a Proxy object which automatically delegates all other method calls to the original server instance.
/**
* Proxy around MCPServer that injects the capability field from
* TOOL_CAPABILITY_MAP into every MCPToolDefinition at registerTool() time.
*/
const createCapabilityProxy = (server: MCPServer): MCPServer => {
return new Proxy(server, {
get(target, prop, receiver) {
if (prop === 'registerTool') {
return (name: string, handler: ToolHandler, definition: MCPToolDefinition, options?: any) => {
const capability = TOOL_CAPABILITY_MAP[name] ?? 'core';
return target.registerTool(name, handler, { ...definition, capability }, options);
};
}
const value = Reflect.get(target, prop, receiver);
return typeof value === 'function' ? value.bind(target) : value;
},
});
};
export function registerAllTools(server: MCPServer): void {
// Wrap the real server so every registerTool() call gets a capability tag
const proxy = createCapabilityProxy(server);| const hasLiteral = new Set(); | ||
| while ((m = literalRe.exec(src)) !== null) { | ||
| names.add(m[1]); | ||
| hasLiteral.add(m[1]); | ||
| } |
There was a problem hiding this comment.
The hasLiteral set is populated but never used. It should be removed to keep the script clean and avoid confusion.
| const hasLiteral = new Set(); | |
| while ((m = literalRe.exec(src)) !== null) { | |
| names.add(m[1]); | |
| hasLiteral.add(m[1]); | |
| } | |
| literalRe.lastIndex = 0; | |
| let m; | |
| while ((m = literalRe.exec(src)) !== null) { | |
| names.add(m[1]); | |
| } |
| } | ||
|
|
||
| // Capability filter configuration (#829) | ||
| const allCapabilities: ToolCapability[] = ['core', 'crawl', 'recording', 'workflow', 'storage', 'profile', 'totp', 'pilot']; |
There was a problem hiding this comment.
The list of valid capability groups is hardcoded here and duplicated in the CLI option descriptions (lines 100-101) and validation logic. This makes the code harder to maintain and prone to desynchronization when new capabilities are added. Consider centralizing this list in src/types/mcp.ts as a constant array and deriving the ToolCapability type from it.
| if (this.capabilityFilter) { | ||
| const registry = this.tools.get(toolName); | ||
| if (registry && !this.isCapabilityAllowed(registry.definition.capability)) { | ||
| const capability = registry.definition.capability ?? 'core'; | ||
| return { | ||
| content: [{ | ||
| type: 'text', | ||
| text: JSON.stringify({ code: 'CAPABILITY_DISABLED', capability }), | ||
| }], | ||
| isError: true, | ||
| }; | ||
| } | ||
| } |
There was a problem hiding this comment.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c7d2325abc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| export function setMCPServerOptions(options: MCPServerOptions): void { | ||
| mcpServerOptions = options; | ||
| export function setMCPServerOptions(options: Partial<MCPServerOptions>): void { | ||
| mcpServerOptions = { ...mcpServerOptions, ...options }; |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
Addresses bot review findings on PR #924: * src/mcp-server.ts — codex P1 "Restore option reset behavior in setMCPServerOptions". The PR had switched to a merge semantic (`{...mcpServerOptions, ...options}`), but callers rely on the pre-#829 replace semantic so `setMCPServerOptions({})` fully clears previously set flags. Revert to plain replace; document why. * src/tools/index.ts — gemini high "CapabilityInjectingServer incomplete proxy". The hand-rolled class only implemented `registerTool` + `getToolNames` and was cast to `MCPServer` via `as unknown as`, which would TypeError at runtime if any registerXxxTool helper ever reached for another method. Replace with a real ES `Proxy` (`makeCapabilityInjectingProxy`) that forwards every property automatically and binds methods to the underlying server. Drop the `as unknown as MCPServer` cast at all 50+ call sites — the Proxy returns a typed `MCPServer`. * scripts/lint-tools-capabilities.js — gemini medium: remove the unused `hasLiteral` Set that was populated but never consulted. * tests/cli/admin-keys.test.ts — robust JSON extraction for `admin keys list --json`. Jest's default reporter can interleave buffered console output into stdout while `parseAsync` runs (observed flaky on macos-latest, node 20 CI). Slice from the first `[` to the last `]` before `JSON.parse` so the test no longer trips over reporter noise. Out of scope / deferred (gemini medium, tracked for follow-up): - Centralizing the capability string list as an exported `const` array in `src/types/mcp.ts`. - Refactoring the duplicated capability-gate check in `expand_tools` vs the regular tools/call path into a shared helper. Refs #924 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: df48e08180
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| console.error(`[openchrome] Error: unknown capability group(s): ${invalid.join(', ')}. Valid: ${allCapabilities.join(', ')}`); | ||
| process.exit(2); | ||
| } | ||
| setMCPServerOptions({ capabilityFilter: new Set(requested) }); |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
Merge current develop into the capability-gate PR and preserve the additive tool-surface contract while adding capability tags for newly merged tools. Constraint: PR #924 must keep setMCPServerOptions replace semantics while allowing --all-tools and capability filters to compose. Rejected: restoring merge semantics in setMCPServerOptions | it regresses reset behavior called out by Codex P1. Confidence: high Scope-risk: moderate Directive: Register future tools through the capability proxy and update TOOL_CAPABILITY_MAP plus the default-surface snapshot together. Tested: npm run lint:tools-capabilities; npx jest tests/capability-filter.test.ts --runInBand; npx jest tests/mcp-server.test.ts tests/mcp/tools-list-output-schema.test.ts --runInBand; npm run build; npm run lint:tier; git diff --check Not-tested: full npm test locally.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9397695bae
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| this.isCapabilityAllowed(r.definition.capability) | ||
| ).length; | ||
| if (hiddenCount > 0) { | ||
| tools.push({ |
There was a problem hiding this comment.
Gate
expand_tools by capability filter
When a caller excludes core (for example --tools-only=workflow or --disable-tools=core), tools/list can still expose expand_tools because it is injected solely based on hiddenCount and never checked against the capability filter. In the same flow, tools/call handles expand_tools before the general capability gate, so this core-tagged meta tool remains callable even when core tools are meant to be hidden. That violates the CLI contract that only selected capability groups are exposed.
Useful? React with 👍 / 👎.
Constraint: develop added default-surface tools and isolated-context tab metadata while #924 was open.\nRejected: change runtime registration to satisfy stale tests | would hide valid develop tools or bypass the capability proxy.\nConfidence: high\nScope-risk: narrow\nDirective: keep capability filtering behavior unchanged; adjust tests/snapshots when develop adds default-visible tools.\nTested: npx jest tests/capability-filter.test.ts tests/tools/tabs.test.ts tests/tools/oc-performance-insights-registration.test.ts tests/cross-env/cursor-verification.test.ts --runInBand\nTested: npm run build\nTested: npm run lint:tier\nTested: git diff --check
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0611368659
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const toolsDir = path.join(ROOT, 'src', 'tools'); | ||
| const files = fs.readdirSync(toolsDir).filter(f => f.endsWith('.ts') && f !== 'index.ts'); |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
Merge the current develop surface into the capability-gate PR and resolve stale test/mock conflicts without changing the capability filtering design. This preserves the single MCP options application path and keeps the branch eligible for CI and review. Constraint: GitHub reported PR #924 as CONFLICTING against develop. Rejected: Reworking capability categories during conflict resolution | latest Codex review on the branch had no actionable P1/P2 feedback. Confidence: high Scope-risk: narrow Directive: Keep future changes limited to capability membership or CLI parsing unless review finds a functional gate bug. Tested: npm run build; npx jest tests/capability-filter.test.ts tests/tools/tabs.test.ts tests/tools/oc-performance-insights-registration.test.ts tests/cross-env/cursor-verification.test.ts --runInBand; npm run lint:tier; git diff --check Not-tested: full CI matrix pending on GitHub. Co-authored-by: OmX <omx@oh-my-codex.dev>
|
Codex Review: Didn't find any major issues. Another round soon, please! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Adjust the Cursor smoke expectation to the current develop Tier 1 surface after the capability-gate branch merge. CI showed 45 real Tier 1 tools plus expand_tools; capability filtering behavior itself is unchanged. Constraint: GitHub CI for PR #924 failed only on a stale Tier 1 count assertion. Rejected: Changing capability membership to satisfy the old count | the branch should reflect develop's actual registered surface. Confidence: high Scope-risk: narrow Directive: Keep count-only updates tied to intentional Tier 1 membership changes. Tested: npx jest tests/cross-env/cursor-verification.test.ts --runInBand; npm run build; npm run lint:tier; git diff --check Not-tested: full CI matrix pending on GitHub. Co-authored-by: OmX <omx@oh-my-codex.dev>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 51f1ea674c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Handle the expand_tools meta-tool before normal tool lookup | ||
| if (toolName === 'expand_tools') { | ||
| // If a specific tool name is requested (capability gate check), verify it is allowed |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
Normalize CRLF line endings when comparing the committed console-capture baseline fixture so Windows CI does not fail on checkout conversion. The capability-gate implementation remains unchanged. Constraint: PR #924 was expected to hit the same Windows fixture failure after merging develop. Rejected: Regenerating the fixture | the mismatch is platform line endings, not response shape drift. Confidence: high Scope-risk: narrow Directive: Normalize text fixtures before comparing them to generated JSON strings across OSes. Tested: npx jest tests/tools/console-capture-regression.test.ts --runInBand; npm run build; npm run lint:tier; git diff --check Not-tested: full CI matrix pending on GitHub. Co-authored-by: OmX <omx@oh-my-codex.dev>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b8890f828c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (this.exposedTier < 3 && this.clientSupportsListChanged) { | ||
| const hiddenCount = Array.from(this.tools.values()).filter( | ||
| r => getToolTier(r.definition.name) > this.exposedTier | ||
| r => getToolTier(r.definition.name) > this.exposedTier && | ||
| this.isCapabilityAllowed(r.definition.capability) | ||
| ).length; |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
| // | ||
| // For pattern 2 we also scan the file for MCPToolDefinition `name:` fields. | ||
| // --------------------------------------------------------------------------- | ||
|
|
There was a problem hiding this comment.
Include pilot tool sources in capability lint scan
The linter only scans src/tools/*.ts for registerTool(...) calls, so tools registered from outside that directory (notably oc_proxy_hook from src/pilot/proxy/hook) are invisible to the check. If such a tool is omitted from TOOL_CAPABILITY_MAP, CI will still pass and the tool will silently default to core, breaking intended capability-group filtering for pilot deployments.
Useful? React with 👍 / 👎.
Merge latest develop into PR #924 while preserving the branch-specific cursor and console fixture expectations. Constraint: PR #962 changed shared transport files and CI fixtures after this branch was opened.\nRejected: Replacing branch-specific fixture semantics with develop's exact snapshot | these PRs intentionally alter tool visibility or fixture normalization expectations.\nConfidence: high\nScope-risk: narrow\nDirective: Treat fixture-only conflict refreshes as semantic preservation work; do not rewrite feature behavior while unblocking mergeability.\nTested: npx jest tests/tools/console-capture-regression.test.ts tests/cross-env/cursor-verification.test.ts --runInBand --forceExit (cross-env suite may skip locally by platform guard); git diff --check for touched fixture files.\nNot-tested: Full GitHub Actions matrix after this refresh.\nCo-authored-by: OmX <omx@oh-my-codex.dev>
Constraint: PR #924 must preserve current develop CLI/test additions while keeping disabled capability groups fully hidden and uncallable.\nRejected: only filtering concrete registered tools | virtual expand_tools is core-tagged and must be gated the same way.\nConfidence: high\nScope-risk: moderate\nDirective: Keep capability lint scanning every production source that registers tools, not just src/tools.\nTested: npm test -- --runTestsByPath tests/capability-filter.test.ts --runInBand; earlier targeted admin-keys and expand-tools-schema run passed except the now-fixed capability lint gap.\nNot-tested: full cross-platform CI matrix pending after push.
Capability filtering now tags the newly merged resumable crawl and skill replay surfaces, routes their registrations through the capability-injecting proxy, and refreshes the default-surface snapshot to match current develop. Constraint: PR #924 must preserve default tools/list behavior while enforcing capability gates for every registered tool. Rejected: bypassing the proxy for newly merged tools | direct server registration would omit capability metadata and reopen filter gaps. Confidence: high Scope-risk: moderate Directive: Any new registerTool call must be represented in TOOL_CAPABILITY_MAP and go through the capability-injecting proxy unless it has an explicit tested exception. Tested: npx tsc --noEmit; npm test -- --runTestsByPath tests/capability-filter.test.ts tests/mcp/expand-tools-schema.test.ts --runInBand; npm run lint:tools-capabilities Not-tested: full cross-platform CI matrix pending after push. Co-authored-by: OmX <omx@oh-my-codex.dev>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a98b26e2c9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| mcpOptions.capabilityFilter = new Set(requested); | ||
| console.error(`[openchrome] Capability filter (tools-only): ${requested.join(', ')}`); |
There was a problem hiding this comment.
Raise tier when capability filter excludes core tools
When --tools-only/--disable-tools sets a non-core capability filter, this code only sets capabilityFilter and leaves initialToolTier unset, so known progressive-disclosure clients (e.g. Claude/Cursor) stay at tier 1; combined with handleToolsList only injecting expand_tools when core is allowed, combinations like --tools-only=workflow or --disable-tools=core can produce an empty tools/list even though allowed tools exist at higher tiers (workflow/recording/crawl/totp), making those valid flag combinations effectively unusable unless users also pass --all-tools or know hidden tool names.
Useful? React with 👍 / 👎.
The CI schema linter pipes introspection stdout into the validator, so the CLI must return naturally after writing JSON and the newly merged crawl job schemas must satisfy the required-field description rule. Constraint: build-and-test runs npm run lint:tool-schemas in a shell pipeline on every PR matrix job. Rejected: keeping process.exit(0) after stdout.write | it can terminate before piped stdout flushes and leaves the validator with empty input. Confidence: high Scope-risk: narrow Directive: Introspection paths must avoid hard process exits after asynchronous stdout writes. Tested: npm run build; npm run lint:tool-schemas; npm run lint Not-tested: full GitHub Actions matrix pending after push. Co-authored-by: OmX <omx@oh-my-codex.dev>
The in-process CLI harness can capture log lines from unrelated async work before the admin JSON payload, so the JSON-array extractor now scans for the first slice that actually parses as an array instead of trusting the first '[' byte. Constraint: full CI runs many suites in shared workers, and Windows/Linux logs showed [WorkflowEngine] noise reaching this harness stdout. Rejected: weakening admin CLI assertions | the test still parses and validates the JSON payload and plaintext absence. Confidence: high Scope-risk: narrow Directive: CLI JSON tests should parse the intended payload, not incidental bracketed log prefixes. Tested: npm test -- --runTestsByPath tests/cli/admin-keys.test.ts --runInBand Not-tested: full GitHub Actions matrix pending after push. Co-authored-by: OmX <omx@oh-my-codex.dev>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a576942bd9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (!this.isCapabilityAllowed('core')) { | ||
| return { |
There was a problem hiding this comment.
Allow tier expansion when non-core capabilities are selected
Rejecting expand_tools whenever core is filtered out makes capability groups like workflow unusable under the default tiering flow: workflow_* tools are Tier 3, but --tools-only=workflow can no longer surface expand_tools, so clients cannot ever expand from Tier 1 to reach those allowed tools. In practice this yields an empty/near-empty tool surface for valid capability filters unless operators also know to set --all-tools/OPENCHROME_TOOL_TIER, which contradicts the CLI contract that --tools-only exposes the selected groups.
Useful? React with 👍 / 👎.
The capability-gate branch now includes the latest develop snapshot-ref baseline while preserving capability metadata injection for newly merged tools and the expanded capability linter coverage. Constraint: develop advanced during PR #924 review and branch must remain mergeable while preserving default tools/list behavior. Rejected: bypassing capability proxy for new tools | it would omit capability metadata and reopen filter gaps. Confidence: high Scope-risk: moderate Directive: Any new registerTool call must be represented in TOOL_CAPABILITY_MAP and normally route through the capability-injecting proxy. Tested: npm run build; npm run lint:tool-schemas; npm test -- --runTestsByPath tests/capability-filter.test.ts tests/mcp/expand-tools-schema.test.ts tests/transports/http-bearer-auth.test.ts --runInBand; npm run lint:tools-capabilities Not-tested: full GitHub Actions matrix pending after push. Co-authored-by: OmX <omx@oh-my-codex.dev>
The capability-gate branch now includes bounded tool-output observability metrics while preserving capability metadata injection, proxy routing, and expanded capability linter coverage. Constraint: develop advanced during PR #924 review and branch must remain mergeable while preserving default tools/list behavior. Rejected: leaving the autogenerated merge message | project history requires Lore trailers and OmX co-author attribution. Confidence: high Scope-risk: moderate Directive: Any new registerTool call must keep capability metadata and metrics instrumentation compatible; do not bypass the capability-injecting proxy. Tested: npm run build; npm run lint:tool-schemas; npm test -- --runTestsByPath tests/metrics/tool-output-observability.test.ts --runInBand Not-tested: full GitHub Actions matrix pending after push. Co-authored-by: OmX <omx@oh-my-codex.dev>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3755d703da
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Capability gate check: reject calls to tools excluded by --tools-only / --disable-tools | ||
| if (this.capabilityFilter) { | ||
| const registry = this.tools.get(toolName); | ||
| if (registry && !this.isCapabilityAllowed(registry.definition.capability)) { | ||
| const capability = registry.definition.capability ?? 'core'; |
There was a problem hiding this comment.
Apply capability filter when resolving internal tool handlers
The new capability gate is only enforced in handleToolsCall, but internal execution paths can still bypass it by resolving handlers directly. In this repo, execute_plan builds a PlanExecutor from mcpServer.getToolHandler(...) (see src/tools/orchestration.ts), so a server started with --disable-tools=crawl can still run crawl tools if a cached plan references them. That violates the new “max tool surface” guarantee and allows excluded capabilities to execute in workflow contexts; the resolver path should apply the same capability check and return null/CAPABILITY_DISABLED for blocked tools.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 42c67a52ff
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| * Absent entry → defaults to 'core' (P1 backward-compat). | ||
| * lint:tools-capabilities enforces that every registered tool appears here. | ||
| */ | ||
| export const TOOL_CAPABILITY_MAP: Record<string, ToolCapability> = { |
There was a problem hiding this comment.
Add missing registered tools to capability map
TOOL_CAPABILITY_MAP is introduced as authoritative, but it omits currently registered tools (e.g. oc_progress_status, oc_reflect, oc_task_*, and oc_run_*), so the new lint:tools-capabilities check fails and the new capability-filter test that expects that lint to pass cannot succeed. This is reproducible on commit c1511083 by running node scripts/lint-tools-capabilities.js, which exits non-zero with missing tool names; it also means these tools silently fall back to 'core' instead of explicit grouping.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d53b70cb61
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| * Absent entry → defaults to 'core' (P1 backward-compat). | ||
| * lint:tools-capabilities enforces that every registered tool appears here. | ||
| */ | ||
| export const TOOL_CAPABILITY_MAP: Record<string, ToolCapability> = { |
There was a problem hiding this comment.
Add all registered tools to TOOL_CAPABILITY_MAP
The capability map is now treated as authoritative, but this object still omits registered tools (oc_progress_status, oc_reflect, oc_task_*, and oc_run_*), so those tools never get explicit capability tags. Fresh evidence: running node scripts/lint-tools-capabilities.js on this commit exits non-zero and reports 11 missing tools, which means the new lint:tools-capabilities check (and the related capability-filter test expectations) cannot pass reliably and these tools silently fall back to core.
Useful? React with 👍 / 👎.
| "dependencies": { | ||
| "argon2": "^0.43.0", | ||
| "cheerio": "1.0.0-rc.12", | ||
| "commander": "^12.0.0", | ||
| "jose": "^5.10.0", | ||
| "proper-lockfile": "^4.1.2", | ||
| "puppeteer-core": "npm:rebrowser-puppeteer-core@23.10.3", |
There was a problem hiding this comment.
Restore cheerio and turndown as direct dependencies
This change drops cheerio and turndown from dependencies, but production code still imports both in src/core/extract/html-to-markdown.ts (lines 1-2). On clean installs where those modules are not present transitively, loading extraction paths will fail with MODULE_NOT_FOUND, causing runtime breakage in markdown/content extraction flows.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 321d53042e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| function collectRegisteredSourceFiles(dir, out = []) { | ||
| for (const entry of fs.readdirSync(dir, { withFileTypes: true })) { | ||
| if (entry.name === 'index.ts') continue; |
There was a problem hiding this comment.
Scan index.ts files when linting capability coverage
Do not skip every index.ts here: this creates false negatives in the new capability lint. The current walk ignores files like src/pilot/dynamic-skills/index.ts, which already calls state.server.registerTool(...), so missing capability tagging in those registrations can pass CI undetected and silently fall back to 'core' behavior instead of the intended group filtering.
Useful? React with 👍 / 👎.
| if (!this.capabilityFilter) return true; | ||
| return this.capabilityFilter.has(capability ?? 'core'); |
There was a problem hiding this comment.
Avoid defaulting untagged tools to core capability
Treating missing capability metadata as 'core' causes pilot dynamic-skill tools to bypass --disable-tools=pilot: synthesized tools are registered at runtime without a capability field (src/pilot/dynamic-skills/index.ts), so this fallback marks them core and keeps them exposed/callable whenever core is allowed. That breaks the new CLI contract that disabled capability groups are removed from the tool surface.
Useful? React with 👍 / 👎.
The lint:tools-capabilities script regressed once develop landed 12 new tools (#1062 normalize, #1060 progress, #1019 reflect, the #855 task ledger and the run-harness ledger) — each missing from the capability map. The capability-filter test then trips because filteredNames sees develop-era tools that the filter doesn't classify. Add 'core' entries for all 12 new tools so the filter routes them through the default capability and lint stays green.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ccbfd773cf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Capability gate check: reject calls to tools excluded by --tools-only / --disable-tools | ||
| if (this.capabilityFilter) { | ||
| const registry = this.tools.get(toolName); | ||
| if (registry && !this.isCapabilityAllowed(registry.definition.capability)) { |
There was a problem hiding this comment.
Enforce capability gate on all tool execution paths
This capability check only runs inside handleToolsCall, so it does not protect internal callers that execute tools via direct handler lookup. In this codebase, execute_plan builds a PlanExecutor with mcpServer.getToolHandler(...) (src/tools/orchestration.ts), which means a server started with --disable-tools=<group> can still execute excluded tools when a plan references them. That breaks the new “maximum tool surface” guarantee and makes capability filtering inconsistent between direct MCP calls and workflow execution.
Useful? React with 👍 / 👎.
| // Only inject expand_tools if the client supports notifications/tools/list_changed — | ||
| // otherwise there's no point since the client can't react to the notification. | ||
| if (this.exposedTier < 3 && this.clientSupportsListChanged) { | ||
| if (this.exposedTier < 3 && this.clientSupportsListChanged && this.isCapabilityAllowed('core')) { |
There was a problem hiding this comment.
Keep tier expansion reachable for non-core capability filters
When core is filtered out (for example --tools-only=workflow), this condition suppresses expand_tools; combined with the default exposedTier = 1, higher-tier allowed tools (like workflow_* and execute_plan, tier 3) can become undiscoverable because tools/list returns no eligible tools and there is no expansion path. In practice, valid capability-filter combinations can yield an empty tool surface unless operators also know to set --all-tools/OPENCHROME_TOOL_TIER.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f42ab9fdda
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| .option('--tools-only <csv>', 'Expose only tools belonging to the specified capability groups (comma-separated). Valid values: core,crawl,recording,workflow,storage,profile,totp,pilot. Default: all groups exposed.') | ||
| .option('--disable-tools <csv>', 'Remove tools belonging to the specified capability groups (comma-separated). Valid values: core,crawl,recording,workflow,storage,profile,totp,pilot.') |
There was a problem hiding this comment.
Honor capability flags in introspection mode
The new --tools-only/--disable-tools options are advertised globally, but when --introspect-tools-list is used the handler returns before capability parsing and before setMCPServerOptions(mcpOptions) runs, so introspection always emits the full unfiltered manifest. This makes automation that relies on --introspect-tools-list to verify filtered tool surfaces report incorrect results (e.g. --introspect-tools-list --tools-only=core still lists workflow/crawl tools).
Useful? React with 👍 / 👎.
# Conflicts: # tests/tools/console-capture-regression.test.ts
…uto-merge-pr-924 # Conflicts: # tests/tools/console-capture-regression.test.ts
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 577b5dafc5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| * Absent entry → defaults to 'core' (P1 backward-compat). | ||
| * lint:tools-capabilities enforces that every registered tool appears here. | ||
| */ | ||
| export const TOOL_CAPABILITY_MAP: Record<string, ToolCapability> = { |
There was a problem hiding this comment.
Add oc_task_run tools to TOOL_CAPABILITY_MAP
TOOL_CAPABILITY_MAP is still incomplete: the seven oc_task_run_* tools registered by registerTaskRunTools are not listed, so the new capability lint path fails and those tools silently fall back to implicit 'core' classification instead of explicit grouping. Fresh evidence on this commit is that src/tools/task-run.ts registers oc_task_run_start/update/checkpoint/needs_help/complete/get/list (lines 250-256 there), while this map has no corresponding entries, which makes scripts/lint-tools-capabilities.js report missing tools and break the “all registered tools are tagged” invariant.
Useful? React with 👍 / 👎.
Progress / Review status
Auto-refreshed 2026-05-13 — owner comments cleaned up to reduce review noise.
feat/829-capability-gate→develop3755d70— Carry capability gates onto output metrics baselineOwner comment cleanup: 7 issue + 0 inline review comments deleted. Outstanding feedback from automated/external reviewers above is unchanged.
Summary
capabilityfield onMCPToolDefinition(core|crawl|recording|workflow|storage|profile|totp|pilot)--tools-only <csv>and--disable-tools <csv>CLI flags (mutually exclusive) gate tool visibility attools/listexpand_toolsrejects capability-excluded tools with structuredCAPABILITY_DISABLEDerrorlint:tools-capabilitiesscript ensures every tool has a tag (CI-enforceable)TOOL_CAPABILITY_MAP+CapabilityInjectingServerproxy insrc/tools/index.ts— individual tool files do not need capability knowledgesrc/tools/__tests__/__snapshots__/tools-list.v1.11.snap.jsonprevents accidental default-surface driftCloses #829
Tier
core— additive metadata, P2-compliant: with no flags,tools/listis byte-identical to v1.11.0. P1 preserved (no tool renamed/removed).Capability groups (final)
Interaction with
expand_toolsexpand_toolsoperates within that set; calling it on an excluded tool returns{ code: "CAPABILITY_DISABLED", capability: <name> }.Test plan
lint:tools-capabilitiesgreen (every registered tool has a capability tag)tools/listdefault surface matches v1.11.0 baseline snapshot--tools-only=coreremoves workflow/recording/crawl tools--disable-tools=workflow,recordingremoves exactly those groupsexpand_toolsrejects capability-excluded tools with CAPABILITY_DISABLEDnpm run buildgreennpm run lint:tiergreen (no dependency-cruiser violations)tests/capability-filter.test.ts: 12/12 passmcp-server.test.ts,tests/mcp/): 39/39 passVerification scenarios 1–6 from #829 runnable post-merge against the same commands.
Out of scope (deferred)
--tool-allowlist/--tool-blocklist) — capability grouping is sufficient for v1; file follow-up if needed.🤖 Generated with Claude Code