Skip to content

feat(core): capability-gated tool surface (#829)#924

Merged
shaun0927 merged 28 commits into
developfrom
feat/829-capability-gate
May 13, 2026
Merged

feat(core): capability-gated tool surface (#829)#924
shaun0927 merged 28 commits into
developfrom
feat/829-capability-gate

Conversation

@shaun0927
Copy link
Copy Markdown
Owner

@shaun0927 shaun0927 commented May 12, 2026

Progress / Review status

Auto-refreshed 2026-05-13 — owner comments cleaned up to reduce review noise.

Field Value
Branch feat/829-capability-gatedevelop
Draft no
CI ✅ all 9 checks passing
Mergeable ❌ CONFLICTING
Review decision
Codex (latest) 💡 suggestions posted
Other reviewers (latest) gemini-code-assist: commented, chatgpt-codex-connector: commented
Head 3755d70 — Carry capability gates onto output metrics baseline
Commits 14

Owner comment cleanup: 7 issue + 0 inline review comments deleted. Outstanding feedback from automated/external reviewers above is unchanged.


Summary

  • Additive capability field on MCPToolDefinition (core | crawl | recording | workflow | storage | profile | totp | pilot)
  • --tools-only <csv> and --disable-tools <csv> CLI flags (mutually exclusive) gate tool visibility at tools/list
  • expand_tools rejects capability-excluded tools with structured CAPABILITY_DISABLED error
  • New lint:tools-capabilities script ensures every tool has a tag (CI-enforceable)
  • Authoritative TOOL_CAPABILITY_MAP + CapabilityInjectingServer proxy in src/tools/index.ts — individual tool files do not need capability knowledge
  • v1.11.0 baseline snapshot at src/tools/__tests__/__snapshots__/tools-list.v1.11.snap.json prevents accidental default-surface drift

Closes #829

Tier

core — additive metadata, P2-compliant: with no flags, tools/list is byte-identical to v1.11.0. P1 preserved (no tool renamed/removed).

Capability groups (final)

  • core (~52 tools): navigate, read_page, query_dom, find, inspect, interact, computer, act, fill_form, form_input, javascript_tool, page_, screenshot, journal, oc_journal, tabs_, oc_connection_health, oc_session_resume/snapshot, oc_assert, oc_evidence_bundle, oc_checkpoint, wait_for, console_capture, validate_page, request_intercept, network*, emulate_device, file_upload, drag_drop, http_auth, user_agent, geolocation, lightweight_scroll, memory, extract_data, performance_metrics, page_reload, expand_tools
  • storage: cookies, storage
  • profile: list_profiles, oc_profile_status
  • crawl: crawl, crawl_sitemap, batch_execute, batch_paginate, worker_*
  • recording: oc_recording_*
  • workflow: workflow_*, execute_plan
  • totp: oc_totp_generate

Interaction with expand_tools

  • Capability filter defines the maximum set the agent can ever see.
  • expand_tools operates within that set; calling it on an excluded tool returns { code: "CAPABILITY_DISABLED", capability: <name> }.

Test plan

  • lint:tools-capabilities green (every registered tool has a capability tag)
  • tools/list default surface matches v1.11.0 baseline snapshot
  • --tools-only=core removes workflow/recording/crawl tools
  • --disable-tools=workflow,recording removes exactly those groups
  • expand_tools rejects capability-excluded tools with CAPABILITY_DISABLED
  • workflow_init not in tools/list after expand_tools rejection
  • npm run build green
  • npm run lint:tier green (no dependency-cruiser violations)
  • tests/capability-filter.test.ts: 12/12 pass
  • Regression (mcp-server.test.ts, tests/mcp/): 39/39 pass

Verification scenarios 1–6 from #829 runnable post-merge against the same commands.

Out of scope (deferred)

  • Per-tool allowlist/blocklist flags (--tool-allowlist/--tool-blocklist) — capability grouping is sufficient for v1; file follow-up if needed.

🤖 Generated with Claude Code

Adds an additive `capability` field to MCPToolDefinition and two CLI flags
(--tools-only, --disable-tools) that gate tool visibility without changing
the default tools/list output. expand_tools respects the capability gate.

- New ToolCapability union: core, crawl, recording, workflow, storage,
  profile, totp, pilot (P2: absent → defaults to core)
- TOOL_CAPABILITY_MAP authoritative mapping in src/tools/index.ts
- CapabilityInjectingServer proxy decorates every registerTool() call so
  individual tool files do not need to know about capabilities
- --tools-only <csv> and --disable-tools <csv> (mutually exclusive)
- expand_tools rejects capability-excluded tools with CAPABILITY_DISABLED
- New lint:tools-capabilities script enforces every tool has a tag
- v1.11.0 baseline snapshot at src/tools/__tests__/__snapshots__/
  tools-list.v1.11.snap.json prevents accidental default-surface drift
- Default behavior (no flags) is P2-compliant: byte-identical tools/list

Closes #829

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@qodo-code-review
Copy link
Copy Markdown

ⓘ You've reached your Qodo monthly free-tier limit. Reviews pause until next month — upgrade your plan to continue now, or link your paid account if you already have one.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a capability-based filtering system for tools, allowing users to restrict tool exposure via new CLI flags. Key changes include a centralized capability mapping, a lint script to enforce tool categorization, and server-side logic to gate tool access. Review feedback suggests using a standard Proxy for tool registration to avoid runtime errors, centralizing hardcoded capability strings to prevent desynchronization, and refactoring duplicated access-control logic into a shared helper.

Comment thread src/tools/index.ts Outdated
Comment on lines +226 to +247
class CapabilityInjectingServer {
constructor(private readonly server: MCPServer) {}

registerTool(
name: string,
handler: ToolHandler,
definition: MCPToolDefinition,
options?: { timeoutRecoverable?: boolean },
): void {
const capability: ToolCapability = TOOL_CAPABILITY_MAP[name] ?? 'core';
this.server.registerTool(name, handler, { ...definition, capability }, options);
}

// Proxy all other MCPServer methods used by individual register* functions
getToolNames(): string[] {
return this.server.getToolNames();
}
}

export function registerAllTools(server: MCPServer): void {
// Wrap the real server so every registerTool() call gets a capability tag
const proxy = new CapabilityInjectingServer(server);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The CapabilityInjectingServer class is an incomplete proxy of MCPServer. By only implementing registerTool and getToolNames and then casting it to MCPServer using as unknown as (lines 250-357), you risk runtime TypeError crashes if any of the register* functions attempt to call other methods on the server (e.g., registerResource, expandToolTier). A more robust approach is to use a Proxy object which automatically delegates all other method calls to the original server instance.

/**
 * Proxy around MCPServer that injects the capability field from
 * TOOL_CAPABILITY_MAP into every MCPToolDefinition at registerTool() time.
 */
const createCapabilityProxy = (server: MCPServer): MCPServer => {
  return new Proxy(server, {
    get(target, prop, receiver) {
      if (prop === 'registerTool') {
        return (name: string, handler: ToolHandler, definition: MCPToolDefinition, options?: any) => {
          const capability = TOOL_CAPABILITY_MAP[name] ?? 'core';
          return target.registerTool(name, handler, { ...definition, capability }, options);
        };
      }
      const value = Reflect.get(target, prop, receiver);
      return typeof value === 'function' ? value.bind(target) : value;
    },
  });
};

export function registerAllTools(server: MCPServer): void {
  // Wrap the real server so every registerTool() call gets a capability tag
  const proxy = createCapabilityProxy(server);

Comment thread scripts/lint-tools-capabilities.js Outdated
Comment on lines +77 to +81
const hasLiteral = new Set();
while ((m = literalRe.exec(src)) !== null) {
names.add(m[1]);
hasLiteral.add(m[1]);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The hasLiteral set is populated but never used. It should be removed to keep the script clean and avoid confusion.

Suggested change
const hasLiteral = new Set();
while ((m = literalRe.exec(src)) !== null) {
names.add(m[1]);
hasLiteral.add(m[1]);
}
literalRe.lastIndex = 0;
let m;
while ((m = literalRe.exec(src)) !== null) {
names.add(m[1]);
}

Comment thread src/index.ts Outdated
}

// Capability filter configuration (#829)
const allCapabilities: ToolCapability[] = ['core', 'crawl', 'recording', 'workflow', 'storage', 'profile', 'totp', 'pilot'];
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The list of valid capability groups is hardcoded here and duplicated in the CLI option descriptions (lines 100-101) and validation logic. This makes the code harder to maintain and prone to desynchronization when new capabilities are added. Consider centralizing this list in src/types/mcp.ts as a constant array and deriving the ToolCapability type from it.

Comment thread src/mcp-server.ts
Comment on lines +914 to +926
if (this.capabilityFilter) {
const registry = this.tools.get(toolName);
if (registry && !this.isCapabilityAllowed(registry.definition.capability)) {
const capability = registry.definition.capability ?? 'core';
return {
content: [{
type: 'text',
text: JSON.stringify({ code: 'CAPABILITY_DISABLED', capability }),
}],
isError: true,
};
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This capability gate check logic is duplicated from the expand_tools handler (lines 870-881). Duplicated logic for security/access control is risky as updates might not be applied consistently. Consider refactoring this into a private helper method to ensure consistency and reduce code duplication.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c7d2325abc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/mcp-server.ts Outdated
export function setMCPServerOptions(options: MCPServerOptions): void {
mcpServerOptions = options;
export function setMCPServerOptions(options: Partial<MCPServerOptions>): void {
mcpServerOptions = { ...mcpServerOptions, ...options };

This comment was marked as outdated.

Addresses bot review findings on PR #924:

* src/mcp-server.ts — codex P1 "Restore option reset behavior in
  setMCPServerOptions". The PR had switched to a merge semantic
  (`{...mcpServerOptions, ...options}`), but callers rely on the
  pre-#829 replace semantic so `setMCPServerOptions({})` fully clears
  previously set flags. Revert to plain replace; document why.

* src/tools/index.ts — gemini high "CapabilityInjectingServer
  incomplete proxy". The hand-rolled class only implemented
  `registerTool` + `getToolNames` and was cast to `MCPServer` via
  `as unknown as`, which would TypeError at runtime if any
  registerXxxTool helper ever reached for another method. Replace with
  a real ES `Proxy` (`makeCapabilityInjectingProxy`) that forwards
  every property automatically and binds methods to the underlying
  server. Drop the `as unknown as MCPServer` cast at all 50+ call
  sites — the Proxy returns a typed `MCPServer`.

* scripts/lint-tools-capabilities.js — gemini medium: remove the unused
  `hasLiteral` Set that was populated but never consulted.

* tests/cli/admin-keys.test.ts — robust JSON extraction for `admin keys
  list --json`. Jest's default reporter can interleave buffered console
  output into stdout while `parseAsync` runs (observed flaky on
  macos-latest, node 20 CI). Slice from the first `[` to the last `]`
  before `JSON.parse` so the test no longer trips over reporter noise.

Out of scope / deferred (gemini medium, tracked for follow-up):
- Centralizing the capability string list as an exported `const` array
  in `src/types/mcp.ts`.
- Refactoring the duplicated capability-gate check in `expand_tools`
  vs the regular tools/call path into a shared helper.

Refs #924

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: df48e08180

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/index.ts Outdated
console.error(`[openchrome] Error: unknown capability group(s): ${invalid.join(', ')}. Valid: ${allCapabilities.join(', ')}`);
process.exit(2);
}
setMCPServerOptions({ capabilityFilter: new Set(requested) });

This comment was marked as outdated.

Merge current develop into the capability-gate PR and preserve the additive tool-surface contract while adding capability tags for newly merged tools.

Constraint: PR #924 must keep setMCPServerOptions replace semantics while allowing --all-tools and capability filters to compose.

Rejected: restoring merge semantics in setMCPServerOptions | it regresses reset behavior called out by Codex P1.

Confidence: high

Scope-risk: moderate

Directive: Register future tools through the capability proxy and update TOOL_CAPABILITY_MAP plus the default-surface snapshot together.

Tested: npm run lint:tools-capabilities; npx jest tests/capability-filter.test.ts --runInBand; npx jest tests/mcp-server.test.ts tests/mcp/tools-list-output-schema.test.ts --runInBand; npm run build; npm run lint:tier; git diff --check

Not-tested: full npm test locally.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9397695bae

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/mcp-server.ts
this.isCapabilityAllowed(r.definition.capability)
).length;
if (hiddenCount > 0) {
tools.push({
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Gate expand_tools by capability filter

When a caller excludes core (for example --tools-only=workflow or --disable-tools=core), tools/list can still expose expand_tools because it is injected solely based on hiddenCount and never checked against the capability filter. In the same flow, tools/call handles expand_tools before the general capability gate, so this core-tagged meta tool remains callable even when core tools are meant to be hidden. That violates the CLI contract that only selected capability groups are exposed.

Useful? React with 👍 / 👎.

Constraint: develop added default-surface tools and isolated-context tab metadata while #924 was open.\nRejected: change runtime registration to satisfy stale tests | would hide valid develop tools or bypass the capability proxy.\nConfidence: high\nScope-risk: narrow\nDirective: keep capability filtering behavior unchanged; adjust tests/snapshots when develop adds default-visible tools.\nTested: npx jest tests/capability-filter.test.ts tests/tools/tabs.test.ts tests/tools/oc-performance-insights-registration.test.ts tests/cross-env/cursor-verification.test.ts --runInBand\nTested: npm run build\nTested: npm run lint:tier\nTested: git diff --check
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0611368659

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread scripts/lint-tools-capabilities.js Outdated
Comment on lines +59 to +60
const toolsDir = path.join(ROOT, 'src', 'tools');
const files = fs.readdirSync(toolsDir).filter(f => f.endsWith('.ts') && f !== 'index.ts');

This comment was marked as outdated.

Merge the current develop surface into the capability-gate PR and resolve stale test/mock conflicts without changing the capability filtering design. This preserves the single MCP options application path and keeps the branch eligible for CI and review.

Constraint: GitHub reported PR #924 as CONFLICTING against develop.

Rejected: Reworking capability categories during conflict resolution | latest Codex review on the branch had no actionable P1/P2 feedback.

Confidence: high

Scope-risk: narrow

Directive: Keep future changes limited to capability membership or CLI parsing unless review finds a functional gate bug.

Tested: npm run build; npx jest tests/capability-filter.test.ts tests/tools/tabs.test.ts tests/tools/oc-performance-insights-registration.test.ts tests/cross-env/cursor-verification.test.ts --runInBand; npm run lint:tier; git diff --check

Not-tested: full CI matrix pending on GitHub.

Co-authored-by: OmX <omx@oh-my-codex.dev>
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Another round soon, please!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Adjust the Cursor smoke expectation to the current develop Tier 1 surface after the capability-gate branch merge. CI showed 45 real Tier 1 tools plus expand_tools; capability filtering behavior itself is unchanged.

Constraint: GitHub CI for PR #924 failed only on a stale Tier 1 count assertion.

Rejected: Changing capability membership to satisfy the old count | the branch should reflect develop's actual registered surface.

Confidence: high

Scope-risk: narrow

Directive: Keep count-only updates tied to intentional Tier 1 membership changes.

Tested: npx jest tests/cross-env/cursor-verification.test.ts --runInBand; npm run build; npm run lint:tier; git diff --check

Not-tested: full CI matrix pending on GitHub.

Co-authored-by: OmX <omx@oh-my-codex.dev>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 51f1ea674c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/mcp-server.ts Outdated
Comment on lines +1095 to +1097
// Handle the expand_tools meta-tool before normal tool lookup
if (toolName === 'expand_tools') {
// If a specific tool name is requested (capability gate check), verify it is allowed

This comment was marked as outdated.

Normalize CRLF line endings when comparing the committed console-capture baseline fixture so Windows CI does not fail on checkout conversion. The capability-gate implementation remains unchanged.

Constraint: PR #924 was expected to hit the same Windows fixture failure after merging develop.

Rejected: Regenerating the fixture | the mismatch is platform line endings, not response shape drift.

Confidence: high

Scope-risk: narrow

Directive: Normalize text fixtures before comparing them to generated JSON strings across OSes.

Tested: npx jest tests/tools/console-capture-regression.test.ts --runInBand; npm run build; npm run lint:tier; git diff --check

Not-tested: full CI matrix pending on GitHub.

Co-authored-by: OmX <omx@oh-my-codex.dev>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b8890f828c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/mcp-server.ts Outdated
Comment on lines 910 to 914
if (this.exposedTier < 3 && this.clientSupportsListChanged) {
const hiddenCount = Array.from(this.tools.values()).filter(
r => getToolTier(r.definition.name) > this.exposedTier
r => getToolTier(r.definition.name) > this.exposedTier &&
this.isCapabilityAllowed(r.definition.capability)
).length;

This comment was marked as outdated.

//
// For pattern 2 we also scan the file for MCPToolDefinition `name:` fields.
// ---------------------------------------------------------------------------

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 Badge Include pilot tool sources in capability lint scan

The linter only scans src/tools/*.ts for registerTool(...) calls, so tools registered from outside that directory (notably oc_proxy_hook from src/pilot/proxy/hook) are invisible to the check. If such a tool is omitted from TOOL_CAPABILITY_MAP, CI will still pass and the tool will silently default to core, breaking intended capability-group filtering for pilot deployments.

Useful? React with 👍 / 👎.

shaun0927 and others added 3 commits May 13, 2026 07:01
Merge latest develop into PR #924 while preserving the branch-specific cursor and console fixture expectations.

Constraint: PR #962 changed shared transport files and CI fixtures after this branch was opened.\nRejected: Replacing branch-specific fixture semantics with develop's exact snapshot | these PRs intentionally alter tool visibility or fixture normalization expectations.\nConfidence: high\nScope-risk: narrow\nDirective: Treat fixture-only conflict refreshes as semantic preservation work; do not rewrite feature behavior while unblocking mergeability.\nTested: npx jest tests/tools/console-capture-regression.test.ts tests/cross-env/cursor-verification.test.ts --runInBand --forceExit (cross-env suite may skip locally by platform guard); git diff --check for touched fixture files.\nNot-tested: Full GitHub Actions matrix after this refresh.\nCo-authored-by: OmX <omx@oh-my-codex.dev>
Constraint: PR #924 must preserve current develop CLI/test additions while keeping disabled capability groups fully hidden and uncallable.\nRejected: only filtering concrete registered tools | virtual expand_tools is core-tagged and must be gated the same way.\nConfidence: high\nScope-risk: moderate\nDirective: Keep capability lint scanning every production source that registers tools, not just src/tools.\nTested: npm test -- --runTestsByPath tests/capability-filter.test.ts --runInBand; earlier targeted admin-keys and expand-tools-schema run passed except the now-fixed capability lint gap.\nNot-tested: full cross-platform CI matrix pending after push.
Capability filtering now tags the newly merged resumable crawl and skill replay surfaces, routes their registrations through the capability-injecting proxy, and refreshes the default-surface snapshot to match current develop.

Constraint: PR #924 must preserve default tools/list behavior while enforcing capability gates for every registered tool.
Rejected: bypassing the proxy for newly merged tools | direct server registration would omit capability metadata and reopen filter gaps.
Confidence: high
Scope-risk: moderate
Directive: Any new registerTool call must be represented in TOOL_CAPABILITY_MAP and go through the capability-injecting proxy unless it has an explicit tested exception.
Tested: npx tsc --noEmit; npm test -- --runTestsByPath tests/capability-filter.test.ts tests/mcp/expand-tools-schema.test.ts --runInBand; npm run lint:tools-capabilities
Not-tested: full cross-platform CI matrix pending after push.
Co-authored-by: OmX <omx@oh-my-codex.dev>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a98b26e2c9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/index.ts
Comment on lines +399 to +400
mcpOptions.capabilityFilter = new Set(requested);
console.error(`[openchrome] Capability filter (tools-only): ${requested.join(', ')}`);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Raise tier when capability filter excludes core tools

When --tools-only/--disable-tools sets a non-core capability filter, this code only sets capabilityFilter and leaves initialToolTier unset, so known progressive-disclosure clients (e.g. Claude/Cursor) stay at tier 1; combined with handleToolsList only injecting expand_tools when core is allowed, combinations like --tools-only=workflow or --disable-tools=core can produce an empty tools/list even though allowed tools exist at higher tiers (workflow/recording/crawl/totp), making those valid flag combinations effectively unusable unless users also pass --all-tools or know hidden tool names.

Useful? React with 👍 / 👎.

shaun0927 and others added 2 commits May 13, 2026 10:21
The CI schema linter pipes introspection stdout into the validator, so the CLI must return naturally after writing JSON and the newly merged crawl job schemas must satisfy the required-field description rule.

Constraint: build-and-test runs npm run lint:tool-schemas in a shell pipeline on every PR matrix job.
Rejected: keeping process.exit(0) after stdout.write | it can terminate before piped stdout flushes and leaves the validator with empty input.
Confidence: high
Scope-risk: narrow
Directive: Introspection paths must avoid hard process exits after asynchronous stdout writes.
Tested: npm run build; npm run lint:tool-schemas; npm run lint
Not-tested: full GitHub Actions matrix pending after push.
Co-authored-by: OmX <omx@oh-my-codex.dev>
The in-process CLI harness can capture log lines from unrelated async work before the admin JSON payload, so the JSON-array extractor now scans for the first slice that actually parses as an array instead of trusting the first '[' byte.

Constraint: full CI runs many suites in shared workers, and Windows/Linux logs showed [WorkflowEngine] noise reaching this harness stdout.
Rejected: weakening admin CLI assertions | the test still parses and validates the JSON payload and plaintext absence.
Confidence: high
Scope-risk: narrow
Directive: CLI JSON tests should parse the intended payload, not incidental bracketed log prefixes.
Tested: npm test -- --runTestsByPath tests/cli/admin-keys.test.ts --runInBand
Not-tested: full GitHub Actions matrix pending after push.
Co-authored-by: OmX <omx@oh-my-codex.dev>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a576942bd9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/mcp-server.ts
Comment on lines +1312 to +1313
if (!this.isCapabilityAllowed('core')) {
return {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Allow tier expansion when non-core capabilities are selected

Rejecting expand_tools whenever core is filtered out makes capability groups like workflow unusable under the default tiering flow: workflow_* tools are Tier 3, but --tools-only=workflow can no longer surface expand_tools, so clients cannot ever expand from Tier 1 to reach those allowed tools. In practice this yields an empty/near-empty tool surface for valid capability filters unless operators also know to set --all-tools/OPENCHROME_TOOL_TIER, which contradicts the CLI contract that --tools-only exposes the selected groups.

Useful? React with 👍 / 👎.

shaun0927 and others added 2 commits May 13, 2026 11:01
The capability-gate branch now includes the latest develop snapshot-ref baseline while preserving capability metadata injection for newly merged tools and the expanded capability linter coverage.

Constraint: develop advanced during PR #924 review and branch must remain mergeable while preserving default tools/list behavior.
Rejected: bypassing capability proxy for new tools | it would omit capability metadata and reopen filter gaps.
Confidence: high
Scope-risk: moderate
Directive: Any new registerTool call must be represented in TOOL_CAPABILITY_MAP and normally route through the capability-injecting proxy.
Tested: npm run build; npm run lint:tool-schemas; npm test -- --runTestsByPath tests/capability-filter.test.ts tests/mcp/expand-tools-schema.test.ts tests/transports/http-bearer-auth.test.ts --runInBand; npm run lint:tools-capabilities
Not-tested: full GitHub Actions matrix pending after push.
Co-authored-by: OmX <omx@oh-my-codex.dev>
The capability-gate branch now includes bounded tool-output observability metrics while preserving capability metadata injection, proxy routing, and expanded capability linter coverage.

Constraint: develop advanced during PR #924 review and branch must remain mergeable while preserving default tools/list behavior.
Rejected: leaving the autogenerated merge message | project history requires Lore trailers and OmX co-author attribution.
Confidence: high
Scope-risk: moderate
Directive: Any new registerTool call must keep capability metadata and metrics instrumentation compatible; do not bypass the capability-injecting proxy.
Tested: npm run build; npm run lint:tool-schemas; npm test -- --runTestsByPath tests/metrics/tool-output-observability.test.ts --runInBand
Not-tested: full GitHub Actions matrix pending after push.
Co-authored-by: OmX <omx@oh-my-codex.dev>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3755d703da

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/mcp-server.ts
Comment on lines +1428 to +1432
// Capability gate check: reject calls to tools excluded by --tools-only / --disable-tools
if (this.capabilityFilter) {
const registry = this.tools.get(toolName);
if (registry && !this.isCapabilityAllowed(registry.definition.capability)) {
const capability = registry.definition.capability ?? 'core';
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Apply capability filter when resolving internal tool handlers

The new capability gate is only enforced in handleToolsCall, but internal execution paths can still bypass it by resolving handlers directly. In this repo, execute_plan builds a PlanExecutor from mcpServer.getToolHandler(...) (see src/tools/orchestration.ts), so a server started with --disable-tools=crawl can still run crawl tools if a cached plan references them. That violates the new “max tool surface” guarantee and allows excluded capabilities to execute in workflow contexts; the resolver path should apply the same capability check and return null/CAPABILITY_DISABLED for blocked tools.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 42c67a52ff

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/tools/index.ts
* Absent entry → defaults to 'core' (P1 backward-compat).
* lint:tools-capabilities enforces that every registered tool appears here.
*/
export const TOOL_CAPABILITY_MAP: Record<string, ToolCapability> = {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Add missing registered tools to capability map

TOOL_CAPABILITY_MAP is introduced as authoritative, but it omits currently registered tools (e.g. oc_progress_status, oc_reflect, oc_task_*, and oc_run_*), so the new lint:tools-capabilities check fails and the new capability-filter test that expects that lint to pass cannot succeed. This is reproducible on commit c1511083 by running node scripts/lint-tools-capabilities.js, which exits non-zero with missing tool names; it also means these tools silently fall back to 'core' instead of explicit grouping.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d53b70cb61

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/tools/index.ts
* Absent entry → defaults to 'core' (P1 backward-compat).
* lint:tools-capabilities enforces that every registered tool appears here.
*/
export const TOOL_CAPABILITY_MAP: Record<string, ToolCapability> = {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Add all registered tools to TOOL_CAPABILITY_MAP

The capability map is now treated as authoritative, but this object still omits registered tools (oc_progress_status, oc_reflect, oc_task_*, and oc_run_*), so those tools never get explicit capability tags. Fresh evidence: running node scripts/lint-tools-capabilities.js on this commit exits non-zero and reports 11 missing tools, which means the new lint:tools-capabilities check (and the related capability-filter test expectations) cannot pass reliably and these tools silently fall back to core.

Useful? React with 👍 / 👎.

Comment thread package.json
Comment on lines 81 to 86
"dependencies": {
"argon2": "^0.43.0",
"cheerio": "1.0.0-rc.12",
"commander": "^12.0.0",
"jose": "^5.10.0",
"proper-lockfile": "^4.1.2",
"puppeteer-core": "npm:rebrowser-puppeteer-core@23.10.3",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restore cheerio and turndown as direct dependencies

This change drops cheerio and turndown from dependencies, but production code still imports both in src/core/extract/html-to-markdown.ts (lines 1-2). On clean installs where those modules are not present transitively, loading extraction paths will fail with MODULE_NOT_FOUND, causing runtime breakage in markdown/content extraction flows.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 321d53042e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


function collectRegisteredSourceFiles(dir, out = []) {
for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
if (entry.name === 'index.ts') continue;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Scan index.ts files when linting capability coverage

Do not skip every index.ts here: this creates false negatives in the new capability lint. The current walk ignores files like src/pilot/dynamic-skills/index.ts, which already calls state.server.registerTool(...), so missing capability tagging in those registrations can pass CI undetected and silently fall back to 'core' behavior instead of the intended group filtering.

Useful? React with 👍 / 👎.

Comment thread src/mcp-server.ts
Comment on lines +1222 to +1223
if (!this.capabilityFilter) return true;
return this.capabilityFilter.has(capability ?? 'core');
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid defaulting untagged tools to core capability

Treating missing capability metadata as 'core' causes pilot dynamic-skill tools to bypass --disable-tools=pilot: synthesized tools are registered at runtime without a capability field (src/pilot/dynamic-skills/index.ts), so this fallback marks them core and keeps them exposed/callable whenever core is allowed. That breaks the new CLI contract that disabled capability groups are removed from the tool surface.

Useful? React with 👍 / 👎.

shaun0927 added 4 commits May 13, 2026 19:58
The lint:tools-capabilities script regressed once develop landed 12 new
tools (#1062 normalize, #1060 progress, #1019 reflect, the #855 task
ledger and the run-harness ledger) — each missing from the capability
map. The capability-filter test then trips because filteredNames sees
develop-era tools that the filter doesn't classify.

Add 'core' entries for all 12 new tools so the filter routes them
through the default capability and lint stays green.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ccbfd773cf

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/mcp-server.ts
Comment on lines +1500 to +1503
// Capability gate check: reject calls to tools excluded by --tools-only / --disable-tools
if (this.capabilityFilter) {
const registry = this.tools.get(toolName);
if (registry && !this.isCapabilityAllowed(registry.definition.capability)) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Enforce capability gate on all tool execution paths

This capability check only runs inside handleToolsCall, so it does not protect internal callers that execute tools via direct handler lookup. In this codebase, execute_plan builds a PlanExecutor with mcpServer.getToolHandler(...) (src/tools/orchestration.ts), which means a server started with --disable-tools=<group> can still execute excluded tools when a plan references them. That breaks the new “maximum tool surface” guarantee and makes capability filtering inconsistent between direct MCP calls and workflow execution.

Useful? React with 👍 / 👎.

Comment thread src/mcp-server.ts
// Only inject expand_tools if the client supports notifications/tools/list_changed —
// otherwise there's no point since the client can't react to the notification.
if (this.exposedTier < 3 && this.clientSupportsListChanged) {
if (this.exposedTier < 3 && this.clientSupportsListChanged && this.isCapabilityAllowed('core')) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep tier expansion reachable for non-core capability filters

When core is filtered out (for example --tools-only=workflow), this condition suppresses expand_tools; combined with the default exposedTier = 1, higher-tier allowed tools (like workflow_* and execute_plan, tier 3) can become undiscoverable because tools/list returns no eligible tools and there is no expansion path. In practice, valid capability-filter combinations can yield an empty tool surface unless operators also know to set --all-tools/OPENCHROME_TOOL_TIER.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f42ab9fdda

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/index.ts
Comment on lines +102 to +103
.option('--tools-only <csv>', 'Expose only tools belonging to the specified capability groups (comma-separated). Valid values: core,crawl,recording,workflow,storage,profile,totp,pilot. Default: all groups exposed.')
.option('--disable-tools <csv>', 'Remove tools belonging to the specified capability groups (comma-separated). Valid values: core,crawl,recording,workflow,storage,profile,totp,pilot.')
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Honor capability flags in introspection mode

The new --tools-only/--disable-tools options are advertised globally, but when --introspect-tools-list is used the handler returns before capability parsing and before setMCPServerOptions(mcpOptions) runs, so introspection always emits the full unfiltered manifest. This makes automation that relies on --introspect-tools-list to verify filtered tool surfaces report incorrect results (e.g. --introspect-tools-list --tools-only=core still lists workflow/crawl tools).

Useful? React with 👍 / 👎.

@shaun0927 shaun0927 merged commit f267d92 into develop May 13, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 577b5dafc5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/tools/index.ts
* Absent entry → defaults to 'core' (P1 backward-compat).
* lint:tools-capabilities enforces that every registered tool appears here.
*/
export const TOOL_CAPABILITY_MAP: Record<string, ToolCapability> = {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Add oc_task_run tools to TOOL_CAPABILITY_MAP

TOOL_CAPABILITY_MAP is still incomplete: the seven oc_task_run_* tools registered by registerTaskRunTools are not listed, so the new capability lint path fails and those tools silently fall back to implicit 'core' classification instead of explicit grouping. Fresh evidence on this commit is that src/tools/task-run.ts registers oc_task_run_start/update/checkpoint/needs_help/complete/get/list (lines 250-256 there), while this map has no corresponding entries, which makes scripts/lint-tools-capabilities.js report missing tools and break the “all registered tools are tagged” invariant.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant