Skip to content

feat(host): tool category toggle / --slim mode (#847)#944

Closed
shaun0927 wants to merge 23 commits into
developfrom
feat/847-tool-category-toggle
Closed

feat(host): tool category toggle / --slim mode (#847)#944
shaun0927 wants to merge 23 commits into
developfrom
feat/847-tool-category-toggle

Conversation

@shaun0927
Copy link
Copy Markdown
Owner

@shaun0927 shaun0927 commented May 12, 2026

Progress / Review status

Auto-refreshed 2026-05-13 — owner comments cleaned up to reduce review noise.

Field Value
Branch feat/847-tool-category-toggledevelop
Draft no
CI ✅ all 9 checks passing
Mergeable ❌ CONFLICTING
Review decision
Codex (latest) 💡 suggestions posted
Other reviewers (latest) chatgpt-codex-connector: commented
Head 878d6a7 — Keep PR branch mergeable with develop
Commits 14

Owner comment cleanup: 9 issue + 0 inline review comments deleted. Outstanding feedback from automated/external reviewers above is unchanged.


Summary

Implements issue #847 (r2) — chrome-devtools-mcp adoption epic, item C.

Adds per-category tool registration filtering. Operators can ship a slimmer surface to small-context model deployments via --slim or --enable-categories=<csv> / --disable-categories=<csv>. The reliability and observe categories are always-on (server must remain diagnosable even under aggressive filtering).

What changed

  • New src/tools/_shared/category.tsToolCategory enum + canonical TOOL_TO_CATEGORY map (70 tools, one canonical category each) + resolveEnabledCategories() implementing the resolution rules.
  • New src/resources/tools-disabled.tsopenchrome://tools/disabled MCP sidecar resource exposing the filtered names with restart hints (Restart openchrome with --enable-categories=<category>).
  • New scripts/lint-tool-categories.mjs + npm run lint:categories — fails CI if any registered tool is missing from TOOL_TO_CATEGORY or stale entries linger.
  • New tests/tools/registration-default.snapshot.test.ts — pins default tools/list to an explicit alphabetized list of 70 tools (P2 byte-parity).
  • New tests/tools/category-resolution.test.ts — locks resolution rules: slim → SLIM_CATEGORIES + always-on; enable subset → only those + always-on; disable subtracts; reliability+observe cannot be disabled even via combined flags.
  • Modified src/index.ts (commander) — --slim, --enable-categories=<csv>, --disable-categories=<csv> plus OPENCHROME_SLIM/OPENCHROME_ENABLE_CATEGORIES/OPENCHROME_DISABLE_CATEGORIES env mirrors. Flags win over env. CSV parse errors are fatal (exit 2).
  • Modified src/tools/index.ts — registration filtered through resolveEnabledCategories(); skipped tools are not registered (so tools/list does not include them).
  • Modified src/mcp-server.ts — registers the disabled-tools sidecar resource at startup.

9 files changed, +1291 / -103.

Test plan

  • npm run build — PASS (exit 0)
  • npm test -- --testPathPattern="category|registration-default" — 28/28 PASS
  • node scripts/lint-tool-categories.mjs — PASS (70 registered tools all categorized)
  • Real verification per issue feat(host): tool category toggle / --slim mode (chrome-devtools-mcp adoption C) #847 §"Real verification" — restart-and-introspect walkthrough using a tester openchrome instance (tools/list payload byte-size comparison default vs --slim, always-on reliability/observe assertion, disabled-tool sidecar resource lookup).

P2 parity (portability-harness contract)

  • Default behavior (no flags, no env) → tools/list returns the 70-tool v1.10.4/v1.11.0 surface, pinned by snapshot test.
  • Resolution rules guarantee reliability + observe are always included (an openchrome instance must remain diagnosable).
  • New CLI flags + env vars + sidecar resource — all opt-in; no side effects when the operator does not use them.

Closes #847.

🤖 Generated with Claude Code

@gemini-code-assist
Copy link
Copy Markdown

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@qodo-code-review
Copy link
Copy Markdown

ⓘ You've reached your Qodo monthly free-tier limit. Reviews pause until next month — upgrade your plan to continue now, or link your paid account if you already have one.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d9a0328e18

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/tools/index.ts Outdated
Comment on lines +323 to +327
allEnabled = false;
}
}

if (allEnabled) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Register mixed-category registrars independently

registerAllTools currently skips an entire registrar when any one of its tool categories is disabled, because allEnabled is cleared if one tool name fails the category check. In this commit, the orchestration registrar emits both worker_update (categorized as tabs) and multiple workflow_* tools, so --disable-categories=tabs unintentionally removes workflow functionality too, which breaks the promised per-category filtering behavior.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7d83f29e3a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/index.ts Outdated
Comment on lines +282 to +283
const slim =
options.slim === true || process.env.OPENCHROME_SLIM === '1';
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Let CLI category flags override env-driven slim mode

The slim toggle is computed as options.slim === true || OPENCHROME_SLIM === '1', so a globally set OPENCHROME_SLIM=1 forces slim mode even when the operator explicitly passes --enable-categories=... for a one-off run. Because resolveEnabledCategories() ignores enabled when slim is true, those CLI categories are silently discarded and the process starts with the wrong tool surface, which contradicts the stated "flags win over env vars" behavior.

Useful? React with 👍 / 👎.

Comment thread src/index.ts
Comment on lines +295 to +301
enabledCsv.length > 0
? parseCategoryCsv(
enabledCsv,
options.enableCategories
? '--enable-categories'
: 'OPENCHROME_ENABLE_CATEGORIES',
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Ignore enabled-category parse errors when slim is active

When slim mode is active, enabled is not used by resolveEnabledCategories(), but the code still parses and validates enabledCsv unconditionally. This causes startup to fail on irrelevant data (for example, OPENCHROME_ENABLE_CATEGORIES=bogus openchrome serve --slim exits with code 2), even though slim is documented to win over --enable-categories; the ignored branch should not be able to block server startup.

Useful? React with 👍 / 👎.

shaun0927 and others added 4 commits May 13, 2026 04:02
Adds per-category tool registration filtering. Operators can ship a
slimmer surface to small-context model deployments via --slim or
--enable-categories=<csv> / --disable-categories=<csv>. Reliability and
observe categories are always-on (server must remain diagnosable).

Default behavior (no flags) is byte-identical to v1.11.0 — snapshot test
pins tools/list parity. CI lint enforces that every registered tool
declares a category, and a sidecar MCP resource (openchrome://tools/disabled)
exposes the filtered names with restart hints.

Closes #847.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex P1: registerAllTools previously skipped an entire registrar when
any tool in it had a disabled category, so --disable-categories=tabs
removed orchestration's workflow_* tools too.

Replaces the all-or-nothing per-registrar branch with a Proxy-wrapped
server that delegates registerTool only when the tool's category is in
the enabled set. Each registrar now registers exactly the tools whose
categories are enabled; tools in the same registrar with disabled
categories are silently skipped.

Tests now cover the mixed-category orchestration registrar: tabs-only
disable removes worker_update but keeps workflow_* tools.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Keep --enable/--disable category invocations from being silently overridden by OPENCHROME_SLIM and ignore enabled-category data when slim mode is the active branch.

Constraint: --slim itself still wins over enabled categories when explicitly requested.

Rejected: Parsing ignored enabled-category CSV under slim | invalid ignored data should not block startup.

Confidence: medium

Scope-risk: narrow

Directive: Maintain CLI flag precedence over env defaults for one-off operator runs.

Tested: /Users/jh0927/openchrome/node_modules/.bin/tsc -p tsconfig.json --pretty false; npx jest --config jest.config.js --runInBand tests/tools/category-resolution.test.ts tests/tools/registration-default.snapshot.test.ts

Not-tested: Full GitHub Actions matrix before push
Constraint: PR #944 was rebased onto develop, whose default tool surface includes newly merged observe/context/performance tools.\nRejected: Removing newly merged tools from REGISTRATION_ENTRIES | that would regress current develop behavior and broaden this PR beyond category filtering.\nConfidence: high\nScope-risk: narrow\nDirective: Update category snapshots whenever default registrations intentionally change.\nTested: npx tsc -p tsconfig.json --pretty false; npx jest --config jest.config.js --runInBand tests/tools/category-resolution.test.ts tests/tools/registration-default.snapshot.test.ts --silent\nNot-tested: full CI matrix
@shaun0927 shaun0927 force-pushed the feat/847-tool-category-toggle branch from 7d83f29 to 518c12e Compare May 12, 2026 19:19
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Swish!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

shaun0927 added 4 commits May 13, 2026 04:29
Constraint: CI runs the Cursor cross-env suite with OPENCHROME_RUN_CROSS_ENV=1, where develop now exposes 46 non-expand tier-one tools.\nRejected: Reverting new tier-one registrations | those came from current develop and are outside PR #944's scope.\nConfidence: high\nScope-risk: narrow\nDirective: Update this count when intentional tier-one tool exposure changes.\nTested: npx jest --config jest.config.js --runInBand tests/cross-env/cursor-verification.test.ts (skipped locally without cross-env flag)\nNot-tested: full CI matrix
Constraint: This branch inherits develop's 45-tool tier-one surface and Windows CI checks out fixtures with CRLF.\nRejected: Reverting develop tool registrations | outside PR #944 scope and would reintroduce merge drift.\nConfidence: high\nScope-risk: narrow\nDirective: Keep Cursor tier-one counts branch-specific and normalize fixture newlines for generated JSON.\nTested: npx jest --config jest.config.js --runInBand tests/tools/console-capture-regression.test.ts tests/cross-env/cursor-verification.test.ts\nNot-tested: full CI matrix
Constraint: The registration guard must keep the same source-level off-switch contract asserted by the #846 regression test.\nRejected: Updating the test to a different but equivalent return guard | this PR is about category filtering, so preserving develop's guard shape is lower risk.\nConfidence: high\nScope-risk: narrow\nDirective: Keep OPENCHROME_PERF_INSIGHTS default-on and only skip these tools when explicitly set to 0.\nTested: npx jest --config jest.config.js --runInBand tests/tools/oc-performance-insights-registration.test.ts tests/tools/category-resolution.test.ts --silent; npx tsc -p tsconfig.json --pretty false\nNot-tested: full CI matrix
Constraint: Windows CI can interleave unrelated Jest worker output into the in-process stdout hook.\nRejected: Requiring stdout to contain only one physical line | that assertion flakes when shared process output is polluted outside the CLI path.\nConfidence: high\nScope-risk: narrow\nDirective: Assert the secret token appears exactly once and never in stderr; do not depend on unrelated captured stdout being empty.\nTested: npx jest --config jest.config.js --runInBand tests/cli/admin-keys.test.ts\nNot-tested: full CI matrix
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Delightful!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Merge latest develop into PR #944 and keep the shared cursor/console fixture expectations aligned after the server-to-client primitive landed.

Constraint: PR #962 updated the same cross-environment and console fixture tests, leaving this PR blocked by fixture-only conflicts.\nRejected: Rewriting feature code during conflict refresh | the merge conflict was limited to shared CI fixtures.\nConfidence: high\nScope-risk: narrow\nDirective: Keep conflict refreshes limited to shared fixture expectations unless feature code conflicts require design review.\nTested: npx jest tests/tools/console-capture-regression.test.ts tests/cross-env/cursor-verification.test.ts --runInBand --forceExit (cross-env suite may skip locally by platform guard); git diff --check for touched fixture files.\nNot-tested: Full GitHub Actions matrix after this refresh.\nCo-authored-by: OmX <omx@oh-my-codex.dev>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 335d7ad491

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/tools/index.ts
Comment on lines +456 to +457
for (const entry of REGISTRATION_ENTRIES) {
entry.register(filteredServer);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Skip fully disabled registrars before invoking side effects

The registration loop invokes every entry.register(...) even when none of that entry's tools are in enabled categories, so category filtering only suppresses registerTool calls but not registrar side effects. In this commit, that means --disable-categories=pilot still executes the oc_proxy_hook registrar path (including require('../pilot/proxy/hook') when pilot flags are set), which can load optional pilot code or fail startup despite the category being explicitly disabled. This breaks the expectation that disabled categories are excluded at startup, not just hidden from tools/list.

Useful? React with 👍 / 👎.

Merge current develop into the category-toggle branch and keep disabled registrars from loading optional pilot code when their entire advertised tool surface is filtered out.

Constraint: PR #944 was GitHub-CONFLICTING and had Codex feedback that disabled categories must not invoke registrar side effects.

Rejected: invoking every registrar behind only a registerTool proxy | still loads optional modules for fully disabled categories.

Confidence: high

Scope-risk: moderate

Directive: Keep per-tool filtering for mixed-category registrars; only skip registrars when all advertised tools are disabled.

Tested: npm test -- tests/tools/category-resolution.test.ts --runInBand; npm run build

Not-tested: full cross-platform CI is left to GitHub Actions after push.

Co-authored-by: OmX <omx@oh-my-codex.dev>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5cc31c01e0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/index.ts
Comment on lines +419 to +423
: options.enableCategories ??
process.env.OPENCHROME_ENABLE_CATEGORIES ??
'';
const disabledCsv =
options.disableCategories ??
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Treat category CLI flags as full override of env selection

The category resolution still mixes in OPENCHROME_ENABLE_CATEGORIES/OPENCHROME_DISABLE_CATEGORIES whenever the corresponding CLI flag is omitted, so a one-off flag can be silently constrained by stale environment state and produce the wrong tool surface. For example, with OPENCHROME_DISABLE_CATEGORIES=vision, running --enable-categories=vision still removes vision and leaves only always-on categories. That contradicts the stated "flags win over env vars" behavior and can break operator expectations in production deployments that rely on env defaults.

Useful? React with 👍 / 👎.

Merge latest develop and categorize the new crawl and skill-replay tools so category filtering remains complete without invoking disabled pilot registrars.

Constraint: GitHub reported PR #944 as CONFLICTING after develop added crawl job and skill replay registrations.

Rejected: switching registerAllTools to async unconditionally | category filtering must preserve the synchronous registry contract.

Confidence: high

Scope-risk: moderate

Directive: Every new MCP tool must be added to TOOL_TO_CATEGORY before the registrar can be safely filtered.

Tested: npm test -- tests/tools/category-resolution.test.ts --runInBand; npm run build

Not-tested: full cross-platform CI is left to GitHub Actions after push.

Co-authored-by: OmX <omx@oh-my-codex.dev>
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b2ad97b3b6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/index.ts
Comment on lines 115 to 116
if (options.introspectToolsList) {
const { MCPServer } = await import('./mcp-server');
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Honor category flags in introspect-tools-list mode

The --introspect-tools-list early-return path registers tools with registerAllTools(server) before any category-selection parsing runs, so --slim / --enable-categories / --disable-categories (and their env mirrors) are ignored in introspection mode. This makes commands like openchrome serve --introspect-tools-list --slim report the full default tool surface instead of the actual filtered one, which can mislead rollout checks and any automation that relies on introspection output to verify startup configuration.

Useful? React with 👍 / 👎.

shaun0927 and others added 2 commits May 13, 2026 10:00
Flush the introspection JSON before process exit and mark new crawl job required fields with the schema-lint prefix so CI receives complete valid JSON.

Constraint: GitHub CI failed lint:tool-schemas with truncated stdin and required-prefix violations after the latest develop merge.

Rejected: Rerunning CI without code changes | local reproduction showed stdout could be truncated before the pipe consumed it.

Confidence: high

Scope-risk: narrow

Directive: Do not call process.exit immediately after writing machine-readable stdout; exit in the write callback or return naturally.

Tested: npm run build; npm run lint:tool-schemas

Not-tested: full matrix CI after this push.

Co-authored-by: OmX <omx@oh-my-codex.dev>
Constraint: The category-toggle PR must not change crawl registration semantics while CI expects the default tool snapshot to match the merged tool surface.
Rejected: Hiding crawl job tools behind the category-toggle change | that would alter #944 behavior instead of fixing the stale test baseline.
Confidence: high
Scope-risk: narrow
Directive: Update this snapshot whenever REGISTRATION_ENTRIES adds or removes default tools.
Tested: npm test -- tests/tools/registration-default.snapshot.test.ts --runInBand; npm run build; npm run lint:tool-schemas
Not-tested: Full CI matrix.

Co-authored-by: OmX <omx@oh-my-codex.dev>
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Constraint: develop advanced while this PR was under review, making GitHub report the branch as conflicting.
Rejected: Reworking the feature scope during conflict resolution | only the develop merge and introspection flush conflict were required.
Confidence: high
Scope-risk: narrow
Directive: Preserve the awaited stdout flush in introspection mode when resolving future src/index.ts conflicts.
Tested: npm run build; npm run lint:tool-schemas
Not-tested: Full CI matrix after this merge commit.

Co-authored-by: OmX <omx@oh-my-codex.dev>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 878d6a7bbb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/index.ts
Comment on lines 115 to 116
if (options.introspectToolsList) {
const { MCPServer } = await import('./mcp-server');
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Honor category flags in introspection path

The new --slim / --enable-categories / --disable-categories options are advertised on serve, but the --introspect-tools-list fast path still calls registerAllTools(server) with no selection. In practice, openchrome serve --introspect-tools-list --slim (or with category env vars) returns the full tool manifest instead of the filtered one, which makes introspection inconsistent with normal startup and can mislead CI/operators validating slim mode.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d32f0221f0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/tools/index.ts
Comment on lines +397 to +398
// Doctor report tool (#898)
{ tools: ['oc_doctor_report'], register: registerOcDoctorReportTool },
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Re-add task-ledger tools to REGISTRATION_ENTRIES

This refactor removes the oc_task_start/oc_task_list/oc_task_get/oc_task_cancel/oc_task_wait registrations (and the startup orphan-reap path tied to them), so upgrading to this commit makes those APIs disappear as unknown tool despite their implementations still existing under src/tools/oc-task-*.ts. Any automation depending on background task execution or task polling will break immediately.

Useful? React with 👍 / 👎.

Comment thread src/tools/index.ts
Comment on lines +327 to +330

// Self-healing tools
{ tools: ['oc_connection_health'], register: registerConnectionHealthTool },

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep continuity diagnostics tools registered by default

The continuity block now registers oc_session_snapshot, oc_session_resume, and oc_journal but no longer includes oc_reflect or oc_progress_status, even though their tool modules (and docs/tests for oc_progress_status) remain in-tree. This silently shrinks the default tool surface and causes callers of those existing public tool names to fail with unknown tool after upgrade.

Useful? React with 👍 / 👎.

Comment thread src/tools/index.ts
Comment on lines +434 to +438
// Deterministic observation / DevTools / portable context
{ tools: ['oc_observe'], register: registerOcObserveTool },
{ tools: ['oc_devtools_url'], register: registerOcDevToolsUrlTool },
{ tools: ['oc_context_export', 'oc_context_import'], register: registerOcContextTools },
];
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restore run-harness registration when flag is enabled

The prior isRunHarnessEnabled() gate and registerRunHarnessTools(server) call are gone, so OPENCHROME_RUN_HARNESS=1 no longer exposes oc_run_start/oc_run_status/oc_run_events/oc_run_finish. Because src/mcp-server.ts still contains run-harness plumbing, this commit leaves the feature flag wired but non-functional, breaking run-ledger workflows in environments that enable it.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 48fb64b834

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/tools/index.ts
Comment on lines +377 to 379
{ tools: ['oc_assert'], register: registerOcAssertTool },
{ tools: ['oc_evidence_bundle'], register: registerOcEvidenceBundleTool },

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Re-register oc_normalize_action in tool registry

The REGISTRATION_ENTRIES refactor drops oc_normalize_action from startup registration, so after this commit hosts that call that existing public tool now get unknown tool. This is a behavior regression introduced here (the tool implementation and tests still exist in src/tools/oc-normalize-action.ts and tests/tools/oc-normalize-action.test.ts), and it contradicts the commit’s stated default-surface parity goal.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 17ce74c2ff

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/tools/index.ts
Comment on lines +435 to +438
{ tools: ['oc_observe'], register: registerOcObserveTool },
{ tools: ['oc_devtools_url'], register: registerOcDevToolsUrlTool },
{ tools: ['oc_context_export', 'oc_context_import'], register: registerOcContextTools },
];
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Re-register task-run tools in default registry

The new REGISTRATION_ENTRIES list no longer includes the registerTaskRunTools path, so oc_task_run_start/update/checkpoint/needs_help/complete/get/list disappear from tools/list and now return unknown tool after upgrade. This is a behavior regression introduced in this commit because the TaskRun tool implementation still exists in src/tools/task-run.ts (and read-scope wiring still references TaskRun names in src/auth/scope-policy.ts), so existing TaskRun-based automations lose their lifecycle API even when no category filtering flags are used.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4ac0f1c107

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/tools/index.ts
Comment on lines +227 to +230
const original = Reflect.get(target, prop, target) as (
...args: unknown[]
) => void;
original.call(target, name, ...rest);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restore capability tagging in filtered registerTool path

The new proxy forwards registerTool without adding a definition.capability, but MCPServer still enforces capabilityFilter using that field (defaulting missing values to core). As a result, any run that sets capabilityFilter (including programmatic callers and existing capability-filter tests/scripts) misclassifies workflow/recording/crawl tools as core, so filtering and CAPABILITY_DISABLED behavior become incorrect.

Useful? React with 👍 / 👎.

Comment thread src/index.ts
Comment on lines +110 to +112
.option('--slim', `Register only the slim-mode tool categories (chrome-devtools-mcp parity). Always-on categories (reliability, observe) are kept. Also: OPENCHROME_SLIM=1 env var.\n${categoryHelpText()}`)
.option('--enable-categories <csv>', 'Comma-separated allow-list of tool categories to register. Mutually exclusive with --slim (slim wins). Also: OPENCHROME_ENABLE_CATEGORIES env var.')
.option('--disable-categories <csv>', 'Comma-separated deny-list of tool categories to skip. Always-on categories cannot be disabled. Also: OPENCHROME_DISABLE_CATEGORIES env var.')
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep legacy capability-filter CLI flags available

This refactor drops --tools-only/--disable-tools from serve, which makes existing startup commands using those flags fail at argument parsing. The capability-filter mechanism is still present in the server/types layer, so removing only the CLI entry points introduces a backward-incompatible regression for operators who rely on those flags in launch scripts.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2c625b0035

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/tools/index.ts
Comment on lines +347 to +351
'oc_recording_start',
'oc_recording_stop',
'oc_recording_list',
'oc_recording_export',
],
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 Badge Add oc_recording_status to recording registration metadata

The recording registrar entry omits oc_recording_status, but registerRecordingTools still registers that tool in src/tools/recording.ts; because TOOL_TO_CATEGORY also lacks oc_recording_status, the filtered registerTool proxy throws at startup (Tool "oc_recording_status" has no category...) as soon as the recording registrar runs. This is a default-path boot failure (no special flags needed), so registerAllTools cannot complete until this tool is included in the registration/category metadata (or no longer registered).

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 73e855b017

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/index.ts
Comment on lines 386 to +390
if (options.allTools || envTier >= 3) {
mcpOptions.initialToolTier = 3 as ToolTier;
setMCPServerOptions({ initialToolTier: 3 as ToolTier });
console.error('[openchrome] All tools exposed from startup');
} else if (envTier === 2) {
mcpOptions.initialToolTier = 2 as ToolTier;
setMCPServerOptions({ initialToolTier: 2 as ToolTier });
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reset MCP server options when no tool tier override is set

This branch only calls setMCPServerOptions(...) when --all-tools or OPENCHROME_TOOL_TIER is 2/3, but setMCPServerOptions is a replace-only global setter; if serve is invoked again in the same process (e.g., integration harnesses or embedded usage), a previous non-default tier persists and silently changes the next startup’s tool exposure. Before this commit the code always called the setter with an explicit object (including {}), so this introduces stale cross-invocation state.

Useful? React with 👍 / 👎.

Comment thread src/tools/index.ts
Comment on lines +346 to +350
tools: [
'oc_recording_start',
'oc_recording_stop',
'oc_recording_list',
'oc_recording_export',
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Include oc_recording_status in recording metadata list

registerRecordingTools registers oc_recording_status, but this registrar metadata omits that name, so category metadata is inconsistent with actual registration. Fresh evidence in this revision: TOOL_TO_CATEGORY now contains oc_recording_status, which avoids the old startup-throw path, but this omission now causes lint:categories to fail with a stale category entry and makes openchrome://tools/disabled under-report disabled capture tools.

Useful? React with 👍 / 👎.

@shaun0927
Copy link
Copy Markdown
Owner Author

Forthcoming review note: the author addressed Codex P0 (oc_recording_status added to TOOL_TO_CATEGORY, commit 73e855b). Three Codex P2 suggestions remain open and are deferred for follow-up since they are non-blocking for the category-toggle scope:

CI is currently blocked on develop's broken test signal; PR #1105 is in flight.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 52926d4488

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +35 to +38
let snapshot: { tools: DisabledToolEntry[]; capturedAt: string } = {
tools: [],
capturedAt: new Date(0).toISOString(),
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep disabled-tools snapshot isolated per server

The disabled-tools payload is stored in a module-level snapshot, so each setDisabledToolsSnapshot(...) call overwrites state for every MCPServer in the process. When two servers are started with different category selections (e.g., embedded/integration usage), openchrome://tools/disabled on the first server will report the second server’s filtered tools, which makes the resource inaccurate for that instance. This snapshot needs to be instance-scoped (or keyed) rather than global mutable state.

Useful? React with 👍 / 👎.

@shaun0927
Copy link
Copy Markdown
Owner Author

Develop CI restored, but this PR still has PR-specific test failures

PR #1105 has merged to develop and the develop CI baseline is now green (run 25808614236 was 9/9 on the test-hardened commit). After updating this branch via update-branch, current CI status is pass=0 fail=9 — the failures are PR-specific (not inherited from develop) and need author-driven debugging.

Addressed in this session

Author follow-up

  • Inspect this PR's failed jobs in the latest CI run; the failing suites are not inherited from develop. Common patterns I observed across the Group A PRs after the develop rebase: TaskJournal/journal summary regressions, snapshot drift in registration-default.snapshot.test.ts, and PR-feature test files (extract-data-modes, output-handles, etc.).
  • No outstanding Codex P0/P1/P2 on the latest commit; the PR is otherwise content-clean and ready for a final review once the PR-feature tests pass against the updated develop baseline.

@shaun0927
Copy link
Copy Markdown
Owner Author

Reviewed for merge readiness: not safe to merge in the current PR shape.

The product direction—tool category toggles / slim mode—is reasonable, but this branch is not currently a valid merge unit. It is DIRTY against develop, has 9 failing CI checks, and bot review identifies contract-breaking issues across registration and CLI behavior: mixed-category registrars can be skipped incorrectly, CLI flags do not consistently override env/slim mode, disabled categories can still invoke side effects, several default/flagged tool registrations are dropped, capability tagging regresses, legacy filter flags are removed, and oc_recording_status metadata is missing.

Those are not cosmetic failures; they change the exposed tool surface and host compatibility. This needs a fresh, current-develop implementation with regression tests for default registration, slim/tier filtering, CLI/env precedence, and disabled-tool isolation before it can be considered for merge.

Decision: not merge-ready; leaving unmerged.

@shaun0927
Copy link
Copy Markdown
Owner Author

Closing as superseded by fresh narrow PR #1196 for #847. This stale branch is conflicting and failing checks; the replacement keeps the scope to the requested --slim/core tool-surface alias with fresh verification.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant