
feat(inspect): add opt-in output token metrics (#981) #1100

Merged
shaun0927 merged 8 commits into feat/990-token-metrics from feat/981-inspect-metrics
May 13, 2026

Conversation

Owner

@shaun0927 shaun0927 commented May 12, 2026

Progress / Review status

Auto-refreshed 2026-05-13 — owner comments cleaned up to reduce review noise.

Field Value
Branch feat/981-inspect-metrics → feat/990-token-metrics
Draft no
CI
Mergeable ❌ CONFLICTING
Review decision
Codex (latest)
Other reviewers (latest)
Head 33d0204 — Expose focused inspect metrics without changing defaults
Commits 2

Owner comment cleanup: 0 issue comments and 0 inline review comments deleted. Outstanding feedback from automated/external reviewers above is unchanged.


Summary

Refs #981. Stacked on #1077; this PR should merge after #1077.

Direction / non-overlap review

Verification

  • npm test -- --runInBand tests/tools/inspect-metrics.test.ts
  • npm run build
  • npm run lint:changed
  • git diff --check

Post-merge OpenChrome validation checklist

After #1077 and this PR merge:

  1. Build and start OpenChrome:
    npm run build
    node dist/cli/index.js serve
  2. In an MCP client connected to a real Chrome session:
    • navigate to https://github.com/browser-use/browser-use
    • run inspect with { "query": "repository title, stars, latest visible release link" }
    • run the same inspect call with { "include_metrics": true }
  3. Pass criteria:
    • default inspect output has no [openchrome_metrics] footer.
    • metrics-enabled output includes a single [openchrome_metrics] JSON footer.
    • returned_chars equals the body length before the footer.
    • estimated_tokens === Math.ceil(returned_chars / 4).
    • no raw secret/password fixture text appears inside the metrics JSON.
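The footer arithmetic in the pass criteria can be sketched as a small standalone check. This is a hypothetical validation helper, not project code: it assumes the `[openchrome_metrics]` marker directly precedes the JSON footer on the same output string, as the criteria above describe.

```typescript
// Hypothetical checker for the pass criteria above. Only the footer marker
// and the two field names come from this PR; everything else is illustrative.
interface OutputMetrics {
  returned_chars: number;
  estimated_tokens: number;
}

const FOOTER_MARKER = "[openchrome_metrics]";

function checkMetricsFooter(output: string): boolean {
  const idx = output.lastIndexOf(FOOTER_MARKER);
  if (idx === -1) return false; // no footer at all
  const body = output.slice(0, idx);
  const metrics: OutputMetrics = JSON.parse(
    output.slice(idx + FOOTER_MARKER.length).trim()
  );
  // returned_chars must equal the body length before the footer, and
  // estimated_tokens must follow the ceil(chars / 4) heuristic.
  return (
    metrics.returned_chars === body.length &&
    metrics.estimated_tokens === Math.ceil(metrics.returned_chars / 4)
  );
}
```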

shaun0927 added 2 commits May 13, 2026 00:05
Adds provider-neutral estimated token and output-size metrics for crawl, crawl_sitemap, read_page, and validate_page so optimization claims can be verified without changing legacy responses by default.

Constraint: Metrics must be opt-in for text surfaces and approximate rather than provider-specific.
Rejected: Exact tokenizer integration | would add dependency and overstate provider compatibility.
Confidence: high
Scope-risk: narrow
Directive: Keep strict compression-ratio gates on committed fixtures, not public live pages.
Tested: npm run build -- --pretty false; npm test -- --runInBand tests/core/metrics/token-estimate.test.ts tests/core/tools/crawl.engine.test.ts; npm run lint:changed
Not-tested: Full npm test suite; live OpenChrome smoke.
Constraint: #1077 already covers read_page/crawl/validate metrics, so this stacked slice only adds inspect support on top of that open metrics branch.
Rejected: Reimplementing #981 read_page response telemetry on develop | would duplicate open PRs #1063 and #1077.
Confidence: high
Scope-risk: narrow
Directive: Keep include_metrics opt-in and provider-neutral; do not imply exact LLM billing.
Tested: npm test -- --runInBand tests/tools/inspect-metrics.test.ts; npm run build; npm run lint:changed; git diff --check
Not-tested: Live OpenChrome MCP smoke; dependency npm audit still reports existing 6 vulnerabilities.
@gemini-code-assist

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@qodo-code-review

ⓘ You've reached your Qodo monthly free-tier limit. Reviews pause until next month — upgrade your plan to continue now, or link your paid account if you already have one.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a0093361c3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/tools/read-page.ts
Comment on lines +567 to 569
} catch {
// DOM serialization failed — fall through to AX mode as fallback
}

P1 Badge Return DOM-mode errors instead of silently falling back

This catch swallows DOM serialization failures and then falls through to the AX code path, so a request with mode: "dom" can return AX output instead of an error. That changes the contract for explicit DOM callers (they can no longer distinguish DOM failure from successful DOM output) and can break downstream parsers that rely on DOM-mode formatting.
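The suggested fix can be sketched as gating the fallback on the requested mode. This is a minimal illustration, assuming a simplified tool-result shape; the serializer callbacks are placeholders, not the PR's actual functions.

```typescript
// Sketch: only fall back to AX when the caller did not explicitly ask for
// DOM mode. `serializeDom` / `serializeAx` are hypothetical placeholders.
type Mode = "ax" | "dom";

interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

async function readPage(
  mode: Mode,
  serializeDom: () => Promise<string>,
  serializeAx: () => Promise<string>
): Promise<ToolResult> {
  try {
    return { content: [{ type: "text", text: await serializeDom() }] };
  } catch (err) {
    if (mode === "dom") {
      // Explicit DOM callers get a real error, not silently swapped AX output.
      const msg = err instanceof Error ? err.message : String(err);
      return {
        content: [{ type: "text", text: `Error: DOM serialization failed: ${msg}` }],
        isError: true,
      };
    }
    return { content: [{ type: "text", text: await serializeAx() }] };
  }
}
```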

Useful? React with 👍 / 👎.

Comment thread src/tools/read-page.ts
Comment on lines +872 to 873
if (config.security?.sanitize_content === false || result.isError || !result.content) {
return result;

P1 Badge Honor include_metrics when sanitization is disabled

With this early return, include_metrics: true is ignored whenever sanitize_content is disabled, because metrics are appended only later in the sanitization branch. In environments that run with --no-sanitize-content, callers now get no metrics despite explicitly requesting them.
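One way to address this is to hoist metrics appending out of the sanitization branch so the two options compose independently. A minimal sketch, with all identifiers illustrative rather than the PR's actual names:

```typescript
// Sketch: metrics are appended after the (optional) sanitization step, so
// include_metrics works even when sanitization is disabled.
interface Result { text: string }

function estimateTokens(chars: number): number {
  return Math.ceil(chars / 4);
}

function finalize(
  result: Result,
  opts: { sanitize: boolean; includeMetrics: boolean },
  sanitize: (t: string) => string
): Result {
  let text = opts.sanitize ? sanitize(result.text) : result.text;
  if (opts.includeMetrics) {
    const metrics = {
      returned_chars: text.length,
      estimated_tokens: estimateTokens(text.length),
    };
    text += `\n[openchrome_metrics] ${JSON.stringify(metrics)}`;
  }
  return { text };
}
```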

Useful? React with 👍 / 👎.

Comment thread src/tools/read-page.ts
const maxDepth = filter === 'interactive'
? Math.min(requestedDepth ?? defaultDepth, defaultDepth)
: requestedDepth ?? defaultDepth;
const maxDepth = (args.depth as number) || defaultDepth;

P2 Badge Preserve interactive depth cap in AX tree requests

Using (args.depth as number) || defaultDepth removes the prior interactive cap and also treats depth: 0 as unset. For filter: "interactive", callers can now request arbitrarily deep AX trees (e.g., depth 10+) instead of being bounded to the documented interactive default cap, which increases output size and runtime risk.
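The capped expression shown in the removed lines can be kept as a small helper. The `??` operator preserves `depth: 0` as an explicit request, and the interactive filter stays bounded by its default cap; the function wrapper itself is illustrative.

```typescript
// Sketch of the depth resolution the review asks to preserve: nullish
// coalescing keeps depth 0 meaningful, and the interactive filter is capped.
function resolveMaxDepth(
  requestedDepth: number | undefined,
  defaultDepth: number,
  filter: string
): number {
  return filter === "interactive"
    ? Math.min(requestedDepth ?? defaultDepth, defaultDepth)
    : requestedDepth ?? defaultDepth;
}
```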

Useful? React with 👍 / 👎.

Comment thread src/tools/crawl.ts Outdated
}
: { summary, pages: minimalPages, note: 'Content omitted due to size constraints' };
outputJson = JSON.stringify(minimalOutput, null, 2);
outputJson = JSON.stringify({ summary, pages: minimalPages, note: 'Content omitted due to size constraints' }, null, 2);

P2 Badge Keep include_metrics data in crawl size-fallback output

When crawl output exceeds MAX_OUTPUT_CHARS twice, this branch emits a minimal payload that omits all metrics even if include_metrics was requested. That makes include_metrics behavior inconsistent specifically on large crawls, where size/token estimates are most valuable and expected by clients.
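A sketch of how the fallback payload could keep metrics when requested. The `summary`/`pages`/`note` fields come from the snippet above; the `_metrics` shape is an assumption, and measuring the pre-metrics body is a deliberate approximation rather than an exact final-length count.

```typescript
// Sketch: the minimal size-fallback payload still carries _metrics when the
// caller asked for them. returned_chars measures the body before _metrics is
// added (an approximation, noted here on purpose).
interface MinimalOutput {
  summary: string;
  pages: unknown[];
  note: string;
  _metrics?: { returned_chars: number; estimated_tokens: number };
}

function buildFallback(summary: string, minimalPages: unknown[], includeMetrics: boolean): string {
  const minimal: MinimalOutput = {
    summary,
    pages: minimalPages,
    note: "Content omitted due to size constraints",
  };
  let json = JSON.stringify(minimal, null, 2);
  if (includeMetrics) {
    minimal._metrics = {
      returned_chars: json.length,
      estimated_tokens: Math.ceil(json.length / 4),
    };
    json = JSON.stringify(minimal, null, 2);
  }
  return json;
}
```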

Useful? React with 👍 / 👎.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a5ee0324c2


Comment thread src/tools/crawl.ts
Comment on lines 282 to 284
async function fetchPage(
sessionId: string,
url: string,

P0 Badge Restore fetchOnePage exports consumed by crawl runner

This change removes the exported fetchOnePage, FetchOnePageOptions, and FetchOnePageResult API from src/tools/crawl.ts, but src/core/crawl/runner.ts still imports those symbols (fetchOnePage as defaultFetchOnePage plus the two types). In this commit state, TypeScript compilation fails with missing-export errors, which blocks build/deploy for crawl job tooling.

Useful? React with 👍 / 👎.

Comment thread src/tools/read-page.ts
Comment on lines +940 to +943
if (includeMetrics) {
cleaned['_metrics'] = buildTextMetrics(JSON.stringify(cleaned), { mode: modeForMetrics });
}
return { ...block, text: JSON.stringify(cleaned) };

P1 Badge Compute semantic metrics from the final serialized JSON

When mode: "semantic" and include_metrics: true, metrics are calculated from JSON.stringify(cleaned) before _metrics is inserted, then the payload is stringified again with _metrics present. This guarantees returned_chars/estimated_tokens can be lower than the emitted text length, so callers receive inconsistent metrics specifically for semantic responses.
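One illustrative way to make the metrics agree with the emitted text is to measure a serialization that already contains a same-sized placeholder, then fill it in without changing the length. Note the deviation from numeric fields: zero-padded string values keep the byte count fixed. This is a sketch of the technique, not the PR's implementation.

```typescript
// Sketch: serialize once with fixed-width placeholder metrics, measure that
// draft, then substitute equally wide values so the final length is identical
// to the measured one. Values are zero-padded strings by design.
function withConsistentMetrics(payload: Record<string, unknown>): string {
  const PAD = 12; // fixed digit width so filling in values cannot change length
  const placeholder = {
    returned_chars: "0".repeat(PAD),
    estimated_tokens: "0".repeat(PAD),
  };
  const draft = JSON.stringify({ ...payload, _metrics: placeholder });
  const metrics = {
    returned_chars: String(draft.length).padStart(PAD, "0"),
    estimated_tokens: String(Math.ceil(draft.length / 4)).padStart(PAD, "0"),
  };
  return JSON.stringify({ ...payload, _metrics: metrics });
}
```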

Useful? React with 👍 / 👎.

Comment thread src/tools/read-page.ts
Comment on lines 846 to 847
return {
content: [{ type: 'text', text: pageStatsLine + output + axPaginationSection }],

P1 Badge Preserve AX refs map in read_page responses

The AX-mode return path now emits only content and drops the refs object entirely, even though the AX walk still mints ref IDs and downstream consumers/tests rely on result.refs metadata (role/name/TTL) to resolve stable element references. This is a response-shape regression for read_page(mode='ax') callers that expect the structured refs payload.

Useful? React with 👍 / 👎.

Comment thread src/tools/read-page.ts
Comment on lines +784 to 786
const output = lines.join('\n');
const includePaginationAx = args.includePagination !== false;
const axPaginationSection = includePaginationAx ? formatPaginationSection(await detectPagination(page, tabId)) : '';

P2 Badge Honor AX delta compression requests

This path now always joins and returns the full AX snapshot without checking args.compression === 'delta', so clients requesting AX delta mode receive full trees on every call. That regresses the existing compression contract and can significantly increase output size and latency on repeated reads of large pages.

Useful? React with 👍 / 👎.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 961c01b563


Comment thread src/tools/read-page.ts
Comment on lines 966 to 968
export function registerReadPageTool(server: MCPServer): void {
server.registerTool('read_page', sanitizedHandler, definition);
}

P0 Badge Re-export readPageHandlerForReuse

This file no longer exports readPageHandlerForReuse, but src/tools/_shared/return-after-state.ts still imports and calls that symbol (import at line 16, call at line 96). In this commit state, TypeScript compilation fails with a missing-export error, which blocks builds and breaks every tool path that includes return-after-state chaining.

Useful? React with 👍 / 👎.

Comment thread src/tools/read-page.ts
Comment on lines +142 to 145
if (mode !== 'ax' && mode !== 'dom' && mode !== 'css' && mode !== 'semantic') {
return {
content: [{ type: 'text', text: `Error: Invalid mode "${mode}". Must be "ax", "dom", "css", "semantic", or "markdown".` }],
content: [{ type: 'text', text: `Error: Invalid mode "${mode}". Must be "ax", "dom", "css", or "semantic".` }],
isError: true,

P1 Badge Keep markdown mode accepted in read_page

The mode gate now rejects mode: "markdown", so existing callers receive an error instead of markdown output. This is a behavior regression for an already-supported mode (including onlyMainContent/includeLinks options), and it directly conflicts with current markdown contract coverage in tests/tools/read-page.test (Markdown Mode section), so integrations relying on markdown extraction will break.
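The preserved gate can be sketched as a type-predicate check over the full mode list. The five mode names come from the error message in the snippet above; the helper itself is hypothetical.

```typescript
// Sketch: "markdown" stays a valid mode alongside the structural modes, and
// the predicate narrows the string type for downstream handling.
const VALID_MODES = ["ax", "dom", "css", "semantic", "markdown"] as const;
type ReadPageMode = (typeof VALID_MODES)[number];

function isValidMode(mode: string): mode is ReadPageMode {
  return (VALID_MODES as readonly string[]).includes(mode);
}
```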

Useful? React with 👍 / 👎.

Comment thread src/tools/read-page.ts
Comment on lines +551 to +553
return {
content: [{ type: 'text', text: statsLine + delta.content + domPaginationSection }],
};

P1 Badge Restore _compression metadata on DOM delta responses

When compression === "delta" and a delta is returned, this branch now emits only content and drops _compression fields (level, originalChars, compressedChars). Downstream observability and consumers that distinguish delta/full outputs from _compression lose that signal, and the existing read_page compression-savings expectations (see tests/metrics/tool-output-observability.test) no longer hold.

Useful? React with 👍 / 👎.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b3522a9148


Comment thread src/tools/crawl.ts
Comment on lines 571 to 574
const batchResults = await Promise.all(
batch.map((item) =>
limiter(async () => {
const runFetch = async () => {
// Check robots.txt before fetching

P2 Badge Restore adaptive dispatcher handling in crawl

This batch execution path now always uses the fixed limiter and never routes through the adaptive dispatcher, so dispatcher: "adaptive" / dispatcher_options requests are effectively ignored. That regresses crawl behavior on rate-limited or memory-pressured targets (the exact cases adaptive mode was designed for in src/core/crawl/dispatcher.ts) and removes the expected adaptive throttling semantics for existing callers.

Useful? React with 👍 / 👎.

Comment thread src/tools/read-page.ts
const output = outputLines.join('\n');
const outputCharCount = output.length;
diagnostics.formatMs = mark() - formatStart;
const output = lines.join('\n');

P2 Badge Honor compact AX output requests

AX responses are now always built from the full lines array, so compact: true no longer changes output. Callers that rely on compact mode to suppress non-actionable leaves now receive much larger trees, which increases truncation risk and latency on complex pages.
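The compact behavior the review describes can be illustrated as a filter over actionable lines. This assumes each rendered AX line carries an actionability flag, which is a simplification of whatever the project actually tracks.

```typescript
// Sketch: compact mode suppresses non-actionable leaves before joining.
// The AxLine shape and `actionable` flag are assumptions for illustration.
interface AxLine {
  text: string;
  actionable: boolean;
}

function renderAx(lines: AxLine[], compact: boolean): string {
  const kept = compact ? lines.filter((l) => l.actionable) : lines;
  return kept.map((l) => l.text).join("\n");
}
```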

Useful? React with 👍 / 👎.

Comment thread src/tools/read-page.ts
Comment on lines +846 to +848
return {
content: [{ type: 'text', text: pageStatsLine + output + axPaginationSection }],
refs: refsMap,
});
};

P2 Badge Keep read_page diagnostics output when requested

The handler now returns plain content here without attaching any _diagnostics, and the function no longer tracks timing fields at all, so diagnostics: true callers cannot get mode/timing metadata. This is a regression for tooling that depends on those fields to debug DOM vs AX fallback behavior and sanitization overhead.

Useful? React with 👍 / 👎.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 73adb34f96


Comment thread src/tools/crawl.ts Outdated
Comment on lines +531 to +533
// Seed the BFS queue with the start URL
const normalizedStart = normalizeUrl(url);
enqueueItems([{ url: normalizedStart, depth: 0 }]);
tracker.enqueue([{ url: normalizedStart, depth: 0 }]);

P1 Badge Restore best_first crawl strategy handling

This path hard-codes BFS queue seeding/processing and no longer honors strategy: "best_first", so callers that previously depended on URL scoring now get FIFO traversal instead. Under max_pages limits this can materially change which pages are returned (high-priority URLs may never be visited), which is a behavior regression for existing crawl integrations.

Useful? React with 👍 / 👎.

Comment thread src/tools/read-page.ts
Comment on lines +523 to +527
const result = await serializeDOM(page, cdpClient, {
maxDepth: depth ?? -1,
filter: filter,
interactiveOnly: filter === 'interactive',
planningProfile,
}));
diagnostics.formatMs = diagnostics.domGetDocumentMs;
});

P2 Badge Pass planningProfile through to DOM serialization

The DOM path now calls serializeDOM without forwarding args.planningProfile, which forces the serializer default profile even when callers request planningProfile: "stable". That silently disables the stable/noise-reduced output mode and can increase output churn and token usage for workflows that rely on deterministic stable snapshots.

Useful? React with 👍 / 👎.

Comment thread src/tools/read-page.ts
Comment on lines +564 to +566
return {
content: [{ type: 'text', text: outputText + domPaginationSection }],
};

P2 Badge Reattach node_refs block to DOM responses

This return now emits only DOM text plus pagination, dropping the [node_refs] mapping that was previously attached from emitted backend node IDs. Clients using DOM-mode output to recover stable node references lose that mapping and can no longer resolve those refs from the same response.

Useful? React with 👍 / 👎.

@shaun0927
Owner Author

Deferred: the PR description specifies "this PR should merge after #1077". #1077 has fix commits pushed (370cf98 — markdown-clean restored) but is still blocked on develop CI signal (PR #1105 in flight). Once #1077 merges to develop, this PR can either be re-targeted to develop directly or merged forward into the consolidated feat/990-token-metrics base. The PR is content-clean (5 files, +168/-696 — mostly trimming, 0 Codex P0/P1/P2 outstanding).

@shaun0927
Owner Author

Merge rationale (stack consolidation)

Intent. Refs #981 — extends the open #1077 include_metrics pattern to inspect, the focused high-frequency page-state tool.

Why this is correct.

CI. Targets feat/990-token-metrics (the #1077 head); workflow only runs on main/develop PRs. CI will run on the consolidated stack root when #1077 lands on develop.

@shaun0927 shaun0927 merged commit acafdb6 into feat/990-token-metrics May 13, 2026
shaun0927 pushed a commit that referenced this pull request May 13, 2026
The merge of PR #1100 (inspect metrics) into feat/990-token-metrics
dropped the readPageHandlerForReuse export from read-page.ts, which
src/tools/_shared/return-after-state.ts (#845 plumbing on develop)
imports. Restoring the export fixes the TS2305 build break that
followed the develop merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>