feat(inspect): add opt-in output token metrics (#981) #1100
Conversation
Adds provider-neutral estimated token and output-size metrics for crawl, crawl_sitemap, read_page, and validate_page so optimization claims can be verified without changing legacy responses by default.

Constraint: Metrics must be opt-in for text surfaces and approximate rather than provider-specific.
Rejected: Exact tokenizer integration | would add dependency and overstate provider compatibility.
Confidence: high
Scope-risk: narrow
Directive: Keep strict compression-ratio gates on committed fixtures, not public live pages.
Tested: npm run build -- --pretty false; npm test -- --runInBand tests/core/metrics/token-estimate.test.ts tests/core/tools/crawl.engine.test.ts; npm run lint:changed
Not-tested: Full npm test suite; live OpenChrome smoke.
Constraint: #1077 already covers read_page/crawl/validate metrics, so this stacked slice only adds inspect support on top of that open metrics branch.
Rejected: Reimplementing #981 read_page response telemetry on develop | would duplicate open PRs #1063 and #1077.
Confidence: high
Scope-risk: narrow
Directive: Keep include_metrics opt-in and provider-neutral; do not imply exact LLM billing.
Tested: npm test -- --runInBand tests/tools/inspect-metrics.test.ts; npm run build; npm run lint:changed; git diff --check
Not-tested: Live OpenChrome MCP smoke; dependency npm audit still reports existing 6 vulnerabilities.
Force-pushed from beb6b89 to 1225b20
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a0093361c3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```
} catch {
  // DOM serialization failed — fall through to AX mode as fallback
}
```
Return DOM-mode errors instead of silently falling back
This catch swallows DOM serialization failures and then falls through to the AX code path, so a request with mode: "dom" can return AX output instead of an error. That changes the contract for explicit DOM callers (they can no longer distinguish DOM failure from successful DOM output) and can break downstream parsers that rely on DOM-mode formatting.
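A minimal sketch of the fix this comment suggests, assuming the handler can distinguish an explicit `mode: "dom"` request from auto-selection; the helper and result shape here are illustrative, not the project's actual API:

```typescript
// Hypothetical helper: surface DOM serialization failures to explicit DOM
// callers instead of silently falling through to the AX path.
type ToolResult = { content: { type: 'text'; text: string }[]; isError?: boolean };

function handleDomFailure(mode: string, err: unknown): ToolResult | null {
  if (mode === 'dom') {
    // Explicit DOM request: report the failure so callers can tell it apart
    // from successful DOM output.
    const message = err instanceof Error ? err.message : String(err);
    return {
      content: [{ type: 'text', text: `Error: DOM serialization failed: ${message}` }],
      isError: true,
    };
  }
  // Auto / fallback-tolerant modes: signal "not handled" so the caller may
  // proceed to the AX path as before.
  return null;
}
```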
```
if (config.security?.sanitize_content === false || result.isError || !result.content) {
  return result;
```
Honor include_metrics when sanitization is disabled
With this early return, include_metrics: true is ignored whenever sanitize_content is disabled, because metrics are appended only later in the sanitization branch. In environments that run with --no-sanitize-content, callers now get no metrics despite explicitly requesting them.
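One way to address this, sketched under the assumption that the metrics footer can be appended independently of the sanitize branch; the footer marker follows this PR's checklist, but the wiring and helper names are hypothetical:

```typescript
// Append the opt-in metrics footer regardless of whether sanitization ran.
interface TextMetrics { returned_chars: number; estimated_tokens: number; }

function buildTextMetrics(text: string): TextMetrics {
  // chars/4 heuristic: provider-neutral estimate, not billing.
  return { returned_chars: text.length, estimated_tokens: Math.ceil(text.length / 4) };
}

function withOptionalMetrics(text: string, includeMetrics: boolean): string {
  if (!includeMetrics) return text; // default output stays byte-for-byte unchanged
  return `${text}\n[openchrome_metrics] ${JSON.stringify(buildTextMetrics(text))}`;
}
```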
```diff
-const maxDepth = filter === 'interactive'
-  ? Math.min(requestedDepth ?? defaultDepth, defaultDepth)
-  : requestedDepth ?? defaultDepth;
+const maxDepth = (args.depth as number) || defaultDepth;
```
Preserve interactive depth cap in AX tree requests
Using (args.depth as number) || defaultDepth removes the prior interactive cap and also treats depth: 0 as unset. For filter: "interactive", callers can now request arbitrarily deep AX trees (e.g., depth 10+) instead of being bounded to the documented interactive default cap, which increases output size and runtime risk.
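The prior behaviour quoted in the diff can be factored into a small helper; this sketch restates the removed expression (`??` preserves an explicit `depth: 0`, and interactive requests stay bounded by the default cap):

```typescript
function resolveMaxDepth(
  requestedDepth: number | undefined,
  defaultDepth: number,
  filter: string,
): number {
  // `??` (not `||`) so an explicit depth of 0 is honored rather than treated
  // as unset; interactive trees stay capped at the documented default.
  return filter === 'interactive'
    ? Math.min(requestedDepth ?? defaultDepth, defaultDepth)
    : requestedDepth ?? defaultDepth;
}
```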
```diff
-  }
-  : { summary, pages: minimalPages, note: 'Content omitted due to size constraints' };
-outputJson = JSON.stringify(minimalOutput, null, 2);
+outputJson = JSON.stringify({ summary, pages: minimalPages, note: 'Content omitted due to size constraints' }, null, 2);
```
Keep include_metrics data in crawl size-fallback output
When crawl output exceeds MAX_OUTPUT_CHARS twice, this branch emits a minimal payload that omits all metrics even if include_metrics was requested. That makes include_metrics behavior inconsistent specifically on large crawls, where size/token estimates are most valuable and expected by clients.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a5ee0324c2
```
async function fetchPage(
  sessionId: string,
  url: string,
```
Restore fetchOnePage exports consumed by crawl runner
This change removes the exported fetchOnePage, FetchOnePageOptions, and FetchOnePageResult API from src/tools/crawl.ts, but src/core/crawl/runner.ts still imports those symbols (fetchOnePage as defaultFetchOnePage plus the two types). In this commit state, TypeScript compilation fails with missing-export errors, which blocks build/deploy for crawl job tooling.
```
if (includeMetrics) {
  cleaned['_metrics'] = buildTextMetrics(JSON.stringify(cleaned), { mode: modeForMetrics });
}
return { ...block, text: JSON.stringify(cleaned) };
```
Compute semantic metrics from the final serialized JSON
When mode: "semantic" and include_metrics: true, metrics are calculated from JSON.stringify(cleaned) before _metrics is inserted, then the payload is stringified again with _metrics present. This guarantees returned_chars/estimated_tokens can be lower than the emitted text length, so callers receive inconsistent metrics specifically for semantic responses.
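One hedged way to make the reported numbers match the emitted text is a small fixed-point loop: embed `_metrics`, re-measure the resulting string, and repeat until the length stabilizes. The field names and chars/4 estimate follow this PR's conventions; the loop itself is an illustration, not the project's implementation:

```typescript
function serializeWithMetrics(cleaned: Record<string, unknown>): string {
  let text = JSON.stringify(cleaned);
  for (let i = 0; i < 5; i++) {
    // Measure the current candidate, embed the measurement, and re-serialize.
    const metrics = {
      returned_chars: text.length,
      estimated_tokens: Math.ceil(text.length / 4),
    };
    const next = JSON.stringify({ ...cleaned, _metrics: metrics });
    // Converged: the embedded length equals the emitted length.
    if (next.length === text.length) return next;
    text = next;
  }
  return text; // digit counts stabilize within a few iterations in practice
}
```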
```
return {
  content: [{ type: 'text', text: pageStatsLine + output + axPaginationSection }],
```
Preserve AX refs map in read_page responses
The AX-mode return path now emits only content and drops the refs object entirely, even though the AX walk still mints ref IDs and downstream consumers/tests rely on result.refs metadata (role/name/TTL) to resolve stable element references. This is a response-shape regression for read_page(mode='ax') callers that expect the structured refs payload.
```
const output = lines.join('\n');
const includePaginationAx = args.includePagination !== false;
const axPaginationSection = includePaginationAx ? formatPaginationSection(await detectPagination(page, tabId)) : '';
```
Honor AX delta compression requests
This path now always joins and returns the full AX snapshot without checking args.compression === 'delta', so clients requesting AX delta mode receive full trees on every call. That regresses the existing compression contract and can significantly increase output size and latency on repeated reads of large pages.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 961c01b563
```
export function registerReadPageTool(server: MCPServer): void {
  server.registerTool('read_page', sanitizedHandler, definition);
}
```
Re-export readPageHandlerForReuse
This file no longer exports readPageHandlerForReuse, but src/tools/_shared/return-after-state.ts still imports and calls that symbol (import at line 16, call at line 96). In this commit state, TypeScript compilation fails with a missing-export error, which blocks builds and breaks every tool path that includes return-after-state chaining.
```diff
 if (mode !== 'ax' && mode !== 'dom' && mode !== 'css' && mode !== 'semantic') {
   return {
-    content: [{ type: 'text', text: `Error: Invalid mode "${mode}". Must be "ax", "dom", "css", "semantic", or "markdown".` }],
+    content: [{ type: 'text', text: `Error: Invalid mode "${mode}". Must be "ax", "dom", "css", or "semantic".` }],
     isError: true,
```
Keep markdown mode accepted in read_page
The mode gate now rejects mode: "markdown", so existing callers receive an error instead of markdown output. This is a behavior regression for an already-supported mode (including onlyMainContent/includeLinks options), and it directly conflicts with current markdown contract coverage in tests/tools/read-page.test (Markdown Mode section), so integrations relying on markdown extraction will break.
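A small sketch of the restored gate, with the mode names taken from the error message above; the type-guard shape is illustrative rather than the project's actual code:

```typescript
const SUPPORTED_MODES = ['ax', 'dom', 'css', 'semantic', 'markdown'] as const;
type ReadPageMode = (typeof SUPPORTED_MODES)[number];

function isSupportedMode(mode: string): mode is ReadPageMode {
  // 'markdown' stays accepted so onlyMainContent/includeLinks callers keep working.
  return (SUPPORTED_MODES as readonly string[]).includes(mode);
}
```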
```
return {
  content: [{ type: 'text', text: statsLine + delta.content + domPaginationSection }],
};
```
Restore _compression metadata on DOM delta responses
When compression === "delta" and a delta is returned, this branch now emits only content and drops _compression fields (level, originalChars, compressedChars). Downstream observability and consumers that distinguish delta/full outputs from _compression lose that signal, and the existing read_page compression-savings expectations (see tests/metrics/tool-output-observability.test) no longer hold.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b3522a9148
```
const batchResults = await Promise.all(
  batch.map((item) =>
    limiter(async () => {
      const runFetch = async () => {
        // Check robots.txt before fetching
```
Restore adaptive dispatcher handling in crawl
This batch execution path now always uses the fixed limiter and never routes through the adaptive dispatcher, so dispatcher: "adaptive" / dispatcher_options requests are effectively ignored. That regresses crawl behavior on rate-limited or memory-pressured targets (the exact cases adaptive mode was designed for in src/core/crawl/dispatcher.ts) and removes the expected adaptive throttling semantics for existing callers.
```diff
-const output = outputLines.join('\n');
-const outputCharCount = output.length;
-diagnostics.formatMs = mark() - formatStart;
+const output = lines.join('\n');
```
Honor compact AX output requests
AX responses are now always built from the full lines array, so compact: true no longer changes output. Callers that rely on compact mode to suppress non-actionable leaves now receive much larger trees, which increases truncation risk and latency on complex pages.
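The compact behaviour this comment describes could look roughly like the following; the actionable-role heuristic and the AX line shapes are assumptions for illustration only, not the serializer's real format:

```typescript
function buildAxOutput(lines: string[], compact: boolean): string {
  if (!compact) return lines.join('\n');
  // Compact mode: keep actionable nodes and anything carrying a ref id,
  // dropping non-actionable leaves to shrink the tree.
  const actionable = /\b(button|link|textbox|checkbox|combobox)\b/;
  return lines
    .filter((line) => actionable.test(line) || line.includes('[ref='))
    .join('\n');
}
```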
```
return {
  content: [{ type: 'text', text: pageStatsLine + output + axPaginationSection }],
  refs: refsMap,
});
};
```
Keep read_page diagnostics output when requested
The handler now returns plain content here without attaching any _diagnostics, and the function no longer tracks timing fields at all, so diagnostics: true callers cannot get mode/timing metadata. This is a regression for tooling that depends on those fields to debug DOM vs AX fallback behavior and sanitization overhead.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 73adb34f96
```
// Seed the BFS queue with the start URL
const normalizedStart = normalizeUrl(url);
enqueueItems([{ url: normalizedStart, depth: 0 }]);
tracker.enqueue([{ url: normalizedStart, depth: 0 }]);
```
Restore best_first crawl strategy handling
This path hard-codes BFS queue seeding/processing and no longer honors strategy: "best_first", so callers that previously depended on URL scoring now get FIFO traversal instead. Under max_pages limits this can materially change which pages are returned (high-priority URLs may never be visited), which is a behavior regression for existing crawl integrations.
```
const result = await serializeDOM(page, cdpClient, {
  maxDepth: depth ?? -1,
  filter: filter,
  interactiveOnly: filter === 'interactive',
  planningProfile,
}));
diagnostics.formatMs = diagnostics.domGetDocumentMs;
});
```
Pass planningProfile through to DOM serialization
The DOM path now calls serializeDOM without forwarding args.planningProfile, which forces the serializer default profile even when callers request planningProfile: "stable". That silently disables the stable/noise-reduced output mode and can increase output churn and token usage for workflows that rely on deterministic stable snapshots.
```
return {
  content: [{ type: 'text', text: outputText + domPaginationSection }],
};
```
Reattach node_refs block to DOM responses
This return now emits only DOM text plus pagination, dropping the [node_refs] mapping that was previously attached from emitted backend node IDs. Clients using DOM-mode output to recover stable node references lose that mapping and can no longer resolve those refs from the same response.
Deferred: the PR description specifies "this PR should merge after #1077". #1077 has fix commits pushed (370cf98 — markdown-clean restored) but is still blocked on develop CI signal (PR #1105 in flight). Once #1077 merges to
# Conflicts:
#   src/tools/crawl.ts
Merge rationale (stack consolidation)

Intent. Refs #981 — extends the open #1077

Why this is correct.

CI. Targets
The merge of PR #1100 (inspect metrics) into feat/990-token-metrics dropped the readPageHandlerForReuse export from read-page.ts, which src/tools/_shared/return-after-state.ts (#845 plumbing on develop) imports. Restoring the export fixes the TS2305 build break that followed the develop merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Progress / Review status
Auto-refreshed 2026-05-13 — owner comments cleaned up to reduce review noise.
feat/981-inspect-metrics → feat/990-token-metrics (33d0204) — Expose focused inspect metrics without changing defaults

Owner comment cleanup: 0 issue + 0 inline review comments deleted. Outstanding feedback from automated/external reviewers above is unchanged.
Summary
- Extends the opt-in `include_metrics` pattern to `inspect`, the focused high-frequency page-state tool called out by feat(obs): payload budget telemetry for browser tool outputs (browser-use adoption G4) #981.
- Keeps the `inspect` default output byte-for-byte free of metrics unless callers pass `include_metrics: true`.
- Reuses the `ceil(chars / 4)` helper from feat(core): token and compression metrics for high-volume read tools (#990) #1077 and labels the result as estimated tokens, not billing.

Refs #981. Stacked on #1077; this PR should merge after #1077.
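The helper referenced in the summary reduces to a one-line heuristic; sketched here for reference, with the PR's own caveat that it is an estimate, not provider billing:

```typescript
// chars/4 heuristic from #1077: provider-neutral and deliberately approximate.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}
```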
Direction / non-overlap review
#1077 already covers `read_page`, `crawl`, `crawl_sitemap`, and `validate_page`; this PR only adds `inspect` on top of that branch.

Verification
- `npm test -- --runInBand tests/tools/inspect-metrics.test.ts`
- `npm run build`
- `npm run lint:changed`
- `git diff --check`

Post-merge OpenChrome validation checklist
After #1077 and this PR merge:
- `navigate` to `https://github.com/browser-use/browser-use`
- `inspect` with `{ "query": "repository title, stars, latest visible release link" }`
- Repeat the `inspect` call with `{ "include_metrics": true }`
- Confirm the default `inspect` output has no `[openchrome_metrics]` footer.
- Confirm the opt-in output ends with an `[openchrome_metrics]` JSON footer.
- Confirm `returned_chars` equals the body length before the footer.
- Confirm `estimated_tokens === Math.ceil(returned_chars / 4)`.
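The last three checklist items could be automated with a small verifier; the footer layout (an `[openchrome_metrics]` marker followed by a JSON object) is assumed from this PR's description:

```typescript
function verifyMetricsFooter(output: string): boolean {
  const marker = '[openchrome_metrics]';
  const idx = output.indexOf(marker);
  if (idx === -1) return false; // opt-in footer missing
  const body = output.slice(0, idx);
  const metrics = JSON.parse(output.slice(idx + marker.length));
  // Invariants from the checklist: chars match the body, tokens match ceil(chars/4).
  return (
    metrics.returned_chars === body.length &&
    metrics.estimated_tokens === Math.ceil(metrics.returned_chars / 4)
  );
}
```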