Add lightweight output-size and token-estimate metrics to high-volume read/crawl tools so OpenChrome can prove and regress-test token optimization claims at the tool-response level.
Why
OpenChrome already claims strong token compression through DOM mode, snapshot delta, adaptive screenshot, and compact summaries. However, most tool responses do not expose consistent per-call metrics such as raw chars, returned chars, estimated tokens, truncation, compression mode, cache status, or filter reduction. Crawl4AI exposes token usage concepts in extraction flows; OpenChrome should expose deterministic local metrics for its own optimization layer.
This helps agents choose narrower tools and helps maintainers catch regressions after merge.
Non-goals
Do not call tokenizer services or LLM APIs.
Do not make exact provider-token guarantees.
Do not change default response content.
Do not add metrics to every tool in the first PR.
Scope
Initial tools:
read_page
crawl
crawl_sitemap
validate_page
extract_data is deliberately deferred unless its current JSON response path can expose metrics without changing legacy output. This keeps the first PR focused on high-volume page/content reads.
Add a shared helper:
estimateTokens(text: string): number
Use a documented approximation such as Math.ceil(chars / 4) for English/ASCII-heavy content and label it estimated_tokens, not exact tokens.
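As a sketch, the shared helper could look like the following (assuming a TypeScript codebase; the exact file location is not specified by this issue):

```typescript
// Hedged sketch of the shared estimator. The chars/4 heuristic is the
// documented approximation for English/ASCII-heavy content; report the
// result as estimated_tokens, never as provider-exact tokens.
export function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}
```

Because the value is an estimate, tests should assert determinism (same input, same output) rather than agreement with any provider's tokenizer.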
Metrics are never computed from unsanitized, secret-bearing intermediate strings when the sanitizer would remove content; returned metrics always describe the returned (sanitized) payload.
Acceptance criteria
read_page(mode: "dom") reports returned_chars, estimated_tokens, mode, and truncated when include_metrics: true is passed, or when structuredContent is available via #871 (feat(core): outputSchema + structuredContent for Tier-1 read tools, mcp-servers adoption 5).
crawl and crawl_sitemap report page-level and summary-level metrics for output chars and truncation when include_metrics: true; markdown metrics are derived from fit_markdown.length and raw_markdown.length.
Existing clients that read only content[0].text see byte-identical output unless include_metrics: true is passed.
A benchmark or test fixture asserts that the DOM-mode compression ratio does not regress below a documented threshold for at least one committed fixture.
npm run build && npm test && npm run lint && npm run lint:tier pass.
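As a hedged illustration of such a fixture gate (the threshold and character counts below are example values, not the project's actual documented numbers):

```typescript
// Illustrative fixture gate; DOM_MIN_RATIO and the char counts are
// example values, not the real documented threshold or fixture sizes.
const DOM_MIN_RATIO = 10;

function compressionRatio(rawChars: number, returnedChars: number): number {
  return returnedChars === 0 ? Number.POSITIVE_INFINITY : rawChars / returnedChars;
}

// In a committed-fixture test these would come from running DOM mode
// against the checked-in fixture page.
const fixtureRawChars = 180_000;
const fixtureReturnedChars = 12_000;
if (compressionRatio(fixtureRawChars, fixtureReturnedChars) < DOM_MIN_RATIO) {
  throw new Error("DOM-mode compression regressed below documented threshold");
}
```

Gating on committed fixtures keeps the check deterministic, which is why the live-smoke steps below deliberately avoid ratio assertions.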
OpenChrome real verification after merge
navigate({ "url": "https://example.com" }) -> tabId
read_page({ "tabId": tabId, "mode": "dom", "include_metrics": true }): pass if metrics report returned_chars > 0, estimated_tokens > 0, mode: "dom", and truncated: false.
read_page({ "tabId": tabId, "mode": "ax", "include_metrics": true }) versus read_page({ "tabId": tabId, "mode": "dom", "include_metrics": true }): pass if both include metrics. Compression-ratio comparisons are gated only by committed fixture tests, not this live smoke, because public pages can change.
crawl({ "url": "https://example.com", "max_pages": 1, "output_format": "text", "include_metrics": true }): pass if the summary includes total returned chars/tokens and the page includes per-page metrics.
Regression check:
Run a legacy read_page without include_metrics. Pass if content remains byte-identical to the pre-change fixture snapshot.
Self-review tightening
Exact tokenizer compatibility is out of scope; the field is named estimated_tokens to avoid overclaiming.
extract_data was deferred from the initial scope unless it can be made strictly additive.
Legacy byte parity is mandatory when include_metrics is omitted.
Any strict compression-ratio gate must use committed fixtures, not public live pages.
Fit with OpenChrome direction
This directly supports OpenChrome's token-efficiency positioning and gives future optimization PRs measurable acceptance gates. It is low-risk because it is mostly additive and opt-in for legacy text responses.
Keep metrics optional/metadata-only and avoid bloating tool responses.
Implementation checklist
Add deterministic metrics fields such as raw chars, returned chars, estimated tokens, truncation, compression mode, cache status, and filter reduction for initial high-volume tools like read_page/crawl.
Use local approximate token estimates only; no network or model tokenizer calls.
Expose metrics in a compact metadata field or optional verbose mode without changing primary content.
Add tests for metrics presence, estimate determinism, truncation reporting, cache/compression labels, and default-content compatibility.
Document how maintainers should use metrics to detect token regressions.
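The checklist above could translate into a small shared shape along these lines (a sketch: field names follow the example payload in this issue, while the interface and helper names are hypothetical):

```typescript
// Hedged sketch of the proposed per-call metrics; everything is computed
// locally and deterministically from the strings already in hand.
interface ToolMetrics {
  raw_chars: number;
  returned_chars: number;
  estimated_tokens: number;
  raw_estimated_tokens: number;
  compression_ratio: number;
  truncated: boolean;
  mode?: string;          // e.g. "dom"
  filter?: string;        // e.g. "prune"
  cache_status?: "hit" | "miss";
}

function buildMetrics(
  raw: string,
  returned: string,
  opts: { mode?: string; truncated?: boolean } = {}
): ToolMetrics {
  const est = (s: string) => Math.ceil(s.length / 4); // documented approximation
  return {
    raw_chars: raw.length,
    returned_chars: returned.length,
    estimated_tokens: est(returned),
    raw_estimated_tokens: est(raw),
    compression_ratio: returned.length
      ? +(raw.length / returned.length).toFixed(1)
      : 0,
    truncated: opts.truncated ?? false,
    mode: opts.mode,
  };
}
```

Because buildMetrics only reads lengths of strings the tool already produced, it adds no network calls and stays deterministic for regression tests.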
Success criteria
High-volume read tool responses include consistent local size/token metrics when enabled or in metadata.
Metrics are deterministic enough for regression tests but do not claim provider-exact billing.
Default response content remains compatible.
Maintainers can compare compression modes before/after changes.
Post-merge OpenChrome live verification checklist
Run read_page/crawl on a local large fixture and verify metrics include raw/returned chars, estimated tokens, truncation, and compression mode.
Run a cached/repeated call if supported and verify cache status is reported.
Compare a compact vs full mode response and record metric deltas.
Include sanitized tool response metadata and fixture path in merge notes.
Proposed response metadata
Where response shape allows JSON metadata, add:
```json
{
  "metrics": {
    "raw_chars": 180000,
    "returned_chars": 12000,
    "estimated_tokens": 3000,
    "raw_estimated_tokens": 45000,
    "compression_ratio": 15.0,
    "truncated": false,
    "mode": "dom",
    "filter": "prune",
    "cache_status": "hit"
  }
}
```

For plain text outputs, append a compact machine-readable footer only when an explicit option is set:
```json
{ "include_metrics": true }
```

Default behavior should avoid surprising legacy clients unless the tool already returns JSON.
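A minimal sketch of that opt-in path (the function name withMetricsFooter is hypothetical): the legacy branch returns the text untouched, so byte parity holds by construction.

```typescript
// Hypothetical sketch: the metrics footer is strictly opt-in.
// Legacy callers (includeMetrics omitted or false) get the exact original text.
function withMetricsFooter(
  text: string,
  metrics: Record<string, unknown>,
  includeMetrics = false
): string {
  if (!includeMetrics) return text; // byte-identical legacy path
  return `${text}\n<!-- metrics: ${JSON.stringify(metrics)} -->`;
}
```

Structuring the code so the legacy path never touches the payload makes the byte-parity acceptance criterion trivial to test.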
Related
Curated scope, overlap handling, and verification checklist
Scope classification
Implementation work is already underway on an existing branch (feat/990-token-metrics). Continue there; do not create duplicate implementation work.