
feat(core): token and compression metrics for high-volume read tools #990

@shaun0927

Description


Summary

Add lightweight output-size and token-estimate metrics to high-volume read/crawl tools so OpenChrome can prove and regress-test token optimization claims at the tool-response level.

Why

OpenChrome already claims strong token compression through DOM mode, snapshot delta, adaptive screenshot, and compact summaries. However, most tool responses do not expose consistent per-call metrics such as raw chars, returned chars, estimated tokens, truncation, compression mode, cache status, or filter reduction. Crawl4AI exposes token usage concepts in extraction flows; OpenChrome should expose deterministic local metrics for its own optimization layer.

This helps agents choose narrower tools and helps maintainers catch regressions after merge.

Non-goals

  • Do not call tokenizer services or LLM APIs.
  • Do not make exact provider-token guarantees.
  • Do not change default response content.
  • Do not add metrics to every tool in the first PR.

Scope

Initial tools:

  • read_page
  • crawl
  • crawl_sitemap
  • validate_page

extract_data is deliberately deferred unless its current JSON response path can expose metrics without changing legacy output. This keeps the first PR focused on high-volume page/content reads.

Add a shared helper:

estimateTokens(text: string): number

Use a documented approximation such as Math.ceil(chars / 4) for English/ASCII-heavy content and label it estimated_tokens, not exact tokens.
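A minimal sketch of the shared helper, assuming the documented `Math.ceil(chars / 4)` approximation; the separate CJK count is an illustrative assumption (CJK text tokenizes closer to one token per character), not a committed design:

```typescript
// Hypothetical sketch of estimateTokens; heuristic constants are assumptions.
export function estimateTokens(text: string): number {
  if (text.length === 0) return 0;
  // CJK codepoints tokenize closer to ~1 token/char; counting them
  // separately avoids a large underestimate on CJK-heavy pages.
  const cjk = (text.match(/[\u3000-\u9fff\uac00-\ud7af]/g) ?? []).length;
  const ascii = text.length - cjk;
  // ~4 chars per token for English/ASCII-heavy content.
  return Math.ceil(ascii / 4) + cjk;
}
```

Because the result is a pure function of the input string, it is deterministic and safe to assert against in regression fixtures.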

Proposed response metadata

Where response shape allows JSON metadata, add:

{
  "metrics": {
    "raw_chars": 180000,
    "returned_chars": 12000,
    "estimated_tokens": 3000,
    "raw_estimated_tokens": 45000,
    "compression_ratio": 15.0,
    "truncated": false,
    "mode": "dom",
    "filter": "prune",
    "cache_status": "hit"
  }
}
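An illustrative TypeScript shape for that object; field names mirror the JSON example above, while the choice of which fields are optional is an assumption:

```typescript
// Hypothetical type for the proposed per-call metrics; not a committed API.
export interface ToolCallMetrics {
  raw_chars: number;
  returned_chars: number;
  estimated_tokens: number;
  raw_estimated_tokens?: number;
  compression_ratio?: number; // raw_chars / returned_chars, rounded
  truncated: boolean;
  mode?: string;              // e.g. "dom" or "ax"
  filter?: string;            // e.g. "prune" when fit_markdown is active
  cache_status?: "hit" | "miss";
}
```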

For plain text outputs, append a compact machine-readable footer only when an explicit option is set:

{ "include_metrics": true }

Default behavior should avoid surprising legacy clients unless the tool already returns JSON.
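A sketch of the opt-in footer path, assuming an HTML-comment-style footer format (the exact marker is hypothetical); the key property is that omitting the option returns the input unchanged:

```typescript
// Hypothetical helper: append a machine-readable metrics footer to a
// plain-text response only when include_metrics is explicitly set.
function withMetricsFooter(
  text: string,
  metrics: Record<string, unknown>,
  includeMetrics: boolean
): string {
  if (!includeMetrics) return text; // legacy output stays byte-identical
  return `${text}\n<!-- metrics: ${JSON.stringify(metrics)} -->`;
}
```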

Acceptance criteria

  • Shared metrics helper is unit-tested for empty strings, ASCII text, CJK text, and large strings.
  • read_page(mode: "dom") reports returned_chars, estimated_tokens, mode, and truncated when include_metrics: true, or when structuredContent is available via #871 (feat(core): outputSchema + structuredContent for Tier-1 read tools, mcp-servers adoption 5).
  • crawl and crawl_sitemap report page-level and summary-level metrics for output chars and truncation when include_metrics: true.
  • If #980 (feat(core): fit_markdown content filters for markdown read/crawl outputs) has landed, filter metrics are included and are consistent with fit_markdown.length and raw_markdown.length.
  • Metrics are never computed from unsanitized, secret-bearing intermediate strings when the sanitizer would remove content; returned metrics always describe the returned, sanitized payload.
  • Existing clients that read only content[0].text see byte-identical output unless include_metrics: true is passed.
  • Benchmark or test fixture asserts DOM mode compression ratio does not regress below a documented threshold for at least one committed fixture.
  • npm run build && npm test && npm run lint && npm run lint:tier pass.

OpenChrome real verification after merge

  1. Navigate to a medium page:
    navigate({ "url": "https://example.com" }) -> tabId
  2. Read DOM with metrics:
    read_page({ "tabId": tabId, "mode": "dom", "include_metrics": true })
    Pass if metrics include returned_chars > 0, estimated_tokens > 0, mode: "dom", and truncated: false.
  3. Compare AX vs DOM:
    read_page({ "tabId": tabId, "mode": "ax", "include_metrics": true })
    read_page({ "tabId": tabId, "mode": "dom", "include_metrics": true })
    Pass if both include metrics. Compression ratio comparisons are gated only by committed fixture tests, not this live smoke, because public pages can change.
  4. Crawl metrics:
    crawl({ "url": "https://example.com", "max_pages": 1, "output_format": "text", "include_metrics": true })
    Pass if summary includes total returned chars/tokens and page includes per-page metrics.
  5. Regression check:
    Run a legacy read_page without include_metrics. Pass if content remains byte-identical to the pre-change fixture snapshot.

Self-review tightening

  • Exact tokenizer compatibility is out of scope; the field is named estimated_tokens to avoid overclaiming.
  • extract_data was deferred from the initial scope unless it can be made strictly additive.
  • Legacy byte parity is mandatory when include_metrics is omitted.
  • Any strict compression-ratio gate must use committed fixtures, not public live pages.

Fit with OpenChrome direction

This directly supports OpenChrome's token-efficiency positioning and gives future optimization PRs measurable acceptance gates. It is low-risk because it is mostly additive and opt-in for legacy text responses.

Related

Curated scope, overlap handling, and verification checklist

Scope classification

  • Canonical lane: token/compression observability metrics.
  • Primary deliverable: lightweight per-call output-size and token-estimate metrics for high-volume read/crawl tools.
  • Open PR: #1077 (feat/990-token-metrics), implementing this issue (feat(core): token and compression metrics for high-volume read tools, #990). Continue there; do not create duplicate implementation work.
  • Non-goal: calling tokenizer services/LLM APIs, exact provider-token guarantees, changing default response content, or instrumenting every tool in v1.

Overlap and conflict resolution

Implementation checklist

  • Add deterministic metrics fields such as raw chars, returned chars, estimated tokens, truncation, compression mode, cache status, and filter reduction for initial high-volume tools like read_page/crawl.
  • Use local approximate token estimates only; no network or model tokenizer calls.
  • Expose metrics in a compact metadata field or optional verbose mode without changing primary content.
  • Add tests for metrics presence, estimate determinism, truncation reporting, cache/compression labels, and default-content compatibility.
  • Document how maintainers should use metrics to detect token regressions.
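The checklist above can be sketched as a single metrics-assembly step; every name here is illustrative, and the estimate is the documented Math.ceil(chars / 4) approximation:

```typescript
// Hypothetical builder: derive the metrics object from the sanitized,
// returned payload only, never from secret-bearing intermediates.
function buildMetrics(opts: {
  rawChars: number;
  returned: string; // sanitized, returned payload
  truncated: boolean;
  mode: string;
  cacheHit?: boolean;
}) {
  const estimate = (s: string) => Math.ceil(s.length / 4);
  return {
    raw_chars: opts.rawChars,
    returned_chars: opts.returned.length,
    estimated_tokens: estimate(opts.returned),
    compression_ratio: opts.returned.length
      ? Math.round((opts.rawChars / opts.returned.length) * 10) / 10
      : 0,
    truncated: opts.truncated,
    mode: opts.mode,
    cache_status:
      opts.cacheHit === undefined ? undefined : opts.cacheHit ? "hit" : "miss",
  };
}
```

Keeping the builder pure makes the determinism and default-content-compatibility tests in the checklist straightforward.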

Success criteria

  • High-volume read tool responses include consistent local size/token metrics when enabled or in metadata.
  • Metrics are deterministic enough for regression tests but do not claim provider-exact billing.
  • Default response content remains compatible.
  • Maintainers can compare compression modes before/after changes.

Post-merge OpenChrome live verification checklist

  • Run read_page/crawl on a local large fixture and verify metrics include raw/returned chars, estimated tokens, truncation, and compression mode.
  • Run a cached/repeated call if supported and verify cache status is reported.
  • Compare a compact vs full mode response and record metric deltas.
  • Include sanitized tool response metadata and fixture path in merge notes.

Metadata


Labels

P1 (P1 high) · enhancement (new feature or request) · observability · performance (latency, throughput, or resource-use improvement)
