feat(core): unified state header on page-state tool responses (#893)#912
Conversation
Adds a 4-line "Page URL / Title / Mode / Captured At" header to text-mode responses of read_page, page_content, inspect, and validate_page (and a top-level state object on their JSON responses). Default on; env opt-out OPENCHROME_STATE_HEADER=off restores v1.11.0 byte-identical output. No new CDP listeners — uses page.url() / page.title() / Date.now() only. Tests: 11/11 passing (helper unit + byte-parity + 4-line header + cross- tool consistency). Fixture regenerable via REGEN_FIXTURE=1.
ⓘ You've reached your Qodo monthly free-tier limit. Reviews pause until next month — upgrade your plan to continue now, or link your paid account if you already have one. |
There was a problem hiding this comment.
Code Review
This pull request introduces a unified page-state header for tool responses, allowing agents to track page context through a 4-line metadata block or a JSON state object. The implementation includes a shared utility for header management and an environment-based opt-out mechanism. The reviewer feedback focuses on optimizing performance by removing redundant asynchronous calls to page.title() and page.url(), suggesting that these calls be bypassed when the header is disabled or when the metadata is already available in existing result objects.
| import { getSessionManager } from '../session-manager'; | ||
| import { MAX_OUTPUT_CHARS, DEFAULT_NAVIGATION_TIMEOUT_MS } from '../config/defaults'; | ||
| import { withTimeout } from '../utils/with-timeout'; | ||
| import { mergeHeaderJson } from './_shared/state-header'; |
There was a problem hiding this comment.
| const missingWithState = mergeHeaderJson( | ||
| { url: page.url(), title: await page.title(), mode: 'html' as const, capturedAt: Date.now(), tabId }, | ||
| missingBody, | ||
| ); |
There was a problem hiding this comment.
The await page.title() call is performed even if the state header is disabled via environment variables. This adds unnecessary latency (an extra CDP roundtrip) for users who have opted out. Wrap the header construction in a conditional check, similar to the pattern used in validate-page.ts.
const missingWithState = isStateHeaderEnabled()
? mergeHeaderJson(
{ url: page.url(), title: await page.title(), mode: 'html' as const, capturedAt: Date.now(), tabId },
missingBody,
)
: missingBody;| const elementWithState = mergeHeaderJson( | ||
| { url: page.url(), title: await page.title(), mode: 'html' as const, capturedAt: Date.now(), tabId }, | ||
| elementBody, | ||
| ); |
There was a problem hiding this comment.
| const fullPageWithState = mergeHeaderJson( | ||
| { url: page.url(), title: await page.title(), mode: 'html' as const, capturedAt: Date.now(), tabId }, | ||
| fullPageBody, | ||
| ); |
There was a problem hiding this comment.
| const domDeltaPayload = statsLine + delta.content + domPaginationSection; | ||
| return { | ||
| content: [{ type: 'text', text: statsLine + delta.content + domPaginationSection }], | ||
| content: [{ type: 'text', text: prependHeaderText({ url: page.url(), title: await page.title(), mode: 'dom', capturedAt: Date.now(), tabId }, domDeltaPayload) }], |
There was a problem hiding this comment.
In DOM mode, result.pageStats already contains the URL and Title from the serializeDOM call. Re-fetching them via page.url() and await page.title() is redundant and adds unnecessary latency.
| content: [{ type: 'text', text: prependHeaderText({ url: page.url(), title: await page.title(), mode: 'dom', capturedAt: Date.now(), tabId }, domDeltaPayload) }], | |
| content: [{ type: 'text', text: prependHeaderText({ url: result.pageStats.url, title: result.pageStats.title, mode: 'dom', capturedAt: Date.now(), tabId }, domDeltaPayload) }], |
| const domFullPayload = outputText + domPaginationSection; | ||
| return { | ||
| content: [{ type: 'text', text: outputText + domPaginationSection }], | ||
| content: [{ type: 'text', text: prependHeaderText({ url: page.url(), title: await page.title(), mode: 'dom', capturedAt: Date.now(), tabId }, domFullPayload) }], |
There was a problem hiding this comment.
Use the already available result.pageStats to avoid redundant CDP calls.
| content: [{ type: 'text', text: prependHeaderText({ url: page.url(), title: await page.title(), mode: 'dom', capturedAt: Date.now(), tabId }, domFullPayload) }], | |
| content: [{ type: 'text', text: prependHeaderText({ url: result.pageStats.url, title: result.pageStats.title, mode: 'dom', capturedAt: Date.now(), tabId }, domFullPayload) }], |
| output + | ||
| '\n\n[Output truncated. AX output exceeded the output budget. Use mode: "dom" or fallback: "dom" for DOM output, or use smaller depth / ref_id to focus on specific element.]' + | ||
| axPaginationSection, | ||
| text: prependHeaderText({ url: page.url(), title: await page.title(), mode: 'ax', capturedAt: Date.now(), tabId }, axTruncPayload), |
There was a problem hiding this comment.
In AX mode, axPageStats (fetched at line 394) already contains the URL and Title. Re-calling page.url() and await page.title() is redundant.
| text: prependHeaderText({ url: page.url(), title: await page.title(), mode: 'ax', capturedAt: Date.now(), tabId }, axTruncPayload), | |
| text: prependHeaderText({ url: axPageStats.url, title: axPageStats.title, mode: 'ax', capturedAt: Date.now(), tabId }, axTruncPayload), |
| { | ||
| type: 'text', | ||
| text: domResult.content + fallbackNote + axPaginationSection, | ||
| text: prependHeaderText({ url: page.url(), title: await page.title(), mode: 'dom', capturedAt: Date.now(), tabId }, axFallbackPayload), |
There was a problem hiding this comment.
Use the already available axPageStats to avoid redundant CDP calls.
| text: prependHeaderText({ url: page.url(), title: await page.title(), mode: 'dom', capturedAt: Date.now(), tabId }, axFallbackPayload), | |
| text: prependHeaderText({ url: axPageStats.url, title: axPageStats.title, mode: 'dom', capturedAt: Date.now(), tabId }, axFallbackPayload), |
| output + | ||
| '\n\n[Output truncated. Try mode: "dom" for ~5-10x fewer tokens, or use smaller depth / ref_id to focus on specific element.]' + | ||
| axPaginationSection, | ||
| text: prependHeaderText({ url: page.url(), title: await page.title(), mode: 'ax', capturedAt: Date.now(), tabId }, axDomFailPayload), |
There was a problem hiding this comment.
Use the already available axPageStats to avoid redundant CDP calls.
| text: prependHeaderText({ url: page.url(), title: await page.title(), mode: 'ax', capturedAt: Date.now(), tabId }, axDomFailPayload), | |
| text: prependHeaderText({ url: axPageStats.url, title: axPageStats.title, mode: 'ax', capturedAt: Date.now(), tabId }, axDomFailPayload), |
| const axNormalPayload = pageStatsLine + output + axPaginationSection; | ||
| return { | ||
| content: [{ type: 'text', text: pageStatsLine + output + axPaginationSection }], | ||
| content: [{ type: 'text', text: prependHeaderText({ url: page.url(), title: await page.title(), mode: 'ax', capturedAt: Date.now(), tabId }, axNormalPayload) }], |
There was a problem hiding this comment.
Use the already available axPageStats to avoid redundant CDP calls.
| content: [{ type: 'text', text: prependHeaderText({ url: page.url(), title: await page.title(), mode: 'ax', capturedAt: Date.now(), tabId }, axNormalPayload) }], | |
| content: [{ type: 'text', text: prependHeaderText({ url: axPageStats.url, title: axPageStats.title, mode: 'ax', capturedAt: Date.now(), tabId }, axNormalPayload) }], |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 52ec16a6fb
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const axNormalPayload = pageStatsLine + output + axPaginationSection; | ||
| return { | ||
| content: [{ type: 'text', text: pageStatsLine + output + axPaginationSection }], | ||
| content: [{ type: 'text', text: prependHeaderText({ url: page.url(), title: await page.title(), mode: 'ax', capturedAt: Date.now(), tabId }, axNormalPayload) }], |
There was a problem hiding this comment.
Reuse captured title instead of calling page.title() on return
This new await page.title() call introduces an extra CDP round-trip after the payload has already been generated, so if the page navigates or closes between serialization and response assembly, the tool now returns Read page error instead of returning the snapshot it already computed. This regression is most visible on fast-redirecting or unstable pages, and it did not exist before this commit because the return path did not perform another page evaluation. Using the already-captured title (pageStats.title / result.pageStats.title) avoids this failure mode.
Useful? React with 👍 / 👎.
Resolve read_page conflicts with develop semantic mode and keep unified state headers. Avoid page.title() calls when headers are disabled and reuse already captured DOM/AX page stats when assembling read_page headers. Constraint: OPENCHROME_STATE_HEADER=off must preserve byte-identical legacy output. Rejected: Re-fetching page URL/title after snapshot assembly | it adds latency and can fail after the page navigates or closes. Confidence: high Scope-risk: moderate Directive: Prefer snapshot-captured page metadata over post-hoc CDP title calls for state headers. Tested: ./node_modules/.bin/jest tests/tools/state-header.test.ts --runInBand Tested: ./node_modules/.bin/tsc -p tsconfig.json --noEmit Not-tested: Full GitHub Actions matrix locally; will be monitored after push.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a663bd6877
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| totalWarnings, | ||
| }, | ||
| summary, | ||
| ...stateHeader, |
There was a problem hiding this comment.
Add state header to validate_page text output
The new state-header rollout is incomplete for validate_page: this change only injects state into the JSON object (...stateHeader) while leaving content[0].text as the bare summaryLine. Any client that reads text-mode tool output (the same compatibility path used by other tools in this commit) will not receive the promised 4-line page-state preamble, so cross-tool state parsing becomes inconsistent specifically for validate_page when OPENCHROME_STATE_HEADER is enabled.
Useful? React with 👍 / 👎.
Constraint: PR #912 promises the unified page-state header for text consumers, not only JSON fields. Rejected: JSON-only state metadata | leaves text-mode clients unable to parse cross-tool state consistently. Confidence: high Scope-risk: narrow Directive: Keep OPENCHROME_STATE_HEADER=off as the byte-parity opt-out for text payloads. Tested: npm test -- --runInBand tests/tools/validate-page.test.ts tests/tools/state-header.test.ts; npx tsc -p tsconfig.json --noEmit Not-tested: Full GitHub Actions matrix is pending after push.
Constraint: Tier-1 tool inventory now grows as core tools graduate across feature branches. Rejected: Updating the exact expected count per PR | keeps the cross-env smoke test brittle for unrelated tool additions. Confidence: high Scope-risk: narrow Directive: Keep asserting expand_tools presence and core expected tools; avoid exact Tier-1 cardinality unless the product freezes the surface. Tested: npm test -- --runInBand tests/cross-env/cursor-verification.test.ts Not-tested: Full GitHub Actions matrix is pending after push.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 34f1ae2740
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| }; | ||
| const missingWithState = isStateHeaderEnabled() | ||
| ? mergeHeaderJson( | ||
| { url: page.url(), title: await page.title(), mode: 'html' as const, capturedAt: Date.now(), tabId }, |
There was a problem hiding this comment.
Make state-header title lookup best-effort in page_content
With OPENCHROME_STATE_HEADER enabled, this path now awaits page.title() while assembling the response object; because it is inside the handler-wide try, a transient navigation/target-close error in title retrieval will flip an otherwise successful page_content result into a tool error. The HTML payload is already available at this point, so header metadata lookup should be non-fatal (fallback title) to avoid introducing intermittent regressions.
Useful? React with 👍 / 👎.
| const cssPayload = cssText + cssPaginationSection; | ||
| return { | ||
| content: [{ type: 'text', text: cssText + cssPaginationSection }], | ||
| content: [{ type: 'text', text: isStateHeaderEnabled() ? prependHeaderText({ url: page.url(), title: await page.title(), mode: 'css', capturedAt: Date.now(), tabId }, cssPayload) : cssPayload }], |
There was a problem hiding this comment.
Avoid failing CSS reads when state-header title fetch throws
In CSS mode, header assembly now directly awaits page.title() on the return line; if the page navigates or closes between CSS extraction and response assembly, this can throw and turn a completed CSS snapshot into Read page error. Since header fields are auxiliary metadata, title collection should be wrapped/fallback-safe so CSS output remains returnable when only the title probe fails.
Useful? React with 👍 / 👎.
Constraint: State-header defaults, Windows line endings, and parallel fixed ports all affect CI without changing runtime behavior. Rejected: Updating fixtures with platform-specific newlines | would keep tests brittle across OS checkouts. Confidence: high Scope-risk: narrow Directive: Keep state-header assertions compatible with both legacy page_stats and prefixed header output. Tested: npm test -- --runInBand tests/tools/read-page.test.ts tests/tools/state-header.test.ts tests/tools/console-capture-regression.test.ts; npm test -- --runInBand tests/transports/http-bearer-auth.test.ts Not-tested: Full GitHub Actions matrix is pending after push.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 86e0731c11
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| `- Page URL: ${h.url}\n` + | ||
| `- Page Title: ${h.title}\n` + | ||
| `- Page Mode: ${h.mode}\n` + | ||
| `- Captured At: ${capturedAtIso}\n` |
There was a problem hiding this comment.
Escape header fields before formatting fixed 4-line preamble
formatHeaderText interpolates url and title directly into newline-delimited header lines, so a page title containing \n/\r can split the preamble into extra lines and spoof subsequent fields. Because this feature is intended to provide a uniform machine-parsable 4-line header, unescaped control characters break that contract and can cause downstream parsers to read incorrect mode/capturedAt values.
Useful? React with 👍 / 👎.
| const cssPayload = cssText + cssPaginationSection; | ||
| return { | ||
| content: [{ type: 'text', text: cssText + cssPaginationSection }], | ||
| content: [{ type: 'text', text: isStateHeaderEnabled() ? prependHeaderText({ url: page.url(), title: await page.title(), mode: 'css', capturedAt: Date.now(), tabId }, cssPayload) : cssPayload }], |
There was a problem hiding this comment.
Avoid hard failure on late title fetch in CSS read_page path
This new await page.title() runs after the CSS payload is already assembled; if the tab is closed or mid-navigation at that moment, Puppeteer can reject and the outer catch converts an otherwise successful capture into Read page error. That is a behavior regression introduced by the header path; use a best-effort title source (or fallback) so header enrichment cannot discard completed CSS output.
Useful? React with 👍 / 👎.
Constraint: Jest console-capture noise can share the in-process stdout hook during parallel CI runs. Rejected: Exact stdout line counting | fails when unrelated decorated console output is captured in the same worker. Confidence: high Scope-risk: narrow Directive: Preserve the invariant that the plaintext key appears exactly once and never in stderr/list output. Tested: npm test -- --runInBand tests/cli/admin-keys.test.ts Not-tested: Full GitHub Actions matrix is pending after push.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5ae04a9e9a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ) | ||
| : elementBody; | ||
| return { | ||
| content: [{ type: 'text', text: JSON.stringify(elementWithState) }], |
There was a problem hiding this comment.
Prepend state header to page_content text responses
With OPENCHROME_STATE_HEADER enabled, this branch still emits content[0].text as raw JSON only, while other page-state tools now prepend the 4-line state preamble on the text channel. Any client that implements the new cross-tool text-header parsing contract will fail specifically on page_content, creating inconsistent behavior across tools in the same rollout.
Useful? React with 👍 / 👎.
Merge latest develop into PR #912 while preserving the branch's non-brittle Tier 1 visibility smoke test. Constraint: PR #962 changed shared transport files and CI fixtures after this branch was opened.\nRejected: Restoring an exact Tier 1 count | this branch intentionally allows the Tier 1 surface to grow as core tools graduate.\nConfidence: high\nScope-risk: narrow\nDirective: Keep fixture refreshes semantically faithful to the feature branch instead of overwriting lower-bound guards with develop's exact-count snapshot.\nTested: npx jest tests/tools/console-capture-regression.test.ts tests/cross-env/cursor-verification.test.ts --runInBand --forceExit (cross-env suite may skip locally by platform guard); git diff --check for touched fixture files.\nNot-tested: Full GitHub Actions matrix after this refresh.\nCo-authored-by: OmX <omx@oh-my-codex.dev>
…etch - formatHeaderText: replace \r\n in url and title with spaces so a crafted page title cannot split the fixed 4-line header and spoof subsequent fields (capturedAt, mode). - read-page CSS path: wrap page.title() in a best-effort try/catch so a tab close between CSS payload assembly and response construction cannot convert a completed CSS snapshot into Read page error. Addresses Codex P2 review comments on PR #912. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Resolves conflicts in: - src/tools/inspect.ts: keep both state-header and node-ref imports - src/tools/read-page.ts: keep prependHeaderText wrapping on DOM/AX paths while adopting develop's nodeRefsBlock, fallbackNodeRefsBlock, _compression metadata, and refs:refsMap additions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 403f4cd359
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| level: 'delta', | ||
| originalChars: outputText.length, | ||
| compressedChars: compressedText.length, | ||
| compressedChars: domDeltaPayload.length, |
There was a problem hiding this comment.
Count state-header bytes in DOM delta compression stats
When OPENCHROME_STATE_HEADER is enabled (default) and read_page returns a DOM delta (compression: "delta"), content[0].text includes the prepended state header, but _compression.compressedChars is computed from domDeltaPayload before that header is added. This makes compressedChars too small and overstates saved bytes in recordToolOutputObservability, which can skew compression telemetry/experiments for this path. Compute compressedChars from the final emitted text (or include header length) to keep metrics accurate.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dd57c85dfb
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| required: ['tabId'], | ||
| }, | ||
| }; |
There was a problem hiding this comment.
Restore required annotations on page_content definition
The page_content tool definition now omits the annotations field even though MCPToolDefinition treats it as required, so tools/list can emit a schema-incomplete entry for this tool and downstream clients that rely on annotation hints (or strict schema validation) may reject or mis-handle it. This also violates the repo’s own contract check in tests/unit/tool-annotations.test.ts that every tool declares annotations: TOOL_ANNOTATIONS.<tool>, making this a regression introduced by this change.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1c9d2b0e05
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| /** Required MCP-spec tool annotations. */ | ||
| annotations: ToolAnnotations; | ||
| /** Optional MCP-spec tool annotations. */ | ||
| annotations?: ToolAnnotations; |
There was a problem hiding this comment.
Keep tool annotations mandatory in MCPToolDefinition
Making annotations optional removes the compile-time guard that every registered tool advertises MCP annotation hints, so tools/list can now emit schema-incomplete tool entries whenever a definition omits that field. This is a contract regression for strict MCP clients and also undermines the server’s own assumption in registerTool that annotation presence is guaranteed, so this should remain required at the type level (or be filled with a default at registration).
Useful? React with 👍 / 👎.
| ? mergeHeaderJson( | ||
| { url: page.url(), title: await page.title(), mode: 'html' as const, capturedAt: Date.now(), tabId }, | ||
| fullPageBody, |
There was a problem hiding this comment.
Capture page state atomically with page_content payload
This branch captures HTML first and then fetches page.url()/page.title() for the new state object, so on pages that auto-navigate or redirect between those awaits, the returned metadata can describe a different document than the HTML payload. Because the feature’s purpose is to identify the snapshot’s origin, state fields should be derived from the same capture point as the content (or from already-captured values) instead of a later probe.
Useful? React with 👍 / 👎.
Codex P1: page_content lost its annotations field when the state-header import block replaced the import line, so tools/list emitted a schema- incomplete definition. Re-add the import and the annotations field so the tool keeps publishing readOnlyHint/destructiveHint metadata. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Forthcoming review note: e19b264 restores |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d3b06c1528
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| }; | ||
| const fullPageWithState = isStateHeaderEnabled() | ||
| ? mergeHeaderJson( | ||
| { url: page.url(), title: await page.title(), mode: 'html' as const, capturedAt: Date.now(), tabId }, |
There was a problem hiding this comment.
Make page title lookup best-effort in page_content
In the new state-header path, page_content now awaits page.title() while assembling the response after HTML has already been captured. If the tab navigates or closes between page.content()/page.evaluate(...) and this lookup, Puppeteer throws and the outer catch converts an otherwise successful snapshot into Page content error. This regression affects unstable or redirecting pages and should be avoided by using a safe/best-effort title fallback so metadata enrichment cannot discard completed content.
Useful? React with 👍 / 👎.
Develop CI restored, but this PR still has PR-specific test failuresPR #1105 has merged to Addressed in this session
Author follow-up
|
Complete the state-header rollout for read_page so default text responses carry the same page-state preamble while opt-out mode remains byte-identical. Constraint: CI showed state-header expectations failing because read_page was still returning bare page_stats text. Rejected: Adding per-return title lookups throughout read_page | a single post-processing wrapper keeps header enrichment isolated and best-effort. Confidence: high Scope-risk: narrow Directive: Header enrichment must not break OPENCHROME_STATE_HEADER=off byte parity or existing read_page diagnostics/refs contracts. Tested: npm test -- --runTestsByPath tests/tools/state-header.test.ts --runInBand; npm test -- --runTestsByPath tests/tools/read-page.test.ts tests/tools/read-page-dom.test.ts tests/tools/snapshot-refs.test.ts --runInBand; npm run build Not-tested: Full cross-platform CI after push
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5509eafb39
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const headerMode = mode === 'markdown' ? 'html' : mode; | ||
| const header = { url, title, mode: headerMode as 'ax' | 'dom' | 'css' | 'html', capturedAt: Date.now(), tabId }; |
There was a problem hiding this comment.
Map semantic read_page mode to supported header value
read_page accepts mode: "semantic", but this branch passes that value through into state.mode (headerMode only remaps markdown). That violates the declared PageStateHeader.mode contract (ax|dom|css|html|inspect|validate) and can break clients that validate the returned state object against the advertised enum when state headers are enabled and semantic mode is requested. Normalize semantic to a supported header mode (or explicitly extend the shared enum/schema everywhere).
Useful? React with 👍 / 👎.
|
Reviewed and validated for merge. This PR adds a unified state header to page-state tool responses. The direction is valid because it standardizes lightweight state/provenance context across read-style outputs while preserving existing tool behavior. Additional hardening completed before merge:
Validation evidence:
Given the additive scope, passing CI, and preserved read_page behavior, this PR is safe to merge. |
The token-metrics PR had to absorb develop's state-header work without regressing either contract. Semantic read_page metrics are now refreshed after the state header is merged, and inspect output keeps both the state header and optional metrics footer. Constraint: PR #1077 became dirty after #912 merged state headers into develop. Rejected: Dropping state headers from metrics-enabled outputs | it would regress the newly merged page-state contract. Confidence: high Scope-risk: moderate Directive: Treat metrics as additive post-processing over the final emitted payload, including state headers. Tested: npm test -- --runTestsByPath tests/tools/read-page.test.ts tests/tools/state-header.test.ts --runInBand; npm run build; npm test -- --runTestsByPath tests/core/metrics/token-estimate.test.ts tests/tools/read-page-dom.test.ts tests/tools/inspect-metrics.test.ts tests/core/tools/crawl.engine.test.ts --runInBand; npm run lint:tool-schemas Not-tested: Full CI pending after push Co-authored-by: OmX <omx@oh-my-codex.dev>
Progress / Review status
Auto-refreshed 2026-05-13 — owner comments cleaned up to reduce review noise.
feat/893-state-header→develop403f4cd— chore: merge origin/develop into feat/893-state-headerOwner comment cleanup: 3 issue + 0 inline review comments deleted. Outstanding feedback from automated/external reviewers above is unchanged.
Closes #893.
Summary
Prepends a uniform 4-line state header to text-mode responses of `read_page`, `page_content`, `inspect`, and `validate_page` (and a top-level `state` object on their JSON-mode responses). Default on; env opt-out `OPENCHROME_STATE_HEADER=off` restores v1.11.0 byte-identical output.
Header format (exact)
```
```
followed by one blank line, then the existing payload.
JSON responses receive a top-level `state` object with `url`, `title`, `mode`, `capturedAt`, `tabId`.
Data sources (all pre-existing — no new CDP plumbing)
No new `CDPSession.send` calls, no new `Page.frameNavigated` listeners. The "no new CDP methods" guard is enforceable post-merge via the existing trace tooling.
Tests (11/11 passing)
Backward compatibility
`OPENCHROME_STATE_HEADER=off` (case-insensitive) restores v1.11.0 output byte-exact. Any other value (unset, `on`, `true`, junk) enables the header. No warning logging.
Out of scope (deferred per issue's Out-of-Scope section)
Test plan