feat(read_page): add opt-in AX compact and delta output#1087
Conversation
Add AX-only compact output plus delta caching under AX-specific snapshot keys so repeated read_page calls can return smaller payloads without changing the default AX or DOM behavior. Constraint: Avoided overlap with existing DOM snapshot-cache work by namespacing AX cache entries under tabId:ax. Rejected: Making compact AX the default | it could hide non-actionable context agents still rely on. Confidence: high Scope-risk: moderate Directive: Keep future AX compression modes opt-in until benchmark data proves default-token improvements without task regressions. Tested: ./node_modules/.bin/jest tests/tools/read-page.test.ts --runInBand Tested: ./node_modules/.bin/tsc -p tsconfig.json --noEmit Not-tested: Live browser token comparison; captured as PR OpenChrome validation checklist.
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
ⓘ You've reached your Qodo monthly free-tier limit. Reviews pause until next month — upgrade your plan to continue now, or link your paid account if you already have one. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 631a3fbe43
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const outputLines = compactAX ? compactAXLines(lines) : lines; | ||
| const output = outputLines.join('\n'); | ||
| const outputCharCount = output.length; |
There was a problem hiding this comment.
Apply AX compaction before enforcing output budget
In compact AX mode, compaction is applied only after the full traversal output is collected, but traversal itself still stops once charCount exceeds MAX_OUTPUT in formatNode. On large pages with many non-actionable/no-ref nodes, this means traversal can terminate early and omit later actionable nodes even though the compacted output would have fit the budget. The new compact: true option therefore does not reliably preserve actionable content on long trees.
Useful? React with 👍 / 👎.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: db8c2f820e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const outputLines = compactAX ? compactAXLines(lines) : lines; | ||
| const output = outputLines.join('\n'); | ||
| const outputCharCount = output.length; |
There was a problem hiding this comment.
Apply compact filtering before AX output budget cutoff
When compact: true is enabled, the traversal still stops based on pre-compaction charCount, and only then filters lines via compactAXLines. On large pages with lots of non-actionable text early in the tree, this can stop traversal before later actionable/ref-bearing nodes are visited, then return a short compact result (often under MAX_OUTPUT) without any truncation warning. This makes compact AX output silently incomplete and can hide interactive elements the caller explicitly asked to preserve.
Useful? React with 👍 / 👎.
Constraint: CI pipes the full tools manifest through lint-tool-schemas, and immediate process.exit can truncate larger manifests. Rejected: Updating the schema lint baseline | this was an output-drain issue, not a schema exception. Confidence: high Scope-risk: narrow Directive: Introspection commands that write large JSON should return only after stdout flushes. Tested: npm test -- --runInBand tests/tools/read-page.test.ts; npm run build; npm run lint:tool-schemas Not-tested: Full GitHub CI matrix pending after push. Co-authored-by: OmX <omx@oh-my-codex.dev>
|
@codex review |
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 80f4ac247e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (delta.isDelta) { | ||
| return { | ||
| content: [{ type: 'text', text: pageStatsLine + delta.content.replace('[DOM Delta', '[AX Delta') + axPaginationSection }], | ||
| }; |
There was a problem hiding this comment.
Honor AX fallback=dom before returning delta responses
When mode: "ax", compression: "delta", and fallback: "dom" are combined, this early return can bypass the AX-size check that enforces DOM fallback for oversized trees. After the first call caches a truncated AX snapshot, subsequent calls on the same large page can return an [AX Delta ...] response here instead of the promised DOM fallback, so callers receive partial AX-derived output even though fallback: "dom" was explicitly requested for overflow cases.
Useful? React with 👍 / 👎.
Constraint: Latest develop added crawl tools whose required fields must satisfy the existing schema lint gate in PR merge refs. Rejected: Updating the schema lint baseline | crawl required fields can document REQUIRED semantics directly. Confidence: high Scope-risk: narrow Directive: Required MCP tool fields must keep REQUIRED-prefixed descriptions when adding or refreshing tools. Tested: npm run build; npm run lint:tool-schemas Not-tested: Full GitHub CI matrix pending after push. Co-authored-by: OmX <omx@oh-my-codex.dev>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a2fc0367d0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const compression = args.compression as string | undefined; | ||
| if (compression === 'delta') { | ||
| const snapshotStore = SnapshotStore.getInstance(); | ||
| const axCacheTabId = `${tabId}:ax${compactAX ? ':compact' : ''}`; |
There was a problem hiding this comment.
Include AX snapshot scope in delta cache key
The AX delta cache key only uses tabId (plus compact flag), so compression: "delta" reuses the previous AX snapshot even when the caller changes scope parameters like ref_id, depth, or filter. In that case, computeDelta() compares different query shapes and can return an [AX Delta ...] response instead of a full tree for the new scope, which is not reconstructible unless the client also kept the unrelated prior snapshot. Please include the effective AX query shape in the cache key (or bypass delta when scope args differ).
Useful? React with 👍 / 👎.
Constraint: Latest develop and feature branches add core tier-one tools, making the old exact Cursor discovery count stale. Rejected: Encoding branch-specific counts | the verification should prove expand_tools presence and a bounded tier-one surface instead. Confidence: high Scope-risk: narrow Directive: Cursor discovery tests should assert contract shape and bounds, not an exact count that drifts with tool graduations. Tested: npm test -- --runInBand tests/cross-env/cursor-verification.test.ts Not-tested: Full GitHub CI matrix pending after push. Co-authored-by: OmX <omx@oh-my-codex.dev>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f12a1b5f76
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const delta = snapshotStore.computeDelta(previous, output, axPageStats.url); | ||
| snapshotStore.set(sessionId, axCacheTabId, output, axPageStats.url); |
There was a problem hiding this comment.
Strip unstable ref IDs before AX delta comparison
compression: "delta" in AX mode diffs the raw output text, but AX output embeds generated ref_N IDs and read_page clears/rebuilds refs each call. In production, RefIdManager.clearTargetRefs does not reset counters (src/utils/ref-id-manager.ts), so unchanged nodes get new ref numbers on every read; once ref-bearing lines exceed about one-third of the snapshot, computeDelta() crosses its 0.5 threshold and falls back to full output, effectively defeating AX delta on many unchanged pages. Normalizing/removing ref tokens before diffing (or diffing a ref-stable representation) would make delta behavior match the feature intent.
Useful? React with 👍 / 👎.
Summary
read_pageAXcompact: trueoutput to keep actionable/ref-bearing nodes, value/state nodes, and required ancestors while omitting non-actionable no-ref leaves.compression: "delta"for AX mode using AX-specific snapshot cache keys, including compact/non-compact separation.Closes #968
Fit / non-overlap review
tabId:ax).Tests
./node_modules/.bin/jest tests/tools/read-page.test.ts --runInBand./node_modules/.bin/tsc -p tsconfig.json --noEmitOpenChrome real-validation checklist
After merge, validate with OpenChrome itself:
read_pagewith{ "mode": "ax" }and record output chars.read_pagewith{ "mode": "ax", "compact": true }; confirm controls/refs remain and non-actionable no-ref text is reduced.read_pagetwice with{ "mode": "ax", "compression": "delta" }; confirm the second call returns an[AX Delta ...]response when the page is unchanged or lightly changed.{ "mode": "ax", "compact": true, "compression": "delta" }; confirm compact and delta caches do not cross-contaminate full AX output.