feat(read_page): surface custom interactive controls#1071
Conversation
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
ⓘ You've reached your Qodo monthly free-tier limit. Reviews pause until next month — upgrade your plan to continue now, or link your paid account if you already have one. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7317ae1cbf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (interactiveOnly) { | ||
| await markCursorInteractiveElements(page); | ||
| } |
There was a problem hiding this comment.
Degrade gracefully when interactive hint scan times out
When interactiveOnly is enabled, serializeDOM now hard-fails if markCursorInteractiveElements exceeds its 5s timeout or throws. This is a regression from prior behavior where interactive serialization still completed based on native tags/roles; now large or script-heavy pages can make read_page fail entirely instead of returning partial interactive output. Wrapping this pre-scan in a fallback path (continue without custom hints) would avoid breaking the whole call.
Useful? React with 👍 / 👎.
| const all = Array.from(root.querySelectorAll('*')) as HTMLElement[]; | ||
| for (const el of all) { | ||
| if (el.shadowRoot) roots.push(el.shadowRoot); | ||
| if (el.hasAttribute(hintAttr)) el.removeAttribute(hintAttr); |
There was a problem hiding this comment.
Avoid deleting pre-existing page attributes during cleanup
Cleanup removes data-oc-interactive-hints from every element unconditionally, even if that attribute existed before serialization. Combined with the earlier setAttribute, this can overwrite and then permanently delete application-owned DOM state, which is an unexpected side effect of a read operation. Track which nodes were actually marked (or preserve original values) so existing attributes are restored instead of blindly removed.
Useful? React with 👍 / 👎.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7820e673e9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const editable = el.getAttribute('contenteditable'); | ||
| const isEditable = editable === '' || editable === 'true'; |
There was a problem hiding this comment.
Use computed editability when tagging contenteditable controls
The interactive pre-scan treats editability as editable === '' || editable === 'true', which misses valid editable elements like contenteditable="plaintext-only" and inherited editability from ancestors. In interactiveOnly/filter: "interactive" mode this causes real text-edit controls to be dropped from the serialized output, reducing action-target recall on rich-text/plaintext editors. Prefer el.isContentEditable (or equivalent computed logic) so all standards-compliant editable states are surfaced.
Useful? React with 👍 / 👎.
|
@codex review |
1 similar comment
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fb5132b3a2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (interactiveOnly) { | ||
| try { | ||
| await markCursorInteractiveElements(page); | ||
| } catch { |
There was a problem hiding this comment.
Cancel timed-out hint scans before continuing serialization
If markCursorInteractiveElements hits the 5s withTimeout limit, this catch path continues immediately, but withTimeout only races promises and does not abort the underlying page.evaluate. On large/slow pages the scan can keep mutating attributes in the background while serializeDOM proceeds and runs a one-time cleanup, which can leave data-oc-interactive-hints markers behind after read_page returns and violate the side-effect-free expectation for reads.
Useful? React with 👍 / 👎.
| * Check if a node is interactive | ||
| */ | ||
| function isInteractive(tagName: string, attrMap: Map<string, string>): boolean { | ||
| if (attrMap.has(INTERACTIVE_HINT_ATTR)) return true; |
There was a problem hiding this comment.
Restrict hint-attribute interactivity to owned scan markers
The serializer now treats any element with data-oc-interactive-hints as interactive, regardless of whether OpenChrome added that attribute. That means pages that already use the same data-* key will be reclassified and get ★ [...] markers even in normal (non-interactive-only) reads, creating output noise and false positives. This should only trust markers that were inserted by this scan (for example via the owned flag) or be gated to interactive filtering mode.
Useful? React with 👍 / 👎.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 35cd356cc3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Await the browser-side scan directly. page.evaluate cannot be aborted by | ||
| // racing its promise with a timeout; doing so could let background marker | ||
| // mutations continue after serializeDOM has already run cleanup. | ||
| await page.evaluate((hintAttr: string) => { |
There was a problem hiding this comment.
Bound interactive hint scan duration
markCursorInteractiveElements now awaits a full page.evaluate with no timeout, while the browser-side callback traverses querySelectorAll('*') and performs expensive style/layout checks per element. On large or script-heavy pages, read_page with interactiveOnly can block for an unbounded period instead of returning within a predictable latency budget, which is a production reliability regression for interactive reads. Add a bounded scan budget (or node cap) and gracefully fall back to native tag/role detection when that budget is exceeded.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 35cd356cc3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Await the browser-side scan directly. page.evaluate cannot be aborted by | ||
| // racing its promise with a timeout; doing so could let background marker | ||
| // mutations continue after serializeDOM has already run cleanup. | ||
| await page.evaluate((hintAttr: string) => { |
There was a problem hiding this comment.
Bound cursor-hint pre-scan with a timeout
markCursorInteractiveElements runs a full-page page.evaluate scan (querySelectorAll('*'), getComputedStyle, getBoundingClientRect) without any timeout or cancellation guard. In read_page calls with filter: "interactive", a very large or script-hostile page can keep this promise pending for a long time, causing serializeDOM to stall before DOM.getDocument and effectively hanging the tool response for users. The rest of this serializer already uses bounded operations (withTimeout), so this unbounded pre-scan is a reliability regression specific to the new interactive-hint path.
Useful? React with 👍 / 👎.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f757126d96
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const root = { querySelectorAll: () => [el] }; | ||
| const previousDocument = (global as any).document; | ||
| (global as any).document = root; | ||
| try { await fn(hintAttr); } finally { (global as any).document = previousDocument; } |
There was a problem hiding this comment.
Make contenteditable hint test exercise real scan logic
This test can pass without validating the behavior it names. The mock document assigned here does not implement createTreeWalker, and the callback is invoked with only hintAttr, so markCursorInteractiveElements throws before setting any hint. Because serializeDOM intentionally swallows hint-scan errors, the thrown error is hidden and the test still succeeds, which means regressions in contenteditable hint detection would not be caught.
Useful? React with 👍 / 👎.
| const all = Array.from(root.querySelectorAll('*')) as HTMLElement[]; | ||
| for (const el of all) { |
There was a problem hiding this comment.
Restrict cleanup traversal to owned marker nodes
Cleanup currently does a full querySelectorAll('*') walk for each root, which scales with total DOM size rather than the number of marked elements. On large documents this can exceed the 5s timeout; since timeout errors are swallowed, marker attributes can remain temporarily and interfere with subsequent page operations. Because markers are already tagged with data-oc-interactive-hints-owned, cleanup should target only those elements instead of scanning the entire tree.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f757126d96
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| el.setAttribute(hintAttr, hints.join(', ')); | ||
| el.setAttribute(ownedAttr, 'true'); |
There was a problem hiding this comment.
Avoid DOM attribute writes in interactive read path
The interactive pre-scan now calls setAttribute/removeAttribute on live page elements, which makes read_page observable to application code (for example, MutationObserver listeners or custom-element attribute handlers). On pages that react to attribute mutations, an otherwise read-only tool call can trigger app logic or re-renders and perturb subsequent automation state. Because this happens on every interactiveOnly read, it can affect production reliability; consider carrying hint state out-of-band (e.g., collected node ids/hints returned from evaluate) instead of mutating the DOM.
Useful? React with 👍 / 👎.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5c9ab68274
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| * Check if a node is interactive | ||
| */ | ||
| function isInteractive(tagName: string, attrMap: Map<string, string>): boolean { | ||
| if (attrMap.has(INTERACTIVE_HINT_ATTR)) return true; |
There was a problem hiding this comment.
Require owned marker before trusting interactive hints
Treating any data-oc-interactive-hints attribute as interactive allows page-authored markup to forge interactive controls and get echoed as trusted hint text. In interactiveOnly mode this can surface arbitrary non-clickable nodes, and in normal mode it can still add misleading ★ [...] markers, which degrades target selection accuracy and opens a prompt-injection path via attacker-controlled hint values. Restrict this check to markers created by this scan (for example by also requiring data-oc-interactive-hints-owned="true" in the snapshot) instead of trusting the public data attribute alone.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5c9ab68274
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| 'menu', 'menuitem', 'tab', 'switch', 'slider', | ||
| ]); | ||
| const interactiveTags = new Set(['a', 'button', 'input', 'select', 'textarea', 'details', 'summary']); | ||
| const roots: Array<Document | ShadowRoot> = [document]; |
There was a problem hiding this comment.
Scan iframe documents for custom interactive hints
markCursorInteractiveElements only traverses the top-level document, so custom-control hinting never runs inside iframe contentDocuments. In interactiveOnly mode, serialization later relies on data-oc-interactive-hints to classify non-semantic controls as interactive, while DOM.getDocument is still fetched with pierce: true; this means custom clickable elements inside iframes are omitted from interactive output even though their DOM is included. Run the hint scan per frame (or collect hints via frame-aware traversal) so iframe custom controls are not dropped.
Useful? React with 👍 / 👎.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f778c83799
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const path = `${parentPath}/c:${siblingIndex}`; | ||
| paths.set(el, path); | ||
|
|
||
| if (el.shadowRoot) roots.push({ root: el.shadowRoot, path: `${path}/s:0` }); |
There was a problem hiding this comment.
Track actual shadow-root index when recording hint paths
Use the same shadow-root indexing scheme as serialization when enqueueing roots for hint scanning. The scanner currently hardcodes "/s:0" for every el.shadowRoot, but serializer lookups use node.shadowRoots.indexOf(shadowRoot); if CDP exposes multiple shadow roots on a host (for example when user-agent roots are present), custom hints are written under a different path and never applied, so interactive-only output silently misses those controls.
Useful? React with 👍 / 👎.
| const previousDocument = (global as any).document; | ||
| (global as any).document = root; | ||
| try { await fn(hintAttr); } finally { (global as any).document = previousDocument; } | ||
| expect(el.setAttribute).toHaveBeenCalledWith(hintAttr, 'editable'); |
There was a problem hiding this comment.
Remove obsolete contenteditable marker assertion
This test still asserts that the scan path sets a temporary attribute (setAttribute(hintAttr, 'editable')), but the new implementation no longer mutates DOM attributes and instead builds hint records from a tree walk. With the current mock setup, invoking fn(...) does not exercise that old behavior, so the assertion is invalid and will fail when the suite is executed, causing a false negative in CI.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f778c83799
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| .mockImplementationOnce(async (fn: Function, hintAttr: string) => { | ||
| const el = { | ||
| tagName: 'DIV', | ||
| shadowRoot: null, | ||
| onclick: null, | ||
| isContentEditable: true, | ||
| getAttribute: (name: string) => name === 'contenteditable' ? 'plaintext-only' : null, | ||
| hasAttribute: () => false, | ||
| setAttribute: jest.fn(), | ||
| }; | ||
| const root = { querySelectorAll: () => [el] }; | ||
| const previousDocument = (global as any).document; | ||
| (global as any).document = root; | ||
| try { await fn(hintAttr); } finally { (global as any).document = previousDocument; } |
There was a problem hiding this comment.
Make contenteditable hint test execute the real scan path
This test currently no longer verifies contenteditable hint detection: the mocked page.evaluate callback is invoked as fn(hintAttr) even though the scan function now expects (maxMs, maxElements, includeIframes), and the mocked document object does not provide createTreeWalker. That causes the scan to throw before any hint logic runs, but serializeDOM swallows scan errors, so the test still passes and can miss regressions in contenteditable detection.
Useful? React with 👍 / 👎.
f778c83 to
2a9b5d4
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2a9b5d462b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| customInteractiveHints = new Map(scanResult.hints.map(({ path, hints }) => [path, hints])); | ||
| } |
There was a problem hiding this comment.
Key scan hints by stable node identity
The new interactive hint pipeline stores results by positional path (d/c:n/...) from scanCustomInteractiveElements and then applies them to a later DOM.getDocument snapshot. Because these are two separate browser operations, any DOM mutation between them (e.g., inserting/removing a sibling before the target) shifts the c:n indexes and can attach a hint to the wrong node or drop a real target in interactiveOnly mode. This makes interactive filtering nondeterministic on dynamic pages; anchoring hints to a stable identifier (such as backend node ids resolved at scan time) would avoid path drift.
Useful? React with 👍 / 👎.
2a9b5d4 to
8112b7f
Compare
|
@codex review Rebased onto current develop and pushed fixes for the current review/CI findings. Please review the current head. |
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
|
@codex review Follow-up commits pushed for the latest review and CI findings; please review the current head. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 37bf179f07
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| .mockImplementationOnce(async (fn: Function, hintAttr: string) => { | ||
| const el = { | ||
| tagName: 'DIV', | ||
| shadowRoot: null, | ||
| onclick: null, | ||
| isContentEditable: true, | ||
| getAttribute: (name: string) => name === 'contenteditable' ? 'plaintext-only' : null, | ||
| hasAttribute: () => false, | ||
| setAttribute: jest.fn(), | ||
| }; | ||
| const root = { querySelectorAll: () => [el] }; | ||
| const previousDocument = (global as any).document; | ||
| (global as any).document = root; | ||
| try { await fn(hintAttr); } finally { (global as any).document = previousDocument; } |
There was a problem hiding this comment.
Update stale contenteditable hint test invocation
This test still mocks the old page.evaluate shape (fn, hintAttr) and calls fn(hintAttr), but serializeDOM now calls the cursor scan as page.evaluate(fn, maxMs, maxElements, includeIframes). In this path, hintAttr is actually a number (the time budget), the callback throws, and serializeDOM swallows that error in its catch, so the test can pass without ever validating contenteditable hint behavior. This leaves regressions in computed contenteditable detection effectively untested.
Useful? React with 👍 / 👎.
Detect cursor/onClick/tabindex/contenteditable elements only during interactive DOM reads, annotate why they were included, and clear temporary markers after the CDP snapshot so default read_page behavior stays unchanged. Constraint: Implemented as an additive interactiveOnly DOM pass to avoid changing default snapshot output or OpenChrome action semantics. Rejected: Broad static inclusion of all pointer-cursor descendants | it bloats snapshots and duplicates inherited cursor containers. Confidence: high Scope-risk: moderate Directive: Keep future discovery heuristics opt-in or filter-scoped unless benchmark evidence shows default-token improvements. Tested: ./node_modules/.bin/jest tests/dom/dom-serializer.test.ts --runInBand Tested: ./node_modules/.bin/tsc -p tsconfig.json --noEmit Not-tested: Live browser validation is documented in the PR checklist for post-merge manual verification.
Constraint: GitHub CI runs lint:tool-schemas through a pipe across every PR branch.\nRejected: Updating the schema-lint baseline | crawl tool required-field descriptions can satisfy the ratchet instead.\nRejected: process.exit immediately after stdout write | it can truncate large introspection JSON in a pipe.\nConfidence: high\nScope-risk: narrow\nDirective: For one-shot JSON CLI output, await stdout flush before returning.\nTested: npm run build && npm run lint:tool-schemas\nNot-tested: full GitHub matrix before push
Constraint: Latest develop already carried the safe stdout-flush form, and rebase conflict resolution left an extra block terminator.\nRejected: Leaving the syntax error for CI | TypeScript build fails before tests.\nConfidence: high\nScope-risk: narrow\nDirective: Keep introspection output flush awaited without adding rejecting promises to the command path.\nTested: pending build rerun after syntax repair\nNot-tested: full GitHub matrix before push
37bf179 to
915e5f1
Compare
|
@codex review Rebased again onto current develop; CI syntax/tool-schema fixes are included at the current head. Please review current head. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 915e5f1bc1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| try { await fn(hintAttr); } finally { (global as any).document = previousDocument; } | ||
| expect(el.setAttribute).toHaveBeenCalledWith(hintAttr, 'editable'); |
There was a problem hiding this comment.
Align contenteditable hint test with current scan behavior
This test still exercises the removed marker-attribute flow (setAttribute expectation) and invokes the scan callback with a single hintAttr argument, but scanCustomInteractiveElements now expects (maxMs, maxElements, includeIframes) and relies on document.createTreeWalker. In this mock setup fn(hintAttr) runs with the wrong runtime/document shape and no attribute mutation path, so the assertion can never hold and the test will fail when the DOM serializer suite is executed in CI.
Useful? React with 👍 / 👎.
Summary
read_pageinteractive filtering to surface custom controls that are clickable but lack semantic tags/roles.interactiveOnly/filter: "interactive"is active.★ [cursor:pointer, onclick]while keeping temporary marker attributes out of output and cleaning them after snapshot capture.Closes #967
Fit / non-overlap review
read_pagecalls.Tests
./node_modules/.bin/jest tests/dom/dom-serializer.test.ts --runInBand./node_modules/.bin/tsc -p tsconfig.json --noEmitOpenChrome real-validation checklist
After merge, validate with OpenChrome itself:
<div style="cursor:pointer" onclick="...">Open settings</div>withoutroleortabindex.read_pagewith{ "mode": "dom", "filter": "interactive" }.[cursor:pointer, onclick].read_pagewithoutfilter: "interactive"; confirm the output is not bloated with the discovery marker attribute.