feat(core): deterministic selector-chain replay layer for skill memory (closes #875) #925
Conversation
Adds an opt-in replay layer over the existing skill-memory store so recurring tasks can be re-executed without a host-LLM round-trip per step.

- New types/validator in `src/core/skill-memory/replay-artifact.ts`: `ReplayArtifactStep` (kind + ordered selector strategies + optional `post_assert`) and `ReplayArtifact` (`schema_version=1` envelope).
- New per-CDP-target buffer in `src/core/skill-memory/recorder-buffer.ts`: cap 100 entries, FIFO eviction, destructive flush by `oc_skill_record`.
- Skill-memory store bumped v1 -> v2 with additive `replay_artifact` per step; v1 records remain read-compatible (no synthesis).
- `oc_skill_record` extended to accept `replay_artifact` per step; idempotent re-record updates the artifact while preserving `skill_id`.
- New tool `oc_skill_replay` (`src/tools/oc-skill-replay.ts`): never throws, returns `{ok, ...}` or `{ok:false, failure:{code,...}}` envelope. Reuses `src/utils/ralph/ralph-engine.ts` for selector resolution.
- Tool registered in `src/tools/index.ts`; `tools/list` parity holds in both flag positions (P2). `OPENCHROME_SKILL_REPLAY` defaults ON, opt-out via `=0`.
- Inline `isCoreFeatureEnabled` helper in `src/harness/flags.ts` with `TODO(#844)` marker pending #844 merge.
- Trace redactor allow-list extended for replay step telemetry.
- Tests: replay-artifact validator, `oc_skill_replay` envelope/artifact handling, `oc_skill_record` plumb-through, P3 compliance (no outbound) with defensive spy installation for environments where `jest.spyOn` cannot patch `http`/`https` namespace imports.
- Fixture and reproducer: `tests/fixtures/skill-replay/index.html` and `scripts/verify/skill-replay.mjs` for the 6 post-merge MCP scenarios.

Honors the 2026-05-12 spec correction on the issue (`isCoreFeatureEnabled` inlined since #844 has not landed yet).

- P1 tool-server identity: replay is a single tool call, no orchestration.
- P2 zero-impact: `tools/list` stable, `OPENCHROME_SKILL_REPLAY=0` returns a DISABLED fact, `capture_artifact` opt-in defaults false.
- P3 anywhere-compatible: zero outbound HTTP, zero LLM API.
- P4 facts vs decisions: structured failures returned to host.
- P5 native deps: zero new dependencies.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code Review
This pull request introduces a deterministic skill replay system (#875), including a new oc_skill_replay tool, a recorder buffer for capturing steps, and an upgrade to the skill memory schema (v2) to store replay artifacts. Key feedback highlights that the current oc_skill_replay implementation only resolves selectors without executing actions and identifies performance bottlenecks in DOM traversal. Additionally, reviewers noted a potential memory leak in the recorder buffer, missing validation for specific action arguments, and the need to use constants for schema versioning to ensure consistency.
```ts
let executed = 0;
for (let i = from; i < to; i++) {
  const artifact = artifacts[i];
  if (!artifact) {
    return jsonResult(
      failure(
        'ARTIFACT_MISSING',
        i,
        `step ${i} has no replay_artifact (heterogeneous v1/v2 record)`,
        totalSteps,
        stepResults,
        executed,
      ),
    );
  }
  const v = validateReplayArtifact(artifact);
  if (!v.ok) {
    return jsonResult(
      failure(
        'ARTIFACT_MISSING',
        i,
        `step ${i} artifact failed validation: ${v.error ?? 'unknown'}`,
        totalSteps,
        stepResults,
        executed,
      ),
    );
  }
  const step = artifact.steps[0]; // each per-step artifact carries one step
  if (!step) {
    return jsonResult(
      failure(
        'ARTIFACT_MISSING',
        i,
        `step ${i} artifact has no embedded step entry`,
        totalSteps,
        stepResults,
        executed,
      ),
    );
  }

  const started = Date.now();
  let resolution: { resolvedVia: ResolvedVia | null; attempts: number };
  try {
    resolution = await withDeadline(resolveStep(step, page), stepTimeoutMs);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    stepResults.push({
      index: i,
      resolved_via: null,
      selector_attempts: step.selectors.length,
      elapsed_ms: Date.now() - started,
      ok: false,
    });
    return jsonResult(
      failure('STEP_TIMEOUT', i, message, totalSteps, stepResults, executed),
    );
  }

  if (step.kind !== 'navigate' && resolution.resolvedVia === null) {
    stepResults.push({
      index: i,
      resolved_via: null,
      selector_attempts: resolution.attempts,
      elapsed_ms: Date.now() - started,
      ok: false,
    });
    return jsonResult(
      failure(
        'ARTIFACT_RESOLUTION_FAILED',
        i,
        `no selector strategy resolved (tried ${resolution.attempts}/${step.selectors.length})`,
        totalSteps,
        stepResults,
        executed,
      ),
    );
  }

  stepResults.push({
    index: i,
    resolved_via: resolution.resolvedVia,
    selector_attempts: resolution.attempts,
    elapsed_ms: Date.now() - started,
    ok: true,
  });
  executed++;

  // post_assert hook — for v1.11 we surface the contract id in the failure
  // detail but defer evaluation to the host (oc_assert is snapshot-driven
  // and needs caller-supplied evidence). When the host wants enforcement
  // it can chain oc_skill_replay → oc_assert deterministically.
  if (step.post_assert && stopOnContractFailure) {
    // Intentionally not failing here — the host owns the assert call.
    // We attach the contract id in step_results downstream via index.
  }
}
```
The oc_skill_replay tool implementation appears to be incomplete. While the documentation and tool name suggest it "re-issues each action against the live page", the current loop only calls resolveStep, which in turn only calls resolveSelector. There is no code that actually performs the actions (e.g., clicking, filling inputs, or navigating). As it stands, this tool only validates that the persisted selectors still resolve in the current DOM, but it does not execute the recorded skill.
```ts
const els = Array.from(document.querySelectorAll<HTMLElement>('*'));
return els.some((el) => {
  const elRole = el.getAttribute('role') || el.tagName.toLowerCase();
  if (elRole !== r) return false;
  const elName =
    el.getAttribute('aria-label') ||
    el.textContent?.trim() ||
    '';
  return n === '' || elName === n;
});
```
Iterating over all elements (*) and accessing textContent for each is a significant performance bottleneck. textContent triggers a full subtree traversal, making this an O(N^2) operation relative to the number of DOM nodes in the worst case. This will cause severe lag or timeouts on complex pages. Consider using more specific CSS selectors based on the role to narrow down the search space before inspecting text content or attributes.
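The reviewer's suggestion can be sketched as a small helper that maps a role to a narrower CSS selector before any accessible-name comparison; the role-to-tag table below is illustrative only and is not the project's actual mapping:

```typescript
// Map common ARIA roles to a CSS selector that covers both the implicit-role
// elements and explicit [role=...] overrides. Illustrative, not exhaustive.
const ROLE_TO_CSS: Record<string, string> = {
  button: 'button, input[type="button"], input[type="submit"], [role="button"]',
  link: 'a[href], [role="link"]',
  textbox: 'input[type="text"], textarea, [role="textbox"]',
  checkbox: 'input[type="checkbox"], [role="checkbox"]',
};

function buildRoleSelector(role: string): string {
  // Unknown roles fall back to matching the tag name or an explicit role attribute.
  return ROLE_TO_CSS[role] ?? `${role}, [role="${role}"]`;
}
```

Querying with `document.querySelectorAll(buildRoleSelector(r))` keeps the candidate set to role-plausible elements, so `textContent` is only read for a handful of nodes instead of the whole tree.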
```ts
  capturedAt: number;
}

const buffers = new Map<string, BufferedStep[]>();
```
The buffers Map is a module-level variable that stores step data per targetId. However, there is no mechanism in this module to prune entries for targets that are closed without being flushed (e.g., if a recording is cancelled or a tab is closed). This will lead to a memory leak over time in long-running processes. Ensure that target lifecycle events trigger a cleanup in this buffer.
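A minimal sketch of lifecycle-aware pruning, assuming a simplified `BufferedStep` shape; the hook name (`pruneTarget`) and the idea of wiring it to the CDP target-destroyed event are suggestions, not the project's existing API:

```typescript
// Per-target recorder buffer with a hard cap and explicit lifecycle pruning.
interface BufferedStep { capturedAt: number }

const buffers = new Map<string, BufferedStep[]>();

function record(targetId: string, step: BufferedStep, cap = 100): void {
  const buf = buffers.get(targetId) ?? [];
  buf.push(step);
  if (buf.length > cap) buf.shift(); // FIFO eviction at the cap
  buffers.set(targetId, buf);
}

// Call from the Target.targetDestroyed / tab-close handler so cancelled
// recordings do not accumulate in long-running processes.
function pruneTarget(targetId: string): void {
  buffers.delete(targetId);
}
```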
```ts
 */
function wrapStepAsArtifact(step: ReplayArtifactStep): ReplayArtifact {
  return {
    schema_version: 1,
```
```ts
      return { ok: false, error: 'fill step requires args.value string' };
    }
  }
  return { ok: true };
```
The validation for args in validateReplayArtifactStep is currently limited to navigate and fill kinds. Other supported kinds like press, select, and scroll likely require specific arguments (e.g., key for press, value for select) to be functional during replay. Consider extending the validation to cover these requirements to prevent persisting invalid artifacts.
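A sketch of the extended per-kind check, keeping the existing `{ ok, error }` result shape; which fields `press`/`select`/`scroll` actually require is an assumption based on what replay would need:

```typescript
// Per-kind argument validation beyond navigate/fill. Required fields here are
// assumed replay prerequisites, not confirmed against the real schema.
type StepArgs = Record<string, unknown> | undefined;

function validateStepArgs(kind: string, args: StepArgs): { ok: boolean; error?: string } {
  switch (kind) {
    case 'press':
      if (typeof args?.key !== 'string' || args.key.length === 0)
        return { ok: false, error: 'press step requires args.key string' };
      break;
    case 'select':
      if (typeof args?.value !== 'string')
        return { ok: false, error: 'select step requires args.value string' };
      break;
    case 'scroll':
      if (typeof args?.x !== 'number' && typeof args?.y !== 'number')
        return { ok: false, error: 'scroll step requires numeric args.x or args.y' };
      break;
  }
  return { ok: true };
}
```

Running this at record time keeps invalid artifacts out of the v2 store instead of failing later at replay.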
```ts
async function withDeadline<T>(p: Promise<T>, ms: number): Promise<T> {
  let timer: NodeJS.Timeout | null = null;
  try {
    return await Promise.race([
```
Promise.race does not cancel the underlying promise p when the timeout occurs. If p (e.g., a page.evaluate call) hangs, it will continue to consume resources in the background. While Puppeteer has its own internal timeouts, relying on Promise.race for a "hard" deadline can lead to unexpected side effects if the task is not truly cancellable.
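One mitigation, assuming the error message format is free to choose, is to at least guarantee the timer is always cleared and document the non-cancellation caveat explicitly; true cancellation would need an AbortSignal threaded into the underlying operation:

```typescript
// Deadline wrapper: rejects after ms, always clears its timer. Caveat per the
// review: the losing promise keeps running in the background; this is a soft
// deadline, not cancellation.
async function withDeadline<T>(p: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | null = null;
  try {
    return await Promise.race([
      p,
      new Promise<never>((_, reject) => {
        timer = setTimeout(() => reject(new Error(`deadline exceeded after ${ms}ms`)), ms);
      }),
    ]);
  } finally {
    if (timer) clearTimeout(timer); // avoid keeping the event loop alive after the race settles
  }
}
```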
…/ Buffer type

Merge of develop (108 commits) into the replay-layer branch. Two import-only conflicts resolved to the union of both sides:

- `src/tools/index.ts`: keep both the new `oc_skill_replay` registration and develop's doctor / performance-insights / proxy-hook / observe / devtools-url / context tool registrations.
- `src/tools/oc-skill-record.ts`: keep both this branch's replay-artifact flush + flag check AND develop's secrets-redactor wiring.

Test-side adjustments to match develop's new APIs:

- `tests/tools/tabs.test.ts`: 4-arg -> 5-arg `createTarget` signature (develop's #848 added `isolatedContext`).
- `tests/utils/mock-session.ts`: add `getTargetContextName` mock (#848).
- `src/tools/interact.ts`: trim description from 532 -> 371 chars to fit develop's #841 budget (<=400).
- `tests/cli/doctor/checks/chrome-binary.test.ts`: cast string literals to Buffer via `Buffer.from(...)` (TS 5.x non-shared-buffer typing).
- `tests/cli/doctor/checks/network-local.test.ts`: drop `durationMs` assertion — `CheckFn` returns `Omit<CheckResult,'durationMs'>`; the runner injects `durationMs` after the check resolves.
- `tests/core/skill-memory/replay-artifact.test.ts`: replace unused `@ts-expect-error` directive with an explicit `as unknown as` cast.

Three doctor-check failures (chrome-binary: ok/fail/version-command paths) remain on this tip but they ALSO fail on a fresh origin/develop checkout — pre-existing regression unrelated to this branch (verified against develop HEAD 1370b66).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 72d267ab8b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```ts
if (sm && typeof sm.getPage === 'function' && typeof tabId === 'string' && tabId.length > 0) {
  page = await sm.getPage(sessionId, tabId, undefined, 'oc_skill_replay');
```
```ts
stepResults.push({
  index: i,
  resolved_via: resolution.resolvedVia,
  selector_attempts: resolution.attempts,
  elapsed_ms: Date.now() - started,
  ok: true,
});
executed++;
```
Execute replay actions before marking step success
After selector resolution, the loop records the step as ok: true and increments executed, but there is no action dispatch for click/fill/press/navigate at all. This makes oc_skill_replay effectively a selector-check pass that can return full success without mutating page state, which breaks the core promise of deterministic replay.
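What the missing dispatch pass could look like, sketched against a minimal page interface; the `PageLike` methods are stand-ins (assumptions), not Puppeteer's API or the project's session primitives:

```typescript
// Dispatch a resolved step against the live page instead of only resolving it.
interface PageLike {
  click(selector: string): Promise<void>;
  fill(selector: string, value: string): Promise<void>;
  goto(url: string): Promise<void>;
}

interface StepLike {
  kind: 'click' | 'fill' | 'navigate';
  args?: { value?: string; url?: string };
}

async function dispatchStep(
  step: StepLike,
  selector: string,
  page: PageLike,
): Promise<{ ok: boolean; detail?: string }> {
  switch (step.kind) {
    case 'click':
      await page.click(selector);
      return { ok: true };
    case 'fill':
      if (typeof step.args?.value !== 'string') return { ok: false, detail: 'fill requires args.value' };
      await page.fill(selector, step.args.value);
      return { ok: true };
    case 'navigate':
      if (typeof step.args?.url !== 'string') return { ok: false, detail: 'navigate requires args.url' };
      await page.goto(step.args.url);
      return { ok: true };
  }
}
```

With a call like this between resolution and the `ok: true` push, `executed` counts actions actually performed rather than selectors merely re-resolved.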
Close the PR #925 P1 gap by dispatching resolved replay steps against the live page, resolving omitted tabId through the active session target, and pruning recorder buffers on target close.

Constraint: oc_skill_replay must remain deterministic, no-LLM, and structured-failure only while still performing recorded actions.
Rejected: treating replay as selector validation only | Codex and Gemini both identified that as breaking the tool contract.
Confidence: medium
Scope-risk: moderate
Directive: Keep recorder hooks in follow-up #988, but do not weaken oc_skill_replay action semantics.
Tested: `npx jest tests/tools/oc-skill-replay.test.ts tests/core/skill-memory/replay-artifact.test.ts tests/cli/doctor/checks/chrome-binary.test.ts tests/cli/doctor/checks/disk-space.test.ts tests/cli/doctor/checks/home-writable.test.ts --runInBand`; `npm run build`; `npm run lint -- --quiet`; `npm run lint:tier`; `git diff --check`
Not-tested: full `npm test` locally; CI is expected to cover the complete matrix.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dfe4bf64a2
```js
{ tool: 'oc_skill_record', args: { domain: DOMAIN, name: SKILL_NAME, contract_id: 'demo' } },
{ tool: 'page_reload', args: {} },
{ tool: 'oc_skill_recall', args: { domain: DOMAIN, name: SKILL_NAME } },
{ tool: 'oc_skill_replay', args: { domain: DOMAIN, name: SKILL_NAME } },
```
Align verifier steps with current tool contracts
This verifier script prints a "verbatim" replay flow, but the emitted calls no longer match the registered tool schemas: oc_skill_record is missing required steps, and oc_skill_replay is called with name instead of required skill_id (plus earlier interact calls use unsupported fill/target/capture_artifact arguments). Running these steps as written now produces INVALID_ARGS, so the acceptance scenarios in this script cannot be executed as intended.
```ts
} else if (sel.type === 'node_ref') {
```
```ts
  return false;
```
```ts
let attempts = 0;
for (const sel of step.selectors) {
  attempts++;
  const r = await resolveSelector(sel, page);
```
Resolve selectors in recorded frame, not always top frame
Replay artifacts include frameOrdinal on each step, but replay resolution ignores it and always evaluates selectors on the top-level page document. For skills recorded inside iframes, this resolves against the wrong DOM and causes false ARTIFACT_RESOLUTION_FAILED results even when the original target still exists in its frame.
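A sketch of selecting the evaluation context from the recorded `frameOrdinal`; the `frames()` accessor mirrors Puppeteer's `Page.frames()`, but treating the ordinal as an index into that list is an assumption about the recorder's semantics:

```typescript
// Resolve the frame a step was recorded in, instead of always using the top frame.
interface FrameLike { evaluate<T>(fn: () => T): Promise<T> }
interface PageWithFrames { frames(): FrameLike[] }

function frameForStep(page: PageWithFrames, frameOrdinal?: number): FrameLike | null {
  const frames = page.frames();
  if (frameOrdinal === undefined || frameOrdinal === 0) return frames[0] ?? null; // top frame
  // null means the recorded frame is gone; the caller should emit a structured
  // failure rather than silently resolving against the wrong document.
  return frames[frameOrdinal] ?? null;
}
```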
Merge the current develop surface into the replay-layer PR and resolve conflicts in shared interaction, doctor, tab, and mock-test files. The replay implementation remains focused on deterministic artifact execution, selector validation, and recorder cleanup.

Constraint: GitHub reported PR #925 as CONFLICTING against develop.
Rejected: Refactoring replay dispatch while merging develop | latest Codex review on the pushed replay fixes had no actionable P1/P2 feedback.
Confidence: high
Scope-risk: narrow
Directive: Keep future replay edits constrained to artifact execution correctness and validation gaps.
Tested: `npm run build`; `npx jest tests/tools/oc-skill-replay.test.ts tests/core/skill-memory/replay-artifact.test.ts tests/cli/doctor/checks/chrome-binary.test.ts tests/cli/doctor/checks/home-writable.test.ts tests/cli/doctor/checks/network-local.test.ts tests/tools/tabs.test.ts tests/cross-env/cursor-verification.test.ts --runInBand`; `npm run lint -- --quiet`; `npm run lint:tier`; `git diff --check`
Not-tested: full CI matrix pending on GitHub.

Co-authored-by: OmX <omx@oh-my-codex.dev>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 042f4ff42c
```ts
  }
  replayArtifacts = padded;
} else if (targetId) {
  const flushed = flushRecorderBuffer(targetId);
```
```ts
} else if (sel.type === 'node_ref') {
  return false;
}
```
Keep the doctor disk-space boundary test aligned with the current source contract after the replay branch merged develop. The replay implementation is unchanged; this fixes a stale CI expectation for exactly 100 MB free.

Constraint: GitHub CI for PR #925 failed only on tests/cli/doctor/checks/disk-space.test.ts.
Rejected: Changing the doctor threshold semantics inside the replay PR | the source already treats <=100 MB as fail.
Confidence: high
Scope-risk: narrow
Directive: Keep replay PR follow-ups scoped to replay behavior unless shared develop tests block mergeability.
Tested: `npx jest tests/cli/doctor/checks/disk-space.test.ts --runInBand`; `npm run build`; `npm run lint -- --quiet`; `npm run lint:tier`; `git diff --check`
Not-tested: full CI matrix pending on GitHub.

Co-authored-by: OmX <omx@oh-my-codex.dev>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fabb36524d
```ts
      ),
    );
  }
  const step = artifact.steps[0]; // each per-step artifact carries one step
```
Enforce single-step artifacts before replaying
Replay currently executes only artifact.steps[0], but artifact validation accepts any non-empty steps array. When a stored artifact contains multiple embedded steps, replay silently ignores every step after the first and can still return success, so the executed flow diverges from recorded data. Either reject multi-step artifacts at validation/record time or iterate all embedded steps here.
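The first of the two suggested fixes, rejecting multi-step artifacts at validation time, can be sketched as a tightened check on the envelope; the shape names follow the PR description, and the exactly-one rule is the reviewer's proposal:

```typescript
// Per-step artifacts must embed exactly one step, matching what the replay
// loop actually executes (artifact.steps[0]).
interface ArtifactLike { schema_version: number; steps: unknown[] }

function validateSingleStepArtifact(a: ArtifactLike): { ok: boolean; error?: string } {
  if (a.steps.length !== 1) {
    return { ok: false, error: `expected exactly 1 embedded step, got ${a.steps.length}` };
  }
  return { ok: true };
}
```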
```ts
if (step.post_assert && stopOnContractFailure) {
  // Intentionally not failing here — the host owns the assert call.
  // We attach the contract id in step_results downstream via index.
}
```
Honor post_assert when stop_on_contract_failure is enabled
The post_assert branch is currently a no-op, so even when stop_on_contract_failure is true and a step carries post_assert, replay never checks the contract and still reports step success. In practice, this allows oc_skill_replay to return ok: true for flows whose required assertions fail, and the documented option has no effect. This should call the assertion path and emit CONTRACT_FAILED (or remove the option until implemented).
Normalize CRLF line endings when comparing the committed console-capture baseline fixture so Windows CI does not fail on checkout conversion. The replay-layer implementation remains unchanged.

Constraint: PR #925 shares the same current develop console fixture regression test surface.
Rejected: Regenerating the fixture | the mismatch is platform line endings, not response shape drift.
Confidence: high
Scope-risk: narrow
Directive: Normalize text fixtures before comparing them to generated JSON strings across OSes.
Tested: `npx jest tests/tools/console-capture-regression.test.ts --runInBand`; `npm run build`; `npm run lint -- --quiet`; `npm run lint:tier`; `git diff --check`
Not-tested: full CI matrix pending on GitHub.

Co-authored-by: OmX <omx@oh-my-codex.dev>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 09343615bc
```ts
if (step.kind === 'press' && p.keyboard) {
  const key = step.args?.key;
  if (typeof key !== 'string' || key.length === 0) return { ok: false, detail: 'press step requires args.key' };
  await withDeadline(Promise.resolve(p.keyboard.press(key)), timeoutMs);
  return { ok: true };
```
Focus resolved element before dispatching press steps
press steps require selectors, but the replay path resolves the selector and then ignores it by sending page.keyboard.press directly. If focus has moved since recording (e.g., another control is active), the key is delivered to the wrong element even though selector resolution succeeded, so replay can report success while mutating the wrong state.
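A minimal sketch of the fix, focusing the resolved element before the key dispatch; the `focus`/`keyboard.press` signatures here are stand-ins modeled on Puppeteer's page API, not the project's actual types:

```typescript
// Re-establish focus on the resolved selector before dispatching the key,
// so the press lands on the recorded target rather than whichever element
// happens to hold focus at replay time.
interface KeyboardLike { press(key: string): Promise<void> }
interface PressPage {
  focus(selector: string): Promise<void>;
  keyboard: KeyboardLike;
}

async function pressOnTarget(page: PressPage, selector: string, key: string): Promise<void> {
  await page.focus(selector);
  await page.keyboard.press(key);
}
```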
```ts
if (Array.isArray(explicitArtifacts)) {
  // Pad / truncate to match steps.length, then validate each non-null
```
Flush target buffer when explicit artifacts are supplied
When replay_artifacts is provided, this branch bypasses flushRecorderBuffer(targetId), so previously captured steps for that target remain queued. A later oc_skill_record call that relies on buffered capture can then ingest stale steps from an earlier skill, causing cross-skill artifact contamination and nondeterministic replay data.
Ensure DomainMemory does not drop mutations that occur while an async persistence write is already in flight. The replay PR exposed this on Windows Node 22 when confidence updates raced with the initial save; queued saves now flush the latest in-memory state after the active write completes.

Constraint: PR #925 Windows Node 22 CI failed domain-memory persistence after validate() calls were not reliably flushed.
Rejected: Extending test sleeps | the production coalescing logic could still drop in-flight mutations under real use.
Confidence: high
Scope-risk: narrow
Directive: Preserve fire-and-forget persistence semantics while ensuring every dirty mutation eventually schedules a write.
Tested: `npx jest tests/e2e/scenarios/domain-memory-persistence.test.ts --runInBand`; `npm run build`; `npm run lint -- --quiet`; `npm run lint:tier`; `git diff --check`
Not-tested: full CI matrix pending on GitHub.

Co-authored-by: OmX <omx@oh-my-codex.dev>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 70ac9271b1
```ts
} else if (sel.type === 'node_ref') {
  return false;
```
```js
{ tool: 'oc_skill_record', args: { domain: DOMAIN, name: SKILL_NAME, contract_id: 'demo' } },
{ tool: 'page_reload', args: {} },
{ tool: 'oc_skill_recall', args: { domain: DOMAIN, name: SKILL_NAME } },
{ tool: 'oc_skill_replay', args: { domain: DOMAIN, name: SKILL_NAME } },
```
Use required oc_skill_record/oc_skill_replay arguments
The verification script emits invocations that do not satisfy the tool contracts: oc_skill_record is printed without the required steps field, and oc_skill_replay is printed with name instead of required skill_id. Following this script as written causes INVALID_ARGS responses, so the listed scenarios cannot be executed as intended.
Merge latest develop into PR #925 while preserving the branch-specific cursor and console fixture expectations.

Constraint: PR #962 changed shared transport files and CI fixtures after this branch was opened.
Rejected: Replacing branch-specific fixture semantics with develop's exact snapshot | these PRs intentionally alter tool visibility or fixture normalization expectations.
Confidence: high
Scope-risk: narrow
Directive: Treat fixture-only conflict refreshes as semantic preservation work; do not rewrite feature behavior while unblocking mergeability.
Tested: `npx jest tests/tools/console-capture-regression.test.ts tests/cross-env/cursor-verification.test.ts --runInBand --forceExit` (cross-env suite may skip locally by platform guard); `git diff --check` for touched fixture files.
Not-tested: Full GitHub Actions matrix after this refresh.

Co-authored-by: OmX <omx@oh-my-codex.dev>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 872843e252
```ts
if (p?.goto) await withDeadline(Promise.resolve(p.goto(url, { waitUntil: 'domcontentloaded', timeout: timeoutMs })), timeoutMs);
return { ok: true };
```
```ts
this.dirty = false;
this.savePromise = this.saveAsync().finally(() => {
  this.savePromise = null;
```
Keep dirty flag set until save write completes
The new coalescing path clears dirty before the filesystem write starts, but saveAsync now swallows write errors and never restores dirty. If writeFile fails (for example transient ENOSPC/permission issues), pending mutations are considered clean and no retry is scheduled unless another mutation happens later, so the latest in-memory state can be silently dropped on process exit.
Replay now performs real page mutations only when a live target and executable artifact shape are available, while preserving develop's pilot replay additions and current tool-schema changes.

Constraint: PR #925 must not report deterministic replay success for skipped navigation, unresolved selectors, unevaluated post-asserts, or stale recorder buffers.
Rejected: silently accepting node_ref selectors | the current core replay path cannot turn registry ids into live element handles without a CDP resolver.
Confidence: medium
Scope-risk: broad
Directive: Add live node-ref element resolution before re-allowing node_ref artifacts in core replay.
Tested: `npm test -- --runTestsByPath tests/tools/oc-skill-replay.test.ts tests/core/skill-memory/no-network.test.ts tests/pilot/skill/replay.test.ts --runInBand`; `npx tsc --noEmit`
Not-tested: full cross-platform CI matrix and live-browser replay verification pending after push.

Co-authored-by: OmX <omx@oh-my-codex.dev>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e6f7f0a9ad
```ts
if (step.post_assert && stopOnContractFailure) {
  return jsonResult(
    failure(
      'CONTRACT_FAILED',
      i,
      `post_assert ${step.post_assert.contract_id} requires an assertion evaluator before replay can report success`,
      totalSteps,
```
Evaluate post_assert instead of always failing it
With the default stop_on_contract_failure=true, any step that merely includes post_assert is forced into CONTRACT_FAILED without attempting assertion evaluation, so replay cannot succeed for artifacts that carry post-conditions even when the page state is correct. Fresh evidence in this revision is the unconditional failure return inside the post_assert block, which now deterministically fails every such step.
```ts
const action = await dispatchStep(step, resolution.selector, page, stepTimeoutMs);
if (!action.ok) {
  stepResults.push({
    index: i,
    resolved_via: resolution.resolvedVia,
    selector_attempts: resolution.attempts,
    elapsed_ms: Date.now() - started,
    ok: false,
  });
  return jsonResult(
    failure(
      step.kind === 'navigate' ? 'TARGET_NAVIGATED_AWAY' : 'ARTIFACT_RESOLUTION_FAILED',
      i,
```
Return STEP_TIMEOUT when dispatch exceeds step timeout
The per-step timeout is documented as covering both resolution and dispatch, but dispatch-time deadline errors are caught and converted into generic action failures that are reported as ARTIFACT_RESOLUTION_FAILED (or TARGET_NAVIGATED_AWAY for navigate). This misclassifies timeout failures, so callers cannot distinguish a transient timeout from a selector mismatch and may take the wrong fallback path.
The CI schema linter pipes introspection stdout into the validator, so the CLI must return naturally after writing JSON and the newly merged crawl job schemas must satisfy the required-field description rule.

Constraint: build-and-test runs `npm run lint:tool-schemas` in a shell pipeline on every PR matrix job.
Rejected: keeping process.exit(0) after stdout.write | it can terminate before piped stdout flushes and leaves the validator with empty input.
Confidence: high
Scope-risk: narrow
Directive: Introspection paths must avoid hard process exits after asynchronous stdout writes.
Tested: `npm run build`; `npm run lint:tool-schemas`; `npm run lint`
Not-tested: full GitHub Actions matrix pending after push.

Co-authored-by: OmX <omx@oh-my-codex.dev>
Skill replay now satisfies the repository schema linter while preserving the fail-closed selector replay behavior and quote escaping needed for accessible-name selector fallback.

Constraint: PR #925 must pass build-and-test after the replay layer adds new MCP schemas.
Rejected: baselining new schema violations | existing linter rules are intended to keep required fields discoverable.
Confidence: high
Scope-risk: narrow
Directive: New replay tool fields must keep required descriptions explicit and concise enough for lint-tool-schemas.
Tested: `npm run build`; `npm run lint`; `npm run lint:tool-schemas`; `npm test -- --runTestsByPath tests/tools/oc-skill-replay.test.ts tests/core/skill-memory/no-network.test.ts tests/pilot/skill/replay.test.ts --runInBand`
Not-tested: full GitHub Actions matrix pending after push.

Co-authored-by: OmX <omx@oh-my-codex.dev>
The in-process CLI harness can capture log lines from unrelated async work before the admin JSON payload, so the JSON-array extractor now scans for the first slice that actually parses as an array instead of trusting the first '[' byte.

Constraint: full CI runs many suites in shared workers, and Windows/Linux logs showed [WorkflowEngine] noise reaching this harness stdout.
Rejected: weakening admin CLI assertions | the test still parses and validates the JSON payload and plaintext absence.
Confidence: high
Scope-risk: narrow
Directive: CLI JSON tests should parse the intended payload, not incidental bracketed log prefixes.
Tested: npm test -- --runTestsByPath tests/cli/admin-keys.test.ts --runInBand
Not-tested: full GitHub Actions matrix pending after push.

Co-authored-by: OmX <omx@oh-my-codex.dev>
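The "first slice that actually parses" approach the commit describes can be sketched as follows; this is an illustrative standalone helper, not the harness's exact code:

```typescript
// Scan each '[' in stdout, find its balanced closing bracket (string-aware),
// and return the first slice that JSON.parse accepts as an array.
function extractFirstJsonArray(stdout: string): unknown[] | undefined {
  for (let i = stdout.indexOf('['); i !== -1; i = stdout.indexOf('[', i + 1)) {
    let depth = 0;
    let inString = false;
    for (let j = i; j < stdout.length; j++) {
      const ch = stdout[j];
      if (inString) {
        if (ch === '\\') j++;                // skip the escaped character
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') inString = true;
      else if (ch === '[') depth++;
      else if (ch === ']' && --depth === 0) {
        try {
          return JSON.parse(stdout.slice(i, j + 1)) as unknown[];
        } catch {
          break;                             // not valid JSON; try the next '['
        }
      }
    }
  }
  return undefined;
}
```

A `[WorkflowEngine]` log prefix fails `JSON.parse` and is skipped, while the real payload later in the stream is returned.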
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fc8c9af837
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```ts
if (replayEnabled && targetId) {
  flushRecorderBuffer(targetId);
}
```
Flush recorder buffer only when it was consumed
The buffer is flushed whenever target_id is provided, even if replay_artifacts was explicitly supplied and the buffer path was not used. In that case, buffered captures are discarded without being persisted, so a caller that always sends target_id can silently lose recorded steps between oc_skill_record calls. Gate the flush to the peekRecorderBuffer path (or track whether buffered artifacts were actually consumed) to avoid data loss.
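The gating the review suggests could look like the sketch below, where the flush happens only on the path that actually consumed the buffer. The `peekRecorderBuffer`/`flushRecorderBuffer` names mirror the PR's recorder-buffer module; the wiring itself is illustrative:

```typescript
// Sketch: flush the per-target buffer only after its contents were consumed.
const buffers = new Map<string, object[]>();

function peekRecorderBuffer(targetId: string): object[] {
  return buffers.get(targetId) ?? [];
}
function flushRecorderBuffer(targetId: string): void {
  buffers.delete(targetId);
}

function resolveArtifacts(explicit: object[] | undefined, targetId?: string): object[] {
  if (explicit) return explicit;          // buffer untouched: do NOT flush
  if (!targetId) return [];
  const buffered = peekRecorderBuffer(targetId);
  flushRecorderBuffer(targetId);          // destructive flush only after consuming
  return buffered;
}
```

A caller that always sends `target_id` but supplies `replay_artifacts` explicitly then keeps its buffered captures for the next `oc_skill_record` call.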
```ts
if (s.type === 'node_ref') {
  return { ok: false, error: 'node_ref selectors are not replayable until the node-ref resolver can return a live element handle' };
}
```
Accept node_ref selectors in artifact validation
Rejecting node_ref at validation time causes oc_skill_record to fail for selector chains that include node_ref plus fallback selectors (for example css/xpath), even though replay can still succeed by trying later strategies. This blocks storing otherwise replayable artifacts and makes mixed-strategy chains unusable. Validation should allow node_ref entries and let runtime resolution fall through when that strategy cannot be resolved.
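The relaxation the review proposes could be sketched as a chain-level check rather than a per-entry rejection: accept `node_ref` entries as long as at least one replayable fallback exists, and let runtime resolution skip them. Types and wording are illustrative, not the validator's actual shape:

```typescript
// Sketch: validate the chain as a whole so node_ref + css/xpath fallbacks
// remain storable, while a pure node_ref chain is still rejected fail-closed.
type Selector =
  | { type: 'node_ref'; ref: string }
  | { type: 'css'; value: string }
  | { type: 'xpath'; value: string };

function validateChain(chain: Selector[]): { ok: boolean; error?: string } {
  const replayable = chain.some((s) => s.type !== 'node_ref');
  return replayable
    ? { ok: true }
    : { ok: false, error: 'chain has no replayable strategy (node_ref needs a live resolver)' };
}
```

This keeps mixed-strategy chains usable without reintroducing the unsafe best-effort `node_ref` dispatch the branch deliberately avoids.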
The replay branch exposes oc_skill_replay in the default Cursor smoke surface and intentionally rejects node_ref selectors until a live node-ref resolver exists, so the branch tests now assert those intended contracts.

Constraint: PR #925 chooses fail-closed node_ref validation rather than unsafe best-effort dispatch.
Rejected: treating node_ref as accepted without live resolution | that was the Codex P1 risk this branch fixes.
Confidence: high
Scope-risk: narrow
Directive: Re-enable node_ref acceptance only with a tested live element-handle resolver.
Tested: npm test -- --runTestsByPath tests/core/skill-memory/replay-artifact.test.ts --runInBand
Not-tested: cross-env Cursor smoke locally skipped on macOS Node <22; full CI pending after push.

Co-authored-by: OmX <omx@oh-my-codex.dev>
The curator lock loop now retries EPERM from stat just like ENOENT, because Windows can report EPERM while another writer process is removing the lock directory during the concurrent-write test.

Constraint: PR #923 CI failed only on the Windows concurrent curator lock test.
Rejected: increasing test timeouts | the failure is a transient lock-directory race, not slow work.
Confidence: high
Scope-risk: narrow
Directive: Keep the lock fail-closed for unexpected errors; only transient disappearance/removal races should retry.
Tested: npm test -- --runTestsByPath tests/pilot/curator/extractor.test.ts --runInBand
Not-tested: Windows filesystem behavior locally; GitHub Actions matrix pending after push.

Co-authored-by: OmX <omx@oh-my-codex.dev>
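The retry predicate the commit describes can be sketched as a small allow-list; the helper name is illustrative:

```typescript
// Sketch: widen the transient-error set for the lock stat retry while staying
// fail-closed for everything else (EACCES, EIO, ... still abort).
const TRANSIENT_LOCK_CODES = new Set(['ENOENT', 'EPERM']);

function shouldRetryLockStat(err: { code?: string }): boolean {
  // ENOENT: lock dir vanished between readdir and stat.
  // EPERM: Windows can report this while another process removes the dir.
  return TRANSIENT_LOCK_CODES.has(err.code ?? '');
}
```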
💡 Codex Review
Reviewed commit: e9b0624812
```ts
}

// Skill memory tools (#875) — deterministic replay
registerOcSkillReplayTool(server);
```
Remove duplicate oc_skill_replay registration
registerAllTools registers oc_skill_replay inside the isSkillReplayEnabled() branch and then immediately registers it again unconditionally. Because MCPServer.registerTool overwrites by name but still increments manifestVersion, this second call adds version churn with no functional benefit and defeats the intended single registration path behind the conditional block.
The replay branch now includes the latest develop snapshot-ref baseline while preserving fail-closed replay dispatch, schema lint compliance, and node_ref rejection until live resolution exists.

Constraint: develop advanced during PR #925 review and the branch must stay mergeable without relaxing selector safety.
Rejected: accepting node_ref without a live resolver | that reintroduces the unsafe replay gap the branch fixes.
Confidence: high
Scope-risk: moderate
Directive: Re-enable node_ref only with a tested live element-handle resolver; keep replay schema descriptions lint-clean.
Tested: npm run build; npm run lint:tool-schemas; npm test -- --runTestsByPath tests/tools/oc-skill-replay.test.ts tests/core/skill-memory/no-network.test.ts tests/pilot/skill/replay.test.ts tests/core/skill-memory/replay-artifact.test.ts tests/pilot/curator/extractor.test.ts tests/transports/http-bearer-auth.test.ts --runInBand
Not-tested: full GitHub Actions matrix pending after push.

Co-authored-by: OmX <omx@oh-my-codex.dev>
The replay branch now includes bounded tool-output observability metrics while preserving fail-closed replay dispatch, schema lint compliance, and node_ref rejection until live resolution exists.

Constraint: develop advanced during PR #925 review and the branch must stay mergeable without relaxing selector safety.
Rejected: leaving the autogenerated merge message | project history requires Lore trailers and OmX co-author attribution.
Confidence: high
Scope-risk: moderate
Directive: Keep replay result safety independent from output-size metrics; re-enable node_ref only with a tested live element-handle resolver.
Tested: npm run build; npm run lint:tool-schemas; npm test -- --runTestsByPath tests/metrics/tool-output-observability.test.ts --runInBand
Not-tested: full GitHub Actions matrix pending after push.

Co-authored-by: OmX <omx@oh-my-codex.dev>
💡 Codex Review
Reviewed commit: db819fe1ff
```ts
successCount: 0,
lastUsedAt: 0,
frozenSnapshotPath: snapshotPath ?? null,
replayArtifacts,
```
Preserve replay artifacts when replay is disabled
When OPENCHROME_SKILL_REPLAY=0, replayArtifacts is still initialized to an all-null array and always passed into store.record, so any re-record of an existing skill overwrites previously captured artifacts with null. This makes the kill-switch destructive: operators who temporarily disable replay lose artifact data and cannot replay those skills after re-enabling the flag.
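A null-preserving merge on re-record would keep the kill-switch non-destructive: a null slot means "no new capture", so the previously stored artifact survives. `mergeArtifacts` is an illustrative helper, not the store's existing API:

```typescript
// Sketch: merge incoming artifacts over existing ones; null never overwrites
// a previously captured artifact.
type Artifact = { steps: unknown[] } | null;

function mergeArtifacts(existing: Artifact[] | undefined, incoming: Artifact[]): Artifact[] {
  return incoming.map((artifact, i) => artifact ?? existing?.[i] ?? null);
}
```

With this in `store.record`, disabling `OPENCHROME_SKILL_REPLAY` temporarily leaves old artifacts replayable once the flag is re-enabled.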
```ts
 * - non-array `steps` (we leave the field undefined; callers handle).
 */
function normaliseRecordForRead(rec: SkillRecord): SkillRecord {
  if (!Array.isArray(rec.steps)) return { ...rec };
```
Handle null records during read normalization
normaliseRecordForRead assumes each skills value is an object and immediately reads rec.steps; if a persisted file contains a null entry (e.g., partial/manual corruption), this throws and breaks get()/list() for the whole domain. Before this change, reads were tolerant of unknown record shapes, so this introduces a new crash path in the loader.
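A tolerant guard along these lines would restore the old behavior for corrupt entries; the types are simplified stand-ins for the module's `SkillRecord`:

```typescript
// Sketch: treat null/non-object persisted entries as "unknown shape" instead
// of crashing the whole domain load.
interface SkillRecord { steps?: unknown[]; [k: string]: unknown }

function normaliseRecordForRead(rec: unknown): SkillRecord {
  if (rec === null || typeof rec !== 'object') return {};   // corrupt entry: empty record
  const record = rec as SkillRecord;
  if (!Array.isArray(record.steps)) return { ...record };   // leave steps undefined
  return { ...record, steps: [...record.steps] };
}
```

One null entry in `skills.json` then degrades to an empty record rather than breaking `get()`/`list()` for every skill in the domain.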
💡 Codex Review
Reviewed commit: 3d5e08b123
```ts
if (this.saveAgain || this.dirty) {
  this.saveAgain = false;
  this.dirty = true;
  this.save();
}
```
Prevent infinite save retries on persistent write failures
The new coalescing logic now retries immediately whenever saveAsync fails (catch sets dirty = true, then finally calls save() again when dirty is true). If the storage path is persistently unwritable (for example EACCES or ENOSPC), every mutation triggers an unbounded retry loop that continuously reissues writeFile calls, causing sustained event-loop churn and repeated I/O with no recovery path. This should stop after a bounded number of retries or require a new mutation/backoff trigger before retrying.
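The bound the review asks for can be sketched in isolation (the real path is async and flag-driven; this synchronous helper only shows the retry budget, and the names are illustrative):

```typescript
// Sketch: at most `maxRetries` immediate re-attempts per mutation; after that
// the caller leaves the store dirty and waits for the next mutation (or a
// backoff timer) before trying again — no unbounded writeFile loop.
function saveWithRetry(write: () => void, maxRetries = 3): boolean {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      write();
      return true;            // persisted
    } catch {
      // swallow and retry up to the bound (EACCES/ENOSPC keep failing here)
    }
  }
  return false;               // give up for now; `dirty` stays set
}
```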
💡 Codex Review
Reviewed commit: 08c2a5f2f4
```ts
successCount: 0,
lastUsedAt: 0,
frozenSnapshotPath: snapshotPath ?? null,
replayArtifacts,
```
Preserve artifacts when replay_artifacts is omitted
oc_skill_record now always passes a replayArtifacts array into store.record, even when the caller did not supply replay_artifacts and no target_id buffer was consumed. Because that array is initialized as all-null, re-recording an existing skill in this common path silently wipes previously stored replay artifacts instead of preserving them, so later oc_skill_replay calls regress to ARTIFACT_MISSING.
```ts
// Mark the skill used so the recall layer reflects the replay activity.
try {
  await store.markUsed(skillId, Date.now(), true);
```
Persist replay outcome timestamps after each replay
The new replay handler only calls store.markUsed(..., true) on full success and never writes lastReplayPassedAt/lastReplayFailedAt via recordReplayResult; all failure exits return before any store update. oc_skill_recall ranks by those replay timestamps (computeReplaySignal), so replay health never affects ordering anymore and failed skills cannot be demoted as intended.
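The fix amounts to stamping the outcome on both exits. A sketch of the timestamp update the review names (`recordReplayResult` mirrors the store method it mentions; the record shape is simplified):

```typescript
// Sketch: persist pass/fail timestamps so computeReplaySignal in oc_skill_recall
// can promote healthy skills and demote failing ones.
interface ReplayTimestamps {
  lastReplayPassedAt: number | null;
  lastReplayFailedAt: number | null;
}

function recordReplayResult(ts: ReplayTimestamps, passed: boolean, now: number): ReplayTimestamps {
  return passed
    ? { ...ts, lastReplayPassedAt: now }
    : { ...ts, lastReplayFailedAt: now };
}
```

The replay handler would call this on every exit path, including the early failure returns, before handing the envelope back to the host.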
💡 Codex Review
Reviewed commit: 6ce739a3e4
```ts
const role = el.getAttribute('role') || el.tagName.toLowerCase();
if (role !== sel.role && !(sel.role === 'button' && ['button', 'input'].includes(el.tagName.toLowerCase())) && !(sel.role === 'link' && el.tagName.toLowerCase() === 'a')) return false;
const name = el.getAttribute('aria-label') || textOf(el);
```
Treat native textbox controls as role_name matches
resolveSelector advertises native textbox support in roleCandidates but the role check only has native fallbacks for button/link. For a role_name selector with role: "textbox", plain <input>/<textarea> elements are rejected because their computed role is input/textarea, so replay can fail with ARTIFACT_RESOLUTION_FAILED on unchanged forms whenever no non-role fallback selector exists.
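Extracting the fallbacks into a table makes the missing `textbox` entry easy to add. This is an illustrative restructuring, not the resolver's current code:

```typescript
// Sketch: per-role native-element fallbacks, including textbox, so plain
// <input>/<textarea> elements satisfy role: "textbox".
const NATIVE_FALLBACKS: Record<string, string[]> = {
  button: ['button', 'input'],
  link: ['a'],
  textbox: ['input', 'textarea'],
};

function roleMatches(selRole: string, explicitRole: string | null, tagName: string): boolean {
  if (explicitRole === selRole) return true;
  return (NATIVE_FALLBACKS[selRole] ?? []).includes(tagName.toLowerCase());
}
```

This keeps the existing button/link behavior byte-for-byte while closing the textbox gap the review describes.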
```ts
el = Array.from(document.querySelectorAll(`[role="${cssEscape(sel.role)}"],button,a[href],input,textarea`)).find((candidate) => {
  const text = (candidate.textContent || '').replace(/\s+/g, ' ').trim();
  const name = candidate.getAttribute('aria-label') || text;
  return sel.name === '' || name === sel.name;
```
Enforce sel.role when focusing role_name press targets
In the press dispatch path, the role_name branch queries [role=...] plus button/a/input/textarea and then filters only by accessible name, without checking that the candidate actually matches sel.role. If multiple controls share the same name, replay can focus a different element type than intended and send the key to the wrong target while still reporting success.
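The filter can reuse the resolution path's role check instead of matching on name alone. The `Candidate` shape below is a plain-object stand-in for a DOM element, for illustration:

```typescript
// Sketch: press targets must match sel.role (with the same native fallbacks as
// resolution) AND the accessible name, not just the name.
interface Candidate { role: string | null; tag: string; name: string }

function matchesRoleAndName(c: Candidate, selRole: string, selName: string): boolean {
  const roleOk =
    c.role === selRole ||
    (selRole === 'button' && (c.tag === 'button' || c.tag === 'input')) ||
    (selRole === 'link' && c.tag === 'a');
  return roleOk && (selName === '' || c.name === selName);
}
```

Two controls sharing the name "Save" (a link and a button) then resolve deterministically to the one matching `sel.role`.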
💡 Codex Review
Reviewed commit: 0b336fc7de
```ts
const explicitArtifacts = args.replay_artifacts as Array<ReplayArtifact | null> | undefined;
const targetId = typeof args.target_id === 'string' ? (args.target_id as string) : undefined;
```
Redact replay_artifacts before persisting skill records
oc_skill_record redacts steps and frozen_snapshot, but replay_artifacts is taken directly from user input and later stored unchanged, so sensitive literals in artifact args (for example fill values such as passwords/tokens) can be written to skills.json in plaintext. This is a new secret-leak path introduced with replay artifacts and bypasses the module’s documented redaction behavior for skill recording.
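A redaction pass over artifact args before `store.record` would close the gap. The key list and step shape below are illustrative; the real module would reuse its existing redactor:

```typescript
// Sketch: scrub sensitive string literals from artifact step args before they
// reach skills.json, mirroring the redaction already applied to `steps`.
const SENSITIVE_KEYS = ['value', 'text', 'password', 'token'];

function redactArtifactStep(step: { kind: string; args: Record<string, unknown> }) {
  const args: Record<string, unknown> = {};
  for (const [key, val] of Object.entries(step.args)) {
    args[key] = SENSITIVE_KEYS.includes(key) && typeof val === 'string' ? '[REDACTED]' : val;
  }
  return { ...step, args };
}
```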
```ts
return {
  content: [{ type: 'text', text: JSON.stringify({ error: message }) }],
  isError: true,
```
```ts
const action = await dispatchStep(step, resolution.selector, page, stepTimeoutMs);
```
Enforce a single timeout budget across resolve and dispatch
The tool contract says step_timeout_ms covers both selector resolution and action dispatch, but the loop gives each phase its own full deadline: resolveStep gets stepTimeoutMs, then dispatchStep is called with the same full value again. A step can therefore run nearly 2 * step_timeout_ms (or more with overhead), so callers cannot rely on the configured per-step cap.
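One shared deadline fixes the double budget: resolution consumes part of it and dispatch gets only the remainder. The phase callbacks are illustrative stand-ins for `resolveStep`/`dispatchStep`:

```typescript
// Sketch: both phases draw from one step budget; dispatch never sees the full
// step_timeout_ms again after resolution has spent some of it.
async function runStepWithBudget<T>(
  resolve: (timeoutMs: number) => Promise<T>,
  dispatch: (resolved: T, timeoutMs: number) => Promise<void>,
  stepTimeoutMs: number,
): Promise<void> {
  const deadline = Date.now() + stepTimeoutMs;
  const resolved = await resolve(stepTimeoutMs);
  const remaining = deadline - Date.now();
  if (remaining <= 0) throw new Error('STEP_TIMEOUT: budget exhausted during resolution');
  await dispatch(resolved, remaining);
}
```

Worst-case step wall time then stays at `step_timeout_ms` plus scheduling overhead, rather than nearly double.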
Progress / Review status
Auto-refreshed 2026-05-13 — owner comments cleaned up to reduce review noise.
`feat/875-replay-layer` → `develop` @ `db819fe` — Carry replay layer onto output metrics baseline.
Owner comment cleanup: 10 issue + 0 inline review comments deleted. Outstanding feedback from automated/external reviewers above is unchanged.
Why the split
The 2026-05-12 review identified that this PR would be cleaner as two stacked PRs:
- PR-A (this PR): the `oc_skill_replay` tool + store v1→v2 migration. The replay layer is complete and unit-tested; when called against a v2 record it works end-to-end. Self-contained.
- PR-B (follow-up): recorder hooks in `interact`/`fill-form`/`form-input`/`navigate` that populate the per-CDP-target buffer when callers pass `capture_artifact: true`. Closes the pipeline at the source.

Until PR-B lands, `oc_skill_replay` works against records that callers populate manually with a hand-crafted `replay_artifact` (via `oc_skill_record`). The new `ARTIFACT_RESOLUTION_FAILED` failure code returns control to the host without throwing when no artifact is present, so the missing recorder hooks do not break any existing skill-memory workflow — they only mean replay is opt-in by record-shape until PR-B closes the loop.

Summary (PR-A scope)
- `src/core/skill-memory/replay-artifact.ts` — `ReplayArtifactStep` + `ReplayArtifact` types + JSON-Schema validator.
- `src/core/skill-memory/recorder-buffer.ts` — per-CDP-target FIFO buffer (cap 100, destructive flush on `oc_skill_record`). Note: the buffer is defined but no action tool feeds it in this PR — that lands in PR-B (feat(tools): recorder hooks for skill-memory replay (4 action tools — follow-up to #925) #988).
- `oc_skill_record` accepts an optional `replay_artifact` per step; idempotent.
- `oc_skill_replay` — never throws; structured `{ok}` or `{ok:false, failure}` envelope. Reuses `src/utils/ralph/ralph-engine.ts` for selector resolution.
- Registered in `src/tools/index.ts`; `tools/list` parity holds in both flag positions (P2). `OPENCHROME_SKILL_REPLAY` default ON.
- Inline `isCoreFeatureEnabled` helper in `src/harness/flags.ts` (per the 2026-05-12 spec correction; feat(core): stable backend-node uid contract across snapshots (chrome-devtools-mcp adoption A.1) #844 not yet merged — TODO removal marker present).

Outstanding work before ready-for-review
- Rebase onto `develop` — branch is 108 commits behind. Expected to be mechanical (this PR adds new files plus light touches to a few existing ones; no overlap with the heavy churn on develop has been observed in a dry-run).
- `npm test` clean run. The current targeted suite (`replay-artifact|oc-skill-replay|skill-memory.*no-network`) passes 11/11 against the stale base.

The originally-listed "Recorder hooks" checkbox has been moved to follow-up #988 — it is no longer a gate for this PR.
Files
Test plan
- `npm run build` — exit 0
- `npm run lint` — 0 errors (4 warnings: 3 pre-existing in image-features, 1 removed in this PR)
- `npm run lint:tier` — no dependency violations
- `npm test -- --testPathPattern="replay-artifact|oc-skill-replay|skill-memory.*no-network"` — 11 passed, 0 failed
- `npm test` full suite — blocked on rebase

Manual verification (post-merge, via openchrome MCP)
`scripts/verify/skill-replay.mjs` walks the 6 scenarios from issue #875. With PR-A merged but PR-B (#988) still pending, scenario 1 (record-then-replay happy path) needs callers to pass `replay_artifact` explicitly; once PR-B lands the recorder buffer is filled automatically and scenario 1 becomes hands-off:
- (`read_page`/`find` between recall and replay)
- `ARTIFACT_RESOLUTION_FAILED` code (no exception)
- `capture_artifact` omitted (deferred to PR-B; tested there with the recorder hooks)
- `OPENCHROME_SKILL_REPLAY=0` returns DISABLED fact

References
- `isCoreFeatureEnabled` copy)

Portability-harness contract alignment
- `tools/list` includes `oc_skill_replay` regardless; `OPENCHROME_SKILL_REPLAY=0` returns DISABLED fact; no action-tool surface is modified in this PR, so byte-parity holds by construction. ✓
- `tests/core/skill-memory/no-network.test.ts`. ✓