feat(core): 2-stage fetch with output handles for large-output tools (closes #887) #938
Conversation
Code Review
This pull request implements a 2-stage fetch system for large tool outputs by introducing output handles and a persistent HandleStore with TTL-based eviction. Key tools have been updated to support configurable output modes, and a new retrieval tool, oc_output_fetch, has been added. Feedback focuses on improving performance and reliability by replacing synchronous and deprecated Node.js API calls with asynchronous, atomic, and modern alternatives, as well as optimizing JSON serialization logic.
```ts
const filePath = path.join(dir, `${handle}${ext}`);

const buf = Buffer.isBuffer(payload) ? payload : Buffer.from(payload, 'utf8');
fs.writeFileSync(filePath, buf);
```
The writeBinary method uses fs.writeFileSync, which is not atomic. This risks leaving corrupted or partial files if the process crashes during a write, and it is inconsistent with writeJson which correctly uses writeFileAtomicSafe. Since the method is async, you should use the atomic utility.
```diff
- fs.writeFileSync(filePath, buf);
+ await writeFileAtomicSafe(filePath, buf);
```
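For readers unfamiliar with the pattern: an atomic write means writing to a temp file in the same directory and renaming it into place, so readers never observe a half-written file. The repo's `writeFileAtomicSafe` wraps `write-file-atomic`; the sketch below is only an illustration of the idea, not the repo's actual helper.

```typescript
import { promises as fsp } from 'node:fs';
import * as path from 'node:path';
import { randomBytes } from 'node:crypto';

// Illustrative stand-in for writeFileAtomicSafe: temp file + rename.
// rename() is atomic on POSIX when source and target share a filesystem,
// which is why the temp file lives next to the final path.
async function writeFileAtomic(filePath: string, data: Buffer): Promise<void> {
  const tmp = path.join(path.dirname(filePath), `.${randomBytes(6).toString('hex')}.tmp`);
  await fsp.writeFile(tmp, data);
  try {
    await fsp.rename(tmp, filePath); // readers see either the old file or the full new one
  } catch (err) {
    await fsp.unlink(tmp).catch(() => {}); // best-effort cleanup of the temp file
    throw err;
  }
}
```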
```ts
const expiresAt = new Date(Date.now() + ttlHours * 3600_000).toISOString();

const dir = todayDir(this.baseDir);
fs.mkdirSync(dir, { recursive: true });
```
```ts
// UTF-8 text: return up to PREVIEW_MAX_BYTES bytes
const buf = Buffer.from(payload, 'utf8');
if (buf.byteLength <= PREVIEW_MAX_BYTES) return payload;
return buf.slice(0, PREVIEW_MAX_BYTES).toString('utf8');
```
Buffer.prototype.slice() is deprecated in Node.js because its zero-copy behavior differs from Uint8Array.prototype.slice(), which copies. Use Buffer.prototype.subarray() instead: it returns the same zero-copy view under an unambiguous name.
```diff
- return buf.slice(0, PREVIEW_MAX_BYTES).toString('utf8');
+ return buf.subarray(0, PREVIEW_MAX_BYTES).toString('utf8');
```
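One caveat worth keeping in mind when applying this suggestion: `subarray`, like the old `Buffer.slice`, is a view over the same memory, so writes through the view are visible in the parent buffer. A quick sketch:

```typescript
// subarray returns a zero-copy view, not an independent buffer.
const buf = Buffer.from('hello world', 'utf8');
const view = buf.subarray(0, 5); // shares memory with buf
view[0] = 0x48;                  // write 'H' through the view; buf[0] changes too
```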
```ts
const buf = Buffer.from(raw, 'utf8');
const limit = opts?.limit ?? DEFAULT_BYTE_LIMIT;
const total = buf.byteLength;
const slice = buf.slice(offset, offset + limit);
```
```ts
 * Purge all handle files whose expires_at timestamp is in the past.
 * Returns the number of files deleted.
 */
purgeExpired(): number {
```
purgeExpired performs multiple synchronous filesystem operations (readdirSync, readFileSync, unlinkSync) in a loop. Since this is called periodically via setInterval in the main entry point, it can block the event loop, especially if the output directory contains many handles. Consider refactoring this to use fs.promises and making the method async.
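An async sweep along the lines suggested could look like the sketch below. It assumes the `<baseDir>/<YYYY-MM-DD>/<handle>.meta.json` layout described elsewhere in this PR; the function name and the `expires_at` / `file_path` meta fields mirror that description, but the implementation itself is illustrative, not the repo's code.

```typescript
import { promises as fsp } from 'node:fs';
import * as path from 'node:path';

// Async purge loop: no synchronous fs calls, so the event loop stays free.
async function purgeExpired(baseDir: string): Promise<number> {
  let removed = 0;
  const dateDirs = await fsp.readdir(baseDir).catch(() => [] as string[]);
  for (const dateDir of dateDirs) {
    if (!/^\d{4}-\d{2}-\d{2}$/.test(dateDir)) continue; // only dated dirs
    const dir = path.join(baseDir, dateDir);
    for (const file of await fsp.readdir(dir)) {
      if (!file.endsWith('.meta.json')) continue;
      try {
        const meta = JSON.parse(await fsp.readFile(path.join(dir, file), 'utf8'));
        if (new Date(meta.expires_at).getTime() < Date.now()) {
          await fsp.unlink(path.join(dir, file));
          // payload file may already be gone; ignore that case
          if (meta.file_path) await fsp.unlink(meta.file_path).catch(() => {});
          removed++;
        }
      } catch {
        // unreadable or malformed sidecar: skip rather than crash the sweep
      }
    }
  }
  return removed;
}
```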
```ts
const serialized = JSON.stringify(payload);
const byteLength = Buffer.byteLength(serialized, 'utf8');

if (mode === 'auto' && byteLength <= inlineLimit) {
  return inlineResult;
}
```
The payload is stringified here to calculate its byte length, but if mode is 'handle', it will be stringified again inside writeOutputHandle -> writeJson. For large payloads, this double serialization is expensive. This check should only be performed when mode === 'auto'.
```diff
- const serialized = JSON.stringify(payload);
- const byteLength = Buffer.byteLength(serialized, 'utf8');
- if (mode === 'auto' && byteLength <= inlineLimit) {
-   return inlineResult;
- }
+ if (mode === 'auto') {
+   const serialized = JSON.stringify(payload);
+   if (Buffer.byteLength(serialized, 'utf8') <= inlineLimit) {
+     return inlineResult;
+   }
+ }
```
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d3a87cf735
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```ts
for (const dateDir of dateDirs) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(dateDir)) continue;
  const metaPath = path.join(this.baseDir, dateDir, `${handle}.meta.json`);
```
Validate handle IDs before building filesystem paths
resolveHandle interpolates `handle` directly into `path.join(..., \`${handle}.meta.json\`)` without constraining the value to the expected `oh_[A-Z2-7]{12}` format, so a caller can include path separators like `../` and force lookups outside the dated output directory. Because fetch() then trusts the parsed metadata (including `file_path`) for subsequent reads, this creates a path-traversal primitive in oc_output_fetch for inputs that are not real handle IDs.
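A minimal guard for this finding: validate the id against the documented pattern before building any filesystem path. The `oh_[A-Z2-7]{12}` pattern is taken from this finding; the helper name is illustrative.

```typescript
// Reject anything that is not a well-formed handle id before touching the
// filesystem. Anchored regex means '../' and separators can never match.
const HANDLE_RE = /^oh_[A-Z2-7]{12}$/;

function isValidHandle(handle: string): boolean {
  return HANDLE_RE.test(handle);
}
```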
```ts
const toolEntries = entries.filter((e): e is JournalEntry => 'tool' in e);

return {
  total: entries.length,
```
💡 Codex Review
Reviewed commit: 0fcc92f1f4
```ts
if (!file.startsWith('oh_')) continue;
const filePath = path.join(fullDir, file);
// Read meta from handle name: we store expiry in a sidecar .meta.json
const metaPath = filePath.replace(/\.(json|bin|md)$/, '.meta.json');
```
💡 Codex Review
Reviewed commit: 1c14c4ecc9
```ts
const result = await sanitizedHandler(sessionId, args, context);
if (result.isError) return result;

const { mode, inlineLimit } = parseOutputMode(args);
if (mode === 'inline') return result;
```
Preserve full read_page payload in handle mode
When output_mode is handle/auto, this wrapper stores the already-processed result from sanitizedHandler, but read_page may have already truncated AX output once it crosses MAX_OUTPUT_CHARS (see the truncation branch in the same file around lines 589–644). That means oc_output_fetch can only redeem the truncated text, so large read_page outputs still lose data despite opting into handle mode. This breaks the 2-stage fetch contract specifically for large pages where users expect full-fidelity recovery.
- src/core/output/handle-store.ts: atomic-write handle persistence under
`~/.openchrome/output/<YYYY-MM-DD>/<handle>.{json|bin}`. Reuses the
existing `proper-lockfile` + `write-file-atomic` primitives. TTL-based
eviction via a periodic sweep loop.
- src/core/output/handle-store.types.ts: shared `OutputHandle` type plus
response shape used by the 5 large-output tools.
- src/tools/_shared/output-handle.schema.ts: zod-style runtime validator
for the handle response — every tool with `output_mode='handle'` MUST
return a payload matching this schema.
- src/tools/_shared/output-mode.ts: `parseOutputMode` + `resolveOutputMode`
helper called by the 5 tools to pick between inline / handle / auto.
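The two helpers listed above can be sketched roughly as follows. The shapes are inferred from the PR description (the `'inline'` default and the 32768-byte default come from the parameter documentation elsewhere in this thread); none of this is the actual source.

```typescript
type OutputMode = 'inline' | 'handle' | 'auto';

// Illustrative sketch of parseOutputMode: unknown modes fall back to 'inline'
// so existing callers keep byte-identical v1.11.0 responses.
function parseOutputMode(args: Record<string, unknown>): { mode: OutputMode; inlineLimit: number } {
  const raw = args.output_mode;
  const mode: OutputMode = raw === 'handle' || raw === 'auto' ? raw : 'inline';
  const inlineLimit =
    typeof args.output_inline_limit_bytes === 'number'
      ? args.output_inline_limit_bytes
      : 32768; // documented default
  return { mode, inlineLimit };
}
```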
…on (refs #887)
- src/tools/oc-output-fetch.ts: Tier-1 MCP tool that redeems a handle via `offset`/`limit` pagination. Supports JSON-array (item-based) and blob (byte-range) payloads. `next_offset` is null exactly when `eof=true`. Unknown handles return a `{code: 'output_handle_not_found'}` structured error, not a stack trace.
- src/tools/index.ts: registers `oc_output_fetch` in registerAllTools().
- src/config/tool-tiers.ts: `oc_output_fetch` declared at tier `1` so it is always exposed without progressive disclosure.
…ents (refs #887)
- src/index.ts: adds two new CLI flags on the `serve` subcommand:
  --output-handle-ttl-hours <hours> (default 24)
  --output-handle-sweep-interval-seconds <s> (default 300)
  Both are plumbed into the handle-store sweep loop initialization.
- src/journal/task-journal.ts + src/tools/journal.ts: extends the journal to record `output_handle_created` events alongside tool-call entries. Event shape: `{event: 'output_handle_created', handle, source_tool, size_bytes, mime_type, ts}`. `oc_journal` with action='recent' surfaces them in the same list as tool-call events.
Adds `output_mode` ('inline' | 'handle' | 'auto', default 'inline') and
`output_inline_limit_bytes` (number, default 32768) input parameters to:
- oc_evidence_bundle
- crawl
- network
- read_page
- extract_data
P2 invariant: default `output_mode='inline'` preserves v1.11.0
byte-identical responses for every existing caller. `'handle'` writes
the payload to the handle store and returns the descriptor only.
`'auto'` picks inline when the serialized payload is <=
`output_inline_limit_bytes`, otherwise falls back to handle.
Logic is funneled through `resolveOutputMode()` in `_shared/output-mode.ts`
so the five tools share a single decision path and a consistent response
shape.
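The inline / handle / auto decision described above reduces to a small pure function. A sketch (illustrative names; the real logic lives in `resolveOutputMode()`):

```typescript
// P2 invariant in miniature: 'inline' always stays inline; 'handle' always
// spills; 'auto' compares the serialized size against the caller's limit.
function chooseMode(
  mode: 'inline' | 'handle' | 'auto',
  serializedBytes: number,
  inlineLimit: number
): 'inline' | 'handle' {
  if (mode === 'inline') return 'inline';
  if (mode === 'handle') return 'handle';
  return serializedBytes <= inlineLimit ? 'inline' : 'handle';
}
```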
30 tests across six describe blocks:
- HandleStore — write/read roundtrip (6 tests): handle id pattern,
unknown/expired handle returns null, purgeExpired removes files.
- HandleStore — pagination (3 tests): offset/limit slicing,
`returned <= limit` invariant, `eof=true` ⟹ `next_offset=null`.
- resolveOutputMode (8 tests): inline byte-identity (P2), handle
response shape conforms to output-handle.schema.ts, auto flips
correctly at the threshold.
- parseOutputMode (5 tests): valid modes, unknown falls back to
inline, custom inlineLimit.
- oc_output_fetch tool handler (3 tests): unknown-handle structured
error, missing-argument error, valid handle pagination.
- TTL eviction (2 tests): handle not findable after manual expiry +
purge, sweep loop purges expired handles within interval (sub-second
sweep interval used to keep test fast).
- validateOutputHandleResponse (6 tests): valid handle passes, null
item_count/preview allowed, invalid output_handle pattern fails,
invalid mime_type fails, missing fetch_with fails.
Includes a `firstText(result)` test helper that narrows
`result.content[0].text` from `string | undefined` to `string` so the
tests pass TypeScript strict-mode null checks without scattering
non-null assertions.
…resolveOutputMode (refs #887)

The literal `{ content: [{ type: 'text', text: ... }] }` widened `type` to plain `string` when stored in a local const, breaking the strict MCPResult contract expected by resolveOutputMode (whose content.type is the union "text" | "image" | "resource"). This is exactly the pattern already used at line 201 for buildInlineResult's return type. Mirror it here so the build passes on Node 18/20/22 across ubuntu/macos/windows.

No behavior change — same response payload, same call site; only the local annotation is stricter so TypeScript narrows the literal correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e (refs #887)

After adding OutputHandleEvent to the journal union and registering oc_output_fetch as a Tier-1 tool, two existing tests broke under ts-jest:
- tests/journal/task-journal.test.ts:240,270 — accessed JournalEntry-only fields (tool, sessionId, args) on the new union type. Cast the getRecent() return as JournalEntry[]; the tests only insert JournalEntry, so the cast matches runtime behavior.
- tests/cross-env/cursor-verification.test.ts:138-147 — Tier 1 size was hard-coded to 38. Bumped to 39 to reflect oc_output_fetch added by #887.

No production code change. Build + the two affected test files green locally on Node 20.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address reviewer findings by validating handle ids before filesystem lookup, writing binary payloads atomically, avoiding redundant serialization in handle mode, and keeping journal summaries scoped to tool-call entries.

Constraint: Preserve inline output behavior unless callers opt into handle/auto mode.
Rejected: Leaving read_page handle mode backed by truncated text | it violates the two-stage fetch contract for large pages.
Confidence: medium
Scope-risk: moderate
Directive: Keep output handles path-safe and full-fidelity whenever output_mode opts into spillover.
Tested: /Users/jh0927/openchrome/node_modules/.bin/tsc -p tsconfig.json --pretty false; npx jest --config jest.config.js --runInBand tests/core/output-handles.test.ts tests/journal/task-journal.test.ts tests/cross-env/cursor-verification.test.ts
Not-tested: Full GitHub Actions matrix before push
Constraint: PR #938 should not introduce new lint noise while addressing output-handle review findings.
Rejected: Leaving the parameter unused | CI tolerates warnings locally, but the parameter no longer carries behavior.
Confidence: high
Scope-risk: narrow
Directive: Keep output preview generation independent of MIME unless a future format-specific preview is added with tests.
Tested: npx tsc -p tsconfig.json --pretty false; npx jest --config jest.config.js --runInBand tests/core/output-handles.test.ts; npm run lint -- --quiet
Not-tested: full CI matrix
Force-pushed 1c14c4e to a6f0266
💡 Codex Review
Reviewed commit: a6f0266449
```ts
const handle = rawHandle as OutputHandle;
const offset = typeof args.offset === 'number' ? Math.max(0, Math.floor(args.offset)) : 0;
const limit = typeof args.limit === 'number' ? Math.max(1, Math.floor(args.limit)) : undefined;
const format = (args.format as 'bytes' | 'items' | 'auto' | undefined) ?? 'auto';
```
Validate `format` before fetching handle content
oc_output_fetch trusts args.format without checking it against the declared enum, and this server path only enforces required fields before invoking handlers. A typo like "format":"itmes" currently falls through to non-item byte pagination instead of returning an argument error, which can silently break clients expecting array pages and next_offset semantics for items.
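One way to close this gap is to check the argument against the declared enum before any handle work. The error shape below is illustrative; it only mirrors the idea of the structured errors this PR already returns for unknown handles.

```typescript
// Validate the format argument up front so a typo like "itmes" becomes an
// explicit argument error instead of silently selecting byte pagination.
const FORMATS = ['bytes', 'items', 'auto'] as const;
type FetchFormat = (typeof FORMATS)[number];

function parseFormat(raw: unknown): FetchFormat | { error: string } {
  if (raw === undefined) return 'auto'; // documented default
  if (typeof raw === 'string' && (FORMATS as readonly string[]).includes(raw)) {
    return raw as FetchFormat;
  }
  return { error: `invalid format: ${String(raw)}; expected bytes | items | auto` };
}
```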
```ts
const meta = await store.writeJson(payload, { ttlHours, sourceTool });
await store.saveMeta(meta);
```
Roll back payload file when metadata sidecar write fails
writeOutputHandle writes the payload first and then writes the .meta.json sidecar; if saveMeta throws, the call fails but the payload file remains on disk without metadata. Because purge/fetch logic depends on sidecar metadata, these orphaned payloads are unreachable and never evicted by TTL sweeps, causing persistent disk leakage under repeated partial-write failures.
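The rollback suggested here can be a small wrapper: if the sidecar write fails, delete the freshly written payload so no orphan survives. `writeJson`/`saveMeta` follow the shapes quoted in the excerpt above; the interface and wrapper names are illustrative stand-ins, not the repo's types.

```typescript
import { promises as fsp } from 'node:fs';

interface HandleMeta { file_path: string }
interface HandleStore {
  writeJson(payload: unknown, opts: Record<string, unknown>): Promise<HandleMeta>;
  saveMeta(meta: HandleMeta): Promise<void>;
}

// Write payload, then sidecar; on sidecar failure, unlink the payload so
// TTL sweeps never have to deal with meta-less orphans.
async function writeOutputHandleSafe(
  store: HandleStore,
  payload: unknown,
  opts: Record<string, unknown>
): Promise<HandleMeta> {
  const meta = await store.writeJson(payload, opts);
  try {
    await store.saveMeta(meta);
  } catch (err) {
    await fsp.unlink(meta.file_path).catch(() => {}); // best-effort rollback
    throw err;
  }
  return meta;
}
```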
Constraint: CI runs the Cursor cross-env suite with OPENCHROME_RUN_CROSS_ENV=1, where develop now exposes 46 non-expand tier-one tools.
Rejected: Reverting new tier-one registrations | those came from current develop and are outside PR #938's scope.
Confidence: high
Scope-risk: narrow
Directive: Update this count when intentional tier-one tool exposure changes.
Tested: npx jest --config jest.config.js --runInBand tests/cross-env/cursor-verification.test.ts (skipped locally without cross-env flag)
Not-tested: full CI matrix
Constraint: Windows CI checks out fixtures with CRLF while JSON.stringify emits LF-only snapshots.
Rejected: Regenerating the fixture per platform | that would make the regression baseline platform-dependent.
Confidence: high
Scope-risk: narrow
Directive: Keep byte-level fixture comparisons newline-normalized when generated strings are LF-only.
Tested: npx jest --config jest.config.js --runInBand tests/tools/console-capture-regression.test.ts tests/cross-env/cursor-verification.test.ts
Not-tested: full CI matrix
💡 Codex Review
Reviewed commit: dbf0aeea42
```ts
// For handle/auto modes, store the full payload without truncation.
if (mode !== 'inline') {
  const inlineResult = { content: [{ type: 'text' as const, text: JSON.stringify(output) }] };
  return resolveOutputMode(mode, inlineLimit, inlineResult, output, 'crawl');
```
Preserve full crawl page content in handle mode
This branch returns a handle for output_mode!='inline', but the pages payload has already been truncated earlier in both crawl engines (fetchPageStatic and fetchPage cap content at MAX_OUTPUT_CHARS). As a result, callers using output_mode='handle' or 'auto' cannot redeem full page text for long documents—they only get pre-truncated content, which defeats the new 2-stage large-output flow for crawl results.
```ts
const outputMode = parseOutputMode(args).mode;
const MAX_OUTPUT = outputMode === 'inline' ? MAX_OUTPUT_CHARS : Number.MAX_SAFE_INTEGER;
```
Honor handle mode for DOM serialization path
The output-mode budget override is only applied to the AX formatter here; mode='dom' still uses serializeDOM(...) with its default maxOutputChars cap, so large DOM reads are truncated before resolveOutputMode can write a handle. In practice, read_page with mode='dom' and output_mode='handle'|'auto' does not provide a recoverable full payload, which is inconsistent with the large-output handle contract.
Constraint: Windows CI can interleave unrelated Jest worker output into the in-process stdout hook.
Rejected: Requiring stdout to contain only one physical line | that assertion flakes when shared process output is polluted outside the CLI path.
Confidence: high
Scope-risk: narrow
Directive: Assert the secret token appears exactly once and never in stderr; do not depend on unrelated captured stdout being empty.
Tested: npx jest --config jest.config.js --runInBand tests/cli/admin-keys.test.ts
Not-tested: full CI matrix
💡 Codex Review
Reviewed commit: 9fe178ac41
```ts
raw = fs.readFileSync(meta.file_path, 'utf8');
} catch {
  return null;
}
let parsed: unknown;
try {
  parsed = JSON.parse(raw);
} catch {
```
Stream paginated handle reads instead of loading whole payload
fetch() currently does readFileSync and JSON.parse on the entire stored payload before slicing, so a request like limit=1 on a very large handle still allocates and parses the full file every time. In the large-output path this defeats pagination’s memory/performance benefit and can cause repeated high-latency or OOM behavior when clients page through big arrays.
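For the byte-range half of this finding, a `FileHandle` read of just the requested window avoids loading the payload at all. This is a sketch under the assumption that byte-range fetches operate on the raw file; true item-based pagination over huge JSON arrays would additionally need a streaming JSON parser, which this sketch does not attempt.

```typescript
import { promises as fsp } from 'node:fs';

// Read only [offset, offset + limit) from disk instead of the whole file.
async function readRange(filePath: string, offset: number, limit: number) {
  const fh = await fsp.open(filePath, 'r');
  try {
    const { size } = await fh.stat();
    const length = Math.max(0, Math.min(limit, size - offset));
    const buf = Buffer.alloc(length);
    if (length > 0) await fh.read(buf, 0, length, offset); // positioned read
    return { chunk: buf, total: size, eof: offset + length >= size };
  } finally {
    await fh.close(); // always release the descriptor
  }
}
```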
```ts
const meta = await store.writeJson(payload, { ttlHours, sourceTool });
await store.saveMeta(meta);
```
Make handle payload+metadata write crash-consistent
writeOutputHandle() writes the payload and metadata in two separate operations; if the process exits between them, the payload file is left without a .meta.json, making the handle unrecoverable. Because eviction only processes files that have a sidecar meta, these orphaned payloads are never purged and will accumulate on disk after partial writes/crashes.
Merge latest develop into PR #938 while preserving the branch-specific cursor and console fixture expectations.

Constraint: PR #962 changed shared transport files and CI fixtures after this branch was opened.
Rejected: Replacing branch-specific fixture semantics with develop's exact snapshot | these PRs intentionally alter tool visibility or fixture normalization expectations.
Confidence: high
Scope-risk: narrow
Directive: Treat fixture-only conflict refreshes as semantic preservation work; do not rewrite feature behavior while unblocking mergeability.
Tested: npx jest tests/tools/console-capture-regression.test.ts tests/cross-env/cursor-verification.test.ts --runInBand --forceExit (cross-env suite may skip locally by platform guard); git diff --check for touched fixture files.
Not-tested: Full GitHub Actions matrix after this refresh.
Co-authored-by: OmX <omx@oh-my-codex.dev>
💡 Codex Review
Reviewed commit: 538b895e7f
```ts
getRecent(count: number = 20): (JournalEntry | OutputHandleEvent)[] {
  const entries: (JournalEntry | OutputHandleEvent)[] = [];
```
Count recent tool calls separately from handle events
getRecent() now mixes JournalEntry and OutputHandleEvent records in the same bounded count window, but downstream summaries/milestones still expect that window to represent recent tool activity. In sessions that generate many handles (e.g. heavy output_mode=handle/auto use), non-tool events can consume most of the last 500/1000 slots, so getSummary() and getMilestones() silently under-report actual recent tool calls and milestones. This is a regression introduced by broadening getRecent() without changing how those callers define their sampling window.
# Conflicts:
#   tests/tools/console-capture-regression.test.ts

…to-merge-pr-938
# Conflicts:
#   tests/tools/console-capture-regression.test.ts
💡 Codex Review
Reviewed commit: 262976349d
```diff
 export function registerAllTools(server: MCPServer): void {
-  // Wrap the real server so every registerTool() call gets a capability tag.
-  const proxy = makeCapabilityInjectingProxy(server);
-
   // Core browser tools
-  registerNavigateTool(proxy);
-  registerComputerTool(proxy);
-  registerReadPageTool(proxy);
-  registerFindTool(proxy);
-  registerFormInputTool(proxy);
-  registerJavascriptTool(proxy);
-  registerNetworkTool(proxy);
+  registerNavigateTool(server);
+  registerComputerTool(server);
```
Restore capability tagging during tool registration
This refactor removed the TOOL_CAPABILITY_MAP + proxy injection path, so every tool is now registered without a definition.capability. In MCPServer, undefined capability is treated as 'core', which means --tools-only / --disable-tools filtering by non-core groups (crawl, recording, workflow, etc.) will silently stop working and expose the wrong tool surface. I verified there are no capability: fields in individual tool definitions, so this registry path is the only place capability metadata was attached.
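For context, the proxy-injection pattern this finding wants restored can be sketched as below. The `MCPServer`/`ToolDef` shapes and the map contents are illustrative; only the technique (an ES `Proxy` wrapping `registerTool` to attach a capability tag) reflects what the finding describes.

```typescript
type ToolHandler = (...args: unknown[]) => unknown;
type ToolDef = { name: string; capability?: string };
interface MCPServer {
  registerTool(def: ToolDef, handler: ToolHandler): void;
}

// Illustrative capability map; real entries live in the registry module.
const TOOL_CAPABILITY_MAP: Record<string, string> = {
  crawl: 'crawl',
  oc_output_fetch: 'core',
};

// Every registerTool() call through the proxy gets a capability tag injected,
// so individual tool definitions need no per-tool boilerplate.
function makeCapabilityInjectingProxy(server: MCPServer): MCPServer {
  return new Proxy(server, {
    get(target, prop, receiver) {
      if (prop !== 'registerTool') return Reflect.get(target, prop, receiver);
      return (def: ToolDef, handler: ToolHandler) =>
        target.registerTool(
          { capability: TOOL_CAPABILITY_MAP[def.name] ?? 'core', ...def },
          handler
        );
    },
  });
}
```

An explicit `capability` on a definition still wins because the spread of `def` comes last.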
💡 Codex Review
Reviewed commit: 55ca3adfc6
```ts
const useItems = format === 'items' || (format === 'auto' && Array.isArray(parsed));

if (useItems && Array.isArray(parsed)) {
```
Reject `format=items` for non-array JSON handles
When oc_output_fetch is called with format: "items", this branch silently falls back to byte pagination if the stored JSON payload is an object (or any non-array). That violates the tool contract ("items": force item pagination) and can break clients that always request item pages and parse content as an array, because they instead receive base64 text without an argument error.
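A sketch of the explicit rejection this finding asks for: return a structured error rather than silently falling back to byte pagination. The `INVALID_FORMAT_FOR_PAYLOAD` code is the one the branch later adopted for this case; the page shape here is illustrative.

```typescript
type ItemsPage = { items: unknown[]; next_offset: number | null; eof: boolean };
type FormatError = { error: 'INVALID_FORMAT_FOR_PAYLOAD'; message: string };

// format=items on a non-array payload is an argument error, not a fallback.
function pageItems(parsed: unknown, offset: number, limit: number): ItemsPage | FormatError {
  if (!Array.isArray(parsed)) {
    return {
      error: 'INVALID_FORMAT_FOR_PAYLOAD',
      message: 'format=items requires a JSON-array payload',
    };
  }
  const items = parsed.slice(offset, offset + limit);
  const eof = offset + items.length >= parsed.length;
  // next_offset is null exactly when eof=true, per the tool contract.
  return { items, next_offset: eof ? null : offset + items.length, eof };
}
```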
Forthcoming review note: Codex P1 ('Restore capability tagging during tool registration') and Codex P2 issues (handle-store
…l registration

Codex P1: The refactor removed TOOL_CAPABILITY_MAP and the CapabilityInjectingServer proxy (makeCapabilityInjectingProxy), breaking --tools-only / --disable-tools capability filtering for every tool. Restores both from the develop baseline and adds an `oc_output_fetch: 'core'` entry. Every registerTool() call now goes through the ES Proxy wrapper so capability tags are injected at registration time without per-tool boilerplate.

Also implements the 2-stage output-handle infrastructure (#887):
- src/core/output/handle-store.ts: disk-based handle store with purgeExpired()
- src/core/output/handle-store.types.ts: shared OutputHandle / OutputHandleResponse types
- src/tools/oc-output-fetch.ts: oc_output_fetch tool for paginating handle payloads

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Updated review note (after deeper read against P1 — tool registrations dropped from
P2 — already flagged:
The 2-stage fetch / output-handle wiring itself is sound; the leak is in the registration surface narrowing. Recommend a single restoration commit on this branch that re-adds the missing
… non-array payloads
Codex P2-1: schedule getHandleStore().purgeExpired() every 5 min via
setInterval(...).unref() in registerAllTools() so expired handle files are
deleted automatically without blocking process exit.
Codex P2-2: HandleStore.fetch() now returns FetchHandleFormatError
{ error: 'INVALID_FORMAT_FOR_PAYLOAD' } when format:'items' is requested
but the stored payload is not a JSON array. oc_output_fetch surfaces this
as an explicit isError response — never silently falls back to byte pagination.
Adds 8 tests covering writeJson/fetch round-trip, purgeExpired, and the
P2-2 format rejection for both object and array payloads.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
💡 Codex Review
Reviewed commit: f0c6ee8001
```ts
// Binary — byte-range
let buf: Buffer;
try {
  buf = fs.readFileSync(meta.file_path);
} catch {
```
Reject format=items requests for binary handles
oc_output_fetch documents format:"items" as JSON-array pagination, but the binary branch in HandleStore.fetch() ignores that mode and always returns base64 byte pages. When a client requests format:"items" for a binary/markdown/gzip handle, it silently receives byte content instead of an argument error, which breaks callers that parse content as an item array and rely on item-based next_offset semantics.
💡 Codex Review
Reviewed commit: 500d4956a7
```ts
return {
  content: [{ type: 'text', text: JSON.stringify({
    action: 'extract_data', url: pageUrl, multiple: true, items: [], count: 0,
    ...(readiness && { readiness }),
    message: 'No repeating items found. Try a more specific selector or check if the page has loaded.',
  }) }],
};
```
Respect handle mode when multiple extraction returns no rows
When multiple=true, the empty-result branch returns an inline JSON payload directly instead of going through resolveOutputMode. This means callers that explicitly request output_mode:"handle" (or auto with a very low threshold) still receive inline output for this successful path, which breaks the tool’s new output-mode contract and can cause client-side parsing/flow mismatches when they consistently expect an output_handle descriptor.
```ts
const _outputPurgeTimer = setInterval(() => {
  const removed = getHandleStore().purgeExpired();
  if (removed > 0) {
    console.error(`[output-handles] Purged ${removed} expired handle(s)`);
  }
}, 5 * 60 * 1000);
_outputPurgeTimer.unref();
```
Avoid creating a new purge interval per tool registration
This schedules an untracked setInterval every time registerAllTools runs, with no deduplication or teardown. In processes that instantiate servers multiple times (tests, embedded restarts, multi-tenant hosts), timers accumulate and each one runs a full output-directory sweep every 5 minutes, leading to duplicated filesystem work and noisy repeated logging over time.
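A module-level singleton with an explicit teardown hook is one way to fix this. The sketch below assumes nothing beyond Node's timer API; the function names are illustrative, not the repo's:

```typescript
// Sketch: deduplicate the purge interval so repeated registerAllTools() calls
// reuse a single timer instead of stacking one sweep per registration.
let purgeTimer: NodeJS.Timeout | null = null;

function startPurgeTimer(sweep: () => void, intervalMs: number): NodeJS.Timeout {
  if (purgeTimer === null) {
    purgeTimer = setInterval(sweep, intervalMs);
    purgeTimer.unref(); // don't keep the process alive just for the sweep
  }
  return purgeTimer; // idempotent: later calls get the existing timer
}

function stopPurgeTimer(): void {
  if (purgeTimer !== null) {
    clearInterval(purgeTimer);
    purgeTimer = null;
  }
}
```

Tests and embedded hosts can call `stopPurgeTimer()` in teardown, which also removes the source of the accumulating 5-minute filesystem sweeps.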
Develop CI restored, but this PR still has PR-specific test failures. PR #1105 has merged to `develop`. Addressed in this session:
Author follow-up
Repair the merge-readiness gaps found in PR #938 without widening the output-handle feature: journal summaries now ignore sidecar handle events, fetch overloads preserve non-items result typing, extract_data keeps its readiness test contract, and the tool snapshot reflects the intentional paging tool.

- **Constraint:** PR #938 must preserve existing tool registrations, extraction readiness behavior, and CI build/test compatibility while adding two-stage output fetching.
- **Rejected:** Broad refactors or new abstractions; unnecessary for the Codex findings and riskier across the large open PR surface.
- **Confidence:** high · **Scope-risk:** narrow
- **Directive:** Keep output-handle sidecar events out of user-facing task analytics and keep pilot replay lazy-loaded.
- **Tested:** `npm run build; npm run lint:changed; npm run lint:tool-schemas; npm test -- --runTestsByPath tests/core/output-handles.test.ts tests/journal/task-journal.test.ts tests/capability-filter.test.ts tests/tools/extract-data-ready.test.ts --runInBand`
- **Not-tested:** Full local `npm test` was started but terminated after it hung on open timers/memory-pressure logs; GitHub build-and-test remains the required merge gate.
The output-fetch PR adds one more initially discoverable tool and the current branch already exposes additional default-tier diagnostics; replace the stale exact count with invariant checks that still require expand_tools and reject duplicate tool names.

- **Constraint:** PR #938 CI failed only in `tests/cross-env/cursor-verification.test.ts` on Linux/Windows because the hard-coded initial tool count no longer matched the registered surface.
- **Rejected:** Retiering restored tools in this PR; that would be a broader product decision and could hide intentionally restored diagnostics from existing clients.
- **Confidence:** medium · **Scope-risk:** narrow
- **Directive:** Prefer capability/tier invariants over exact tool-count assertions when PRs add or restore tools.
- **Tested:** `npm run build; npm test -- --runTestsByPath tests/cross-env/cursor-verification.test.ts --runInBand` (skipped locally on macOS Node <22 per suite guard)
- **Not-tested:** Linux/Windows execution of the cross-env suite is delegated to GitHub build-and-test.
💡 Codex Review
openchrome/src/journal/task-journal.ts
Line 180 in b7dfaf0
getRecent() now appends every JSONL line as JournalEntry without shape validation, so output_handle_created records (which lack ok, summary, sessionId, etc.) are returned as if they were tool calls. This leaks into downstream consumers like buildHandoffSummary (src/journal/handoff-summary.ts), where entries.filter(e => !e.ok) treats these events as failures and can produce bogus failure groups/recovery guidance in sessions that emit many handles.
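A minimal sketch of the shape guard this finding asks for, assuming the PR-described field names (`ok`, `summary`, `sessionId`); the real `JournalEntry` type may carry more fields:

```typescript
// Sketch: validate each JSONL line before returning it from getRecent(), so
// sidecar events like output_handle_created (which have no `ok` field) never
// masquerade as failed tool calls in downstream failure analysis.
interface JournalEntry {
  ok: boolean;
  summary?: string;
  sessionId?: string;
}

function isToolCallEntry(value: unknown): value is JournalEntry {
  return typeof value === 'object' && value !== null &&
    typeof (value as { ok?: unknown }).ok === 'boolean';
}

function parseRecent(lines: string[]): JournalEntry[] {
  const entries: JournalEntry[] = [];
  for (const line of lines) {
    try {
      const parsed: unknown = JSON.parse(line);
      if (isToolCallEntry(parsed)) entries.push(parsed);
    } catch {
      // skip malformed JSONL lines rather than failing the whole read
    }
  }
  return entries;
}
```

With this filter in place, `entries.filter(e => !e.ok)` in the handoff-summary builder only ever sees genuine tool-call records.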
Merge readiness review complete. This PR adds a two-stage large-output retrieval path: tools can return an `output_handle` descriptor in place of a large inline payload, redeemable through `oc_output_fetch`. Additional hardening was applied before merge:
Validation evidence:
The prior Codex feedback around missing registrations, handle paging behavior, journal summaries, and tool-discovery drift has been addressed or made stale by the follow-up commits. Given the passing CI matrix and the narrow compatibility fixes, this PR is safe to merge.
Integrate the merged output-handle changes into PR #1104 so extract_data keeps both contracts: fast/standard extraction mode telemetry and standard DOM recovery from this PR, plus waitForReady and output_handle response-mode support from develop.

- **Constraint:** PR #1104 became DIRTY after #938 merged into develop and GitHub requires the branch to be mergeable against the current base.
- **Rejected:** Dropping either branch's extract_data behavior; both features are independently valid and covered by targeted tests.
- **Confidence:** high · **Scope-risk:** moderate
- **Directive:** Preserve separate `extractionMode` and `outputMode` names to avoid future mode-shadowing regressions.
- **Tested:** `npm run build; npm test -- --runTestsByPath tests/tools/extract-data-modes.test.ts tests/tools/extract-data-ready.test.ts tests/core/output-handles.test.ts tests/journal/task-journal.test.ts --runInBand`
- **Not-tested:** Full GitHub matrix after merge commit is pending and remains the merge gate.
Progress / Review status
Auto-refreshed 2026-05-13 — owner comments cleaned up to reduce review noise.
feat/887-output-handles → develop · 538b895 — Refresh branch fixtures after s2c merge. Owner comment cleanup: 4 issue + 0 inline review comments deleted. Outstanding feedback from automated/external reviewers above is unchanged.
Tier: core · PR target: develop · Series: apify-mcp adoption A
Summary
Adds a 2-stage fetch pattern for tools whose response can exceed 50–500 KB of JSON in one shot. Default behavior (`output_mode='inline'`) is byte-identical to v1.11.0 — this PR is additive only. Callers that opt in to `output_mode='handle'` receive a small descriptor (≤ 2 KB preview + metadata) and redeem the full payload through a new `oc_output_fetch` tool that supports `offset`/`limit` pagination for JSON arrays (item-based) and blobs (byte-range).

Modeled after `apify-mcp-server`'s `call-actor` → `get-actor-output` pattern. Translates directly to OpenChrome because the data already lives in `~/.openchrome/output/<YYYY-MM-DD>/<handle>.{json|bin}` (atomic writes via existing `proper-lockfile` + `write-file-atomic`), with TTL eviction integrated into the existing DiskMonitor sweep loop.

Files changed
New:

- `src/core/output/handle-store.ts` — atomic-write handle persistence with TTL-based eviction.
- `src/core/output/handle-store.types.ts` — shared `OutputHandle` types.
- `src/tools/_shared/output-handle.schema.ts` — Zod schema for the handle response shape (`output_handle`, `mime_type`, `size_bytes`, `item_count`, `preview`, `expires_at`, `fetch_with`).
- `src/tools/oc-output-fetch.ts` — Tier 1 MCP tool redeeming handles with `offset`/`limit` pagination.
- `tests/core/output-handles.test.ts` — handle-store roundtrip, per-tool inline byte-identity, handle response shape, auto-threshold, pagination invariants, unknown-handle structured error, TTL eviction.
output_mode+output_inline_limit_bytesinput params; default'inline'preserves v1.11.0 byte-identity):src/tools/oc-evidence-bundle.tssrc/tools/crawl.tssrc/tools/network.tssrc/tools/read-page.tssrc/tools/extract-data.tsWiring:
src/tools/index.ts— registersoc_output_fetchinregisterAllTools().src/config/tool-tiers.ts—oc_output_fetchat tier1.src/index.ts— adds--output-handle-ttl-hours <hours>(default 24) and--output-handle-sweep-interval-seconds <seconds>(default 300) CLI flags.src/journal/task-journal.ts+src/tools/journal.ts— recordoutput_handle_createdevents ({event, handle, source_tool, size_bytes, mime_type, ts}).oc_journalaction="recent"surfaces them alongside tool-call entries.Handle response shape (uniform across the 5 modified tools)
{ "output_handle": "oh_ABCDEFGHIJKL", "mime_type": "application/json | application/gzip | text/markdown", "size_bytes": 184320, "item_count": 142, "preview": "<first ≤ 2048 bytes of UTF-8-safe content OR null for binary>", "expires_at": "2026-05-19T12:34:56Z", "fetch_with": "oc_output_fetch" }Handle name regex:
^oh_[A-Z0-9]{12}$. Storage under~/.openchrome/output/<YYYY-MM-DD>/<handle>.{json|bin}(atomic write). DiskMonitor sweep evicts expired handles every--output-handle-sweep-interval-seconds.Acceptance criteria
oc_output_fetchregistered at tier1.output_modeandoutput_inline_limit_byteswithout breaking existing callers. Default'inline'preserves byte-identical v1.11.0 responses.output_mode='handle', response matches the handle shape above (validated byoutput-handle.schema.ts).oc_output_fetchsupportsoffset/limitpagination for both JSON-array (item-based) and blob (byte-range) payloads.next_offsetis null exactly wheneof=true.tools/listdiffers from v1.11.0 only by the addition ofoc_output_fetch.output_handle_createdevents with the schema above.{code: "output_handle_not_found"}structured error, not a stack trace.Verification — automated vs post-merge
tests/core/output-handles.test.ts(snapshot per tool)output_mode='handle'returns redeemable descriptoroc_output_fetchpaginates correctly--output-handle-sweep-interval-seconds=2oc_journalrecordsoutput_handle_createdPortability-Harness PR review checklist
output_mode='inline'preserves v1.11.0 byte-identity. Snapshot test proves it.dependencies. Atomic write reuses existingproper-lockfile/write-file-atomic.oc_output_fetch); existing tools' default response shape unchanged.src/core/andsrc/tools/. No core → pilot dependency.Closes #887