Tier: core (P2-compliant additive: existing response shape preserved by default)
PR target: develop
Series: apify-mcp adoption A
Background
apify-mcp-server invokes Actors with call-actor, returns a run-id + truncated preview, and the agent retrieves the full payload via a separate get-actor-output tool with offset/limit. The pattern exists for one reason: prevent large tool responses from blowing up the host LLM's context window.
OpenChrome has several tools whose response can exceed 50–500 KB of JSON in one shot:
extract_data (src/tools/extract-data.ts) — extracted records.
When the agent only needs a summary or the first N records, the entire payload still occupies the LLM context for the rest of the conversation. With a prompt-cache TTL of 5 minutes and Sonnet/Haiku models as the common deployment target, this is the highest-leverage cost-cutting opportunity OpenChrome has not yet taken. Apify's pattern translates directly: the data already lives in ~/.openchrome/trace/ (JSONL trace storage, src/core/trace/storage.ts), so we just need an opt-in handle response mode plus a fetch tool.
Portability-Harness Contract alignment:
P1 (tool-server identity): handles are deterministic file paths; no background work added.
Modified: src/tools/oc-evidence-bundle.ts, src/tools/crawl.ts, src/tools/network.ts, src/tools/read-page.ts, src/tools/extract-data.ts — add output_mode plumbing in input schema and result branching.
Modified: src/tools/index.ts — register oc_output_fetch in registerAllTools() (after the existing core tools at the head of the registry).
Modified: src/config/tool-tiers.ts — add oc_output_fetch at tier 1 in TOOL_TIERS (the type is 1 | 2 | 3).
Modified: src/index.ts — add two new CLI flags: --output-handle-ttl-hours <hours> (default 24) and --output-handle-sweep-interval-seconds <seconds> (default 300, lowerable for tests). Both flags are net-new and shipped by this PR; no fallback is required.
Modified: src/journal/task-journal.ts (and src/tools/journal.ts if needed) — extend the journal to record output_handle_created events alongside the existing tool-call entries. This is a small but real extension: today the journal records tool invocations, not arbitrary events. Acceptance criteria below cover the schema.
Acceptance Criteria
oc_output_fetch registered as Tier 1; TOOL_TIERS snapshot test updated.
All five target tools accept output_mode and output_inline_limit_bytes inputs without breaking existing callers (defaults preserve byte-identical v1.11.0 responses).
When output_mode='handle', response strictly matches the handle response shape above (validated by a Zod schema kept in src/tools/_shared/output-handle.schema.ts).
Handles persist under ~/.openchrome/output/<date>/ with atomic writes; DiskMonitor prunes expired handles in the next 5-minute sweep tick after expiry.
oc_output_fetch supports offset/limit pagination for both JSON-array payloads (item-based) and blob payloads (byte-range); next_offset is null exactly when eof=true.
Default-mode snapshot test in tests/core/registration-default.snapshot.ts (or extension thereof) shows that tools/list differs from v1.11.0 only by the addition of oc_output_fetch.
Journal infrastructure extended to record output_handle_created events. New event schema: {event: "output_handle_created", handle, source_tool, size_bytes, mime_type, ts}. oc_journal action="recent" surfaces them in the same list as tool-call entries.
Handle redaction: any handle created from a tool that had content-sanitization applied (--no-sanitize-content not set) preserves the sanitized payload on disk; raw content is never stored unsanitized.
npm run build && npm test && npm run lint:tier green.
PR body documents tier:core and lists every modified tool name.
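The output_mode and output_inline_limit_bytes inputs can be read as a small decision function: inline is the default, handle always spills, and auto spills only when the serialized payload exceeds the limit. The following is a minimal sketch under those stated semantics, not the PR's implementation. The names resolveOutputMode and DEFAULT_INLINE_LIMIT_BYTES are hypothetical, and the 64 KiB default is an assumption because the issue does not state one.

```typescript
// Hypothetical names throughout; only the three mode values come from the issue.
type OutputMode = "inline" | "handle" | "auto";

// Assumed default for output_inline_limit_bytes; the issue does not specify a value.
const DEFAULT_INLINE_LIMIT_BYTES = 64 * 1024;

// Decide whether a tool should return its payload inline or as a handle descriptor.
function resolveOutputMode(
  requested: OutputMode | undefined,
  serializedBytes: number,
  inlineLimitBytes: number = DEFAULT_INLINE_LIMIT_BYTES,
): "inline" | "handle" {
  // An omitted parameter keeps the existing (v1.11.0) inline behavior.
  const mode: OutputMode = requested ?? "inline";
  if (mode === "auto") {
    // auto: inline when the payload fits within the limit, otherwise spill to a handle.
    return serializedBytes <= inlineLimitBytes ? "inline" : "handle";
  }
  return mode;
}
```

Because the default branch never inspects the payload size, callers that omit output_mode cannot be pushed into handle mode by a large response, which is what keeps the change additive.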
Verification (post-merge, via openchrome MCP)
A tester instance of openchrome (instance T) drives a target instance (instance A) through MCP tools/list + tool calls.
Pass: returned ≤ limit, total ≥ returned, next_offset consistent with eof.
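The three pass conditions follow from how a page is cut out of the stored payload. A minimal sketch for the item-based (JSON-array) case is shown here; byte-range pagination for blobs would follow the same arithmetic over bytes. The function name paginate and the clamping of offset and limit are assumptions for illustration, not behavior confirmed by the issue.

```typescript
// Field names mirror the oc_output_fetch response described in the issue.
interface FetchPage<T> {
  offset: number;
  limit: number;
  returned: number;
  total: number;
  next_offset: number | null;
  content: T[];
  eof: boolean;
}

// Hypothetical helper: slice one page out of an in-memory array of items.
function paginate<T>(items: readonly T[], offset: number, limit: number): FetchPage<T> {
  const total = items.length;
  // Assumption: out-of-range inputs are clamped rather than rejected.
  const start = Math.min(Math.max(0, Math.floor(offset)), total);
  const effectiveLimit = Math.max(1, Math.floor(limit));
  const content = items.slice(start, start + effectiveLimit);
  const end = start + content.length;
  const eof = end >= total;
  return {
    offset: start,
    limit: effectiveLimit,
    returned: content.length,
    total,
    // next_offset is null exactly when eof is true.
    next_offset: eof ? null : end,
    content,
    eof,
  };
}
```

An agent can then loop until next_offset is null. Clamping the limit to at least 1 is a design choice in this sketch that guarantees each call makes progress.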
Scenario 4 — unknown handle returns structured error, not a stack trace
RESP=$(mcpA '{"jsonrpc":"2.0","id":6,"method":"tools/call","params":{"name":"oc_output_fetch","arguments":{"output_handle":"oh_DEADBEEF0000"}}}')
echo "$RESP" | jq -e '.result.isError == true and (.result.content[0].text | fromjson | .error.code == "output_handle_not_found")' >/dev/null \
  && echo OK || { echo "FAIL: missing or malformed not-found error"; exit 1; }
Pass: OK. Error is structured (error.code='output_handle_not_found').
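On the server side, the check above implies that a failed lookup is returned as a normal tool result with isError set and a JSON-encoded error object in the text content, rather than as a thrown exception. A minimal sketch of that shape follows. The function name fetchOutput, the in-memory Map standing in for the on-disk store, and the message field are assumptions; only isError, content[0].text and error.code are taken from the scenario.

```typescript
// Minimal shape of a tool result as the jq expression in Scenario 4 reads it.
interface ToolResult {
  isError?: boolean;
  content: { type: "text"; text: string }[];
}

// Hypothetical lookup: `store` stands in for the on-disk handle store.
function fetchOutput(store: ReadonlyMap<string, string>, outputHandle: string): ToolResult {
  const payload = store.get(outputHandle);
  if (payload === undefined) {
    // Structured error: machine-readable code, no stack trace.
    const error = {
      code: "output_handle_not_found",
      message: `Unknown or expired output handle: ${outputHandle}`, // assumed field
    };
    return { isError: true, content: [{ type: "text", text: JSON.stringify({ error }) }] };
  }
  return { content: [{ type: "text", text: payload }] };
}
```

Returning the same error code for unknown and expired handles keeps the agent-side handling to a single branch, which matches how Scenario 6 reuses this check.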
Scenario 5 — handle response size invariant
The real invariant is that the handle response itself is small and bounded, regardless of underlying payload size. Test against a fixture site committed at tests/fixtures/sites/many-pages/ (a local httpbin-style server serving ≥ 30 internal pages) so the test is hermetic.
Record both numbers in scripts/verify/A-output-handles-bytes.txt. Pass: handle-mode response body ≤ 4 KB. (Descriptor + 2 KB preview cap means typical handle responses are 2.5–3.5 KB; the 4 KB ceiling is the actual invariant. The inline number is recorded for documentation only — it depends on workload and is not a pass/fail gate.)
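The 4 KB ceiling depends on the preview being cut at a byte budget without splitting a multi-byte character. A minimal sketch of that truncation and of the size measurement is shown here, assuming the preview is a UTF-8 string and the response is the JSON-serialized descriptor. All function names are hypothetical; only the 2048-byte cap and the 4 KB ceiling come from the issue.

```typescript
// Values from the issue: 2 KB preview cap, 4 KB handle-response ceiling.
const PREVIEW_CAP_BYTES = 2048;
const HANDLE_RESPONSE_CEILING_BYTES = 4096;

// Number of bytes one code point occupies in UTF-8.
function utf8Width(codePoint: number): number {
  if (codePoint < 0x80) return 1;
  if (codePoint < 0x800) return 2;
  if (codePoint < 0x10000) return 3;
  return 4;
}

// UTF-8 byte length of a string, counted code point by code point.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  let i = 0;
  while (i < text.length) {
    const cp = text.codePointAt(i) as number;
    bytes += utf8Width(cp);
    i += cp > 0xffff ? 2 : 1; // astral code points take two UTF-16 units
  }
  return bytes;
}

// Longest prefix of `text` that fits in `capBytes` without splitting a code point.
function utf8SafePreview(text: string, capBytes: number = PREVIEW_CAP_BYTES): string {
  let bytes = 0;
  let i = 0;
  while (i < text.length) {
    const cp = text.codePointAt(i) as number;
    const width = utf8Width(cp);
    if (bytes + width > capBytes) break;
    bytes += width;
    i += cp > 0xffff ? 2 : 1;
  }
  return text.slice(0, i);
}

// Size of the serialized handle response, which is what the ceiling constrains.
function handleResponseBytes(descriptor: object): number {
  return utf8ByteLength(JSON.stringify(descriptor));
}

// True when the serialized descriptor respects the 4 KB invariant.
function withinCeiling(descriptor: object): boolean {
  return handleResponseBytes(descriptor) <= HANDLE_RESPONSE_CEILING_BYTES;
}
```

One caveat worth testing: JSON escaping can inflate the preview (a quote or newline becomes two bytes when serialized), so measuring the serialized descriptor rather than the raw preview is the safer check for the invariant.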
Scenario 6 — TTL eviction (uses the two new sweep flags shipped by this PR)
kill "$A_PID"; wait "$A_PID" 2>/dev/null
node dist/index.js --http "$INSTANCE_A_PORT" \
--output-handle-ttl-hours 0 \
--output-handle-sweep-interval-seconds 2 &
A_PID=$!
sleep 1
# Create a handle
HANDLE=$(mcpA '{"jsonrpc":"2.0","id":11,"method":"tools/call","params":{"name":"read_page","arguments":{"output_mode":"handle"}}}' \
  | jq -r '.result.content[0].text | fromjson | .output_handle')
# Wait for one full sweep cycle
sleep 3
# Redeem should now fail with structured not-found
RESP=$(mcpA "{\"jsonrpc\":\"2.0\",\"id\":12,\"method\":\"tools/call\",\"params\":{\"name\":\"oc_output_fetch\",\"arguments\":{\"output_handle\":\"$HANDLE\"}}}")
echo "$RESP" | jq -e '.result.isError == true and (.result.content[0].text | fromjson | .error.code == "output_handle_not_found")' >/dev/null \
  && echo OK || { echo "FAIL: handle not evicted after TTL"; exit 1; }
Pass: redeem after TTL returns output_handle_not_found. Handle file is gone from ~/.openchrome/output/. Both flags above are new in this PR.
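The timing in this scenario (TTL 0, sweep every 2 seconds, wait 3 seconds) works because a handle can outlive its TTL by at most one sweep interval if only the sweeper removes it. A minimal sketch of that expiry arithmetic follows, assuming each handle records its creation time in milliseconds. The type StoredHandle and the function names are hypothetical, and whether oc_output_fetch also checks expiry on read is not stated in the issue.

```typescript
const MS_PER_HOUR = 60 * 60 * 1000;

// Hypothetical record kept per handle; the real store may track more fields.
interface StoredHandle {
  handle: string;
  createdAtMs: number;
}

// A handle is expired once `now` reaches creation time plus the TTL.
function isExpired(h: StoredHandle, ttlHours: number, nowMs: number): boolean {
  return nowMs >= h.createdAtMs + ttlHours * MS_PER_HOUR;
}

// One sweep tick: split handles into those kept and those to delete.
function sweep(
  handles: readonly StoredHandle[],
  ttlHours: number,
  nowMs: number,
): { kept: StoredHandle[]; evicted: string[] } {
  const kept: StoredHandle[] = [];
  const evicted: string[] = [];
  for (const h of handles) {
    if (isExpired(h, ttlHours, nowMs)) {
      evicted.push(h.handle);
    } else {
      kept.push(h);
    }
  }
  return { kept, evicted };
}

// Upper bound on a handle's on-disk lifetime when only the sweeper deletes files.
function worstCaseLifetimeMs(ttlHours: number, sweepIntervalSeconds: number): number {
  return ttlHours * MS_PER_HOUR + sweepIntervalSeconds * 1000;
}
```

With the flags used above, worstCaseLifetimeMs(0, 2) is 2000 ms, so the 3-second sleep leaves a one-second margin; with the defaults (24 hours, 300 seconds) the bound is 24 hours plus 5 minutes.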
Scenario 7 — oc_journal records handle creation
The journal tool today accepts {action: "summary" | "recent", count}. This PR extends action="recent" to include output_handle_created entries alongside tool-call entries (see Acceptance Criteria).
JOURNAL=$(mcpA '{"jsonrpc":"2.0","id":9,"method":"tools/call","params":{"name":"oc_journal","arguments":{"action":"recent","count":50}}}')
# The journal returns either text or structured JSON depending on the action; for "recent" with the
# extension this PR ships, it returns a JSON array. Either parse from .result.content[0].text directly
# or via fromjson if the server text-wraps it (handled by both branches below).
TEXT=$(echo "$JOURNAL" | jq -r '.result.content[0].text')
echo "$TEXT" | jq -e 'if type=="array" then .[] else . end | select(.event=="output_handle_created")' >/dev/null 2>&1 \
  || (echo "$TEXT" | grep -q 'output_handle_created') \
  && echo OK || { echo "FAIL: no output_handle_created event in journal"; exit 1; }
Pass: OK.
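Since action="recent" now returns tool-call entries and output_handle_created events in one list, a consumer needs a way to tell them apart. A minimal sketch of a type guard over the event schema from the Acceptance Criteria is shown here. The field types (strings for handle, source_tool, mime_type and ts, a number for size_bytes) are assumptions, as the issue lists only the field names.

```typescript
// Field names from the issue's event schema; the field types are assumed.
interface OutputHandleCreatedEvent {
  event: "output_handle_created";
  handle: string;
  source_tool: string;
  size_bytes: number;
  mime_type: string;
  ts: string;
}

// Narrow an arbitrary journal entry to a well-formed output_handle_created event.
function isOutputHandleCreated(entry: unknown): entry is OutputHandleCreatedEvent {
  if (typeof entry !== "object" || entry === null) return false;
  const e = entry as Record<string, unknown>;
  return (
    e["event"] === "output_handle_created" &&
    typeof e["handle"] === "string" &&
    typeof e["source_tool"] === "string" &&
    typeof e["size_bytes"] === "number" &&
    typeof e["mime_type"] === "string" &&
    typeof e["ts"] === "string"
  );
}

// Pick the handle events out of a mixed "recent" list.
function handleEvents(entries: readonly unknown[]): OutputHandleCreatedEvent[] {
  return entries.filter(isOutputHandleCreated);
}
```

The guard is stricter than the jq check above, which only looks at the event field; a stricter check would catch entries that carry the right event name but an incomplete schema.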
Issue closure criteria
Scenarios 1–4 and 7 pass strictly; Scenario 5 ratio recorded and ≤ 0.10; Scenario 6 confirmed manually. Reproducer script at scripts/verify/A-output-handles.mjs.
OpenChrome has several tools whose response can exceed 50–500 KB of JSON in one shot:
oc_evidence_bundle (src/tools/oc-evidence-bundle.ts): bundle descriptor + (currently) inlined bytes when small.
crawl (src/tools/crawl.ts): full crawl tree.
network (src/tools/network.ts): network slice JSON.
read_page (src/tools/read-page.ts): sanitized DOM/text.
extract_data (src/tools/extract-data.ts): extracted records.
Portability-Harness Contract alignment:
mode remains inline; v1.11.0 callers see byte-identical responses.
Proposed Implementation
Tool surface changes
Add an optional output_mode input parameter to the five tools above:
Semantics:
inline (default): current behavior, byte-identical to v1.11.0.
handle: response payload is written to trace storage; the tool returns a small descriptor only.
auto: behaves like inline if the serialized payload ≤ output_inline_limit_bytes, otherwise spills to handle.
Handle response shape (uniform across all five tools):
{
  "output_handle": "oh_<base32-12>",
  "mime_type": "application/json" | "application/gzip" | "text/markdown",
  "size_bytes": 184320,
  "item_count": 142,
  "preview": "<first ≤ 2048 bytes of UTF-8-safe content OR null for binary>",
  "expires_at": "2026-05-19T12:34:56Z",
  "fetch_with": "oc_output_fetch"
}
New tool: oc_output_fetch
Add src/tools/oc-output-fetch.ts exposing:
Returns:
{
  "output_handle": "oh_...",
  "offset": 0,
  "limit": 200,
  "returned": 200,
  "total": 1420,
  "next_offset": 200 | null,
  "content": <items | base64-bytes>,
  "eof": false
}
Storage
~/.openchrome/output/<YYYY-MM-DD>/<output_handle>.{json|bin} (atomic write via existing src/utils/atomic-file.ts).
--output-handle-ttl-hours, default 24).
output/ directory (reuse existing 5-minute sweeper in src/core/trace/storage.ts).
output_handle_created event so handles are discoverable via oc_journal.
Tier registration
oc_output_fetch is Tier 1 (always exposed): without it, agents have no way to redeem handles.
Files touched
src/tools/oc-output-fetch.ts, src/core/output/handle-store.ts, tests/core/output-handles.test.ts.
Verification (post-merge, via openchrome MCP)
Setup
Scenario 1: default mode is byte-identical to v1.11.0
Pass: OK. Default callers never see the new envelope.
Scenario 2: output_mode='handle' returns a redeemable descriptor
Pass: handle matches regex, size > 0, preview ≤ 2048 bytes.
Scenario 3: oc_output_fetch paginates correctly
Pass: returned ≤ limit, total ≥ returned, next_offset consistent with eof.
Out of scope
output_mode='handle' is the prescribed lever; inline-mode bytes are unchanged.
read_page to handle mode for sanitized small pages: mode='auto' covers it.
Dependencies
--slim) and refactor(core): standardize tool descriptions with 'When to use / When NOT to use' guidance #841 (description standardization). Composes well with both.
References
apify/apify-mcp-server: call-actor + get-actor-output pattern (Apache-2.0).
docs/roadmap/portability-harness-contract.md: P1, P2, P4 alignment.
src/core/trace/storage.ts: existing JSONL store and DiskMonitor sweep loop.
src/utils/atomic-file.ts: atomic write primitive reused for handle persistence.
Revision history
OpenChrome live verification checklist
Verification targets
Verification evidence
npm run build passed.
npm run lint:tier passed: 521 modules / 1239 dependencies, no dependency violations.
npm run lint:tool-schemas passed: 82 baselined violations, 0 new.
oc_connection_health connected, localhost fixture navigate succeeded.
Per-issue code/test evidence
Artifacts
.omx/reverify-evidence/targeted-jest.log
.omx/reverify-evidence/lint-tier.log
.omx/reverify-evidence/lint-tool-schemas.log
.omx/reverify-evidence/openchrome-live-smoke.log