feat(journal): compact resume handoff summary for long-running sessions #1027

@shaun0927

Background

OpenChrome already journals tool calls and exposes checkpoint/session continuity primitives. The missing layer is a compact, caller-facing handoff summary that helps an LLM resume after context compaction, process restart, or a long-running browser session without redoing completed work.

Relevant OpenChrome anchors:

  • src/journal/task-journal.ts — recent entries, milestones, summaries, redaction.
  • src/tools/journal.ts — the current journal-facing MCP surface; if it lacks a suitable action, extend it or add one in this issue.
  • src/tools/checkpoint.ts — task progress and tab state persistence.
  • src/mcp-server.ts — _sessionContext and hint injection into tool results.

Related issues to avoid duplicating:

Goal

Add a compact resume/handoff summary surface over existing journal/checkpoint data so host agents can recover state without OpenChrome owning the agent lifecycle.

Non-goals

Proposed scope

Expose the summary by extending the existing journal/checkpoint tool surface. Preferred order:

  • oc_journal action: handoff_summary
  • oc_checkpoint action: handoff
  • a small new read-only tool only if the current tool contracts make extension awkward
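
Whichever surface is chosen, the request can stay small. The sketch below shows one possible argument shape for a handoff_summary action, plus a validator. The fields since, checkpointId, and sessionId follow the scoping options named in the acceptance criteria; the type name, function name, and validation rules are assumptions for illustration, not the existing oc_journal contract.

```typescript
// Hypothetical arguments for a read-only handoff summary action.
interface HandoffSummaryArgs {
  action: "handoff_summary";
  since?: string; // timestamp lower bound, e.g. ISO 8601
  checkpointId?: string;
  sessionId?: string;
}

// Validates untrusted tool input and returns a typed, minimal argument object.
function parseHandoffArgs(input: unknown): HandoffSummaryArgs {
  if (typeof input !== "object" || input === null) {
    throw new Error("arguments must be an object");
  }
  const raw = input as Record<string, unknown>;
  if (raw["action"] !== "handoff_summary") {
    throw new Error("unsupported action");
  }
  const args: HandoffSummaryArgs = { action: "handoff_summary" };
  for (const key of ["since", "checkpointId", "sessionId"] as const) {
    const value = raw[key];
    if (value === undefined) continue;
    if (typeof value !== "string" || value.length === 0) {
      throw new Error(`${key} must be a non-empty string`);
    }
    args[key] = value;
  }
  if (args.since !== undefined && Number.isNaN(Date.parse(args.since))) {
    throw new Error("since must be a parseable timestamp");
  }
  return args;
}
```

All three scoping fields are optional, so a call with only the action name returns the widest summary that retention allows.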

The output should be structured JSON with:

  • period: start/end timestamps and source checkpoint id when supplied
  • currentState: latest known session id, tab ids, URL/title, tab health when present in persisted artifacts; otherwise return an explicit unavailable reason
  • completedMilestones: condensed journal milestones since checkpoint or timestamp
  • recentFailures: failed tool calls grouped by tool/signature/error class
  • stuckSignals: recent ProgressTracker/HintEngine stuck or stalling hints when present in persisted artifacts; otherwise return an explicit unavailable reason
  • pendingSteps: from checkpoint task state when present
  • recommendedRecoveryOptions: non-authoritative options such as refresh DOM snapshot, choose a different selector, reload page, ask user for auth/CAPTCHA
  • limits: explicit statement when data is unavailable, redacted, stale, or outside retention
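
To make the field list above concrete, here is a minimal TypeScript sketch of the output shape together with the empty-state value. The top-level keys come from the list; every nested field name, the Unavailable type, and the reason strings are assumptions and would need to be aligned with what TaskJournal and the checkpoint store actually persist.

```typescript
// Hypothetical output schema. Top-level keys mirror the proposed field list;
// nested field names are illustrative assumptions.
type Unavailable = { unavailable: true; reason: string };

interface HandoffSummary {
  period: { start: string | null; end: string; checkpointId: string | null };
  currentState:
    | { sessionId: string; tabs: { tabId: string; url: string; title: string; health?: string }[] }
    | Unavailable;
  completedMilestones: { entryId: string; at: string; summary: string }[];
  recentFailures: { tool: string; signature: string; errorClass: string; count: number }[];
  stuckSignals: { at: string; hint: string }[] | Unavailable;
  pendingSteps: string[];
  recommendedRecoveryOptions: string[]; // suggestions only, never executed
  limits: string[];
}

// Empty but valid summary for the no-journal, no-checkpoint case.
function emptyHandoffSummary(nowIso: string): HandoffSummary {
  return {
    period: { start: null, end: nowIso, checkpointId: null },
    currentState: { unavailable: true, reason: "no persisted session or tab state" },
    completedMilestones: [],
    recentFailures: [],
    stuckSignals: { unavailable: true, reason: "no persisted stuck or stalling hints" },
    pendingSteps: [],
    recommendedRecoveryOptions: [],
    limits: ["no journal or checkpoint data found"],
  };
}
```

Modelling "unavailable" as an explicit variant, rather than null or an empty array, lets a host distinguish "nothing happened" from "nothing was recorded".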

Implementation checkpoints

  1. Identify the current journal MCP surface and choose the smallest compatible action name.
  2. Build the read-only summary generator over TaskJournal and checkpoint metadata.
  3. Add deterministic redaction, grouping, scoping, and size caps.
  4. Add empty-state and restart-persistence tests.
  5. Add the real-server E2E scenario and document the exact output schema.
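
Checkpoints 2 and 3 can be illustrated with a small pure function. The sketch below groups failed journal entries by tool, signature, and error class, applies a since scope, and caps the number of groups and sample IDs so output stays deterministic. The JournalEntry and FailureGroup shapes are hypothetical stand-ins; the real entry type lives in src/journal/task-journal.ts and may differ.

```typescript
// Hypothetical, simplified journal entry; not the actual TaskJournal type.
interface JournalEntry {
  id: string;
  at: number; // epoch milliseconds
  tool: string;
  ok: boolean;
  signature?: string; // e.g. a normalized selector or ref
  errorClass?: string;
}

interface FailureGroup {
  tool: string;
  signature: string;
  errorClass: string;
  count: number;
  firstAt: number;
  lastAt: number;
  sampleEntryIds: string[]; // capped, keeps the group auditable
}

function groupFailures(
  entries: JournalEntry[],
  since = 0,
  maxGroups = 10,
  maxSamples = 3,
): FailureGroup[] {
  const groups = new Map<string, FailureGroup>();
  for (const entry of entries) {
    if (entry.ok || entry.at < since) continue; // scope: failures at or after `since`
    const signature = entry.signature ?? "";
    const errorClass = entry.errorClass ?? "unknown";
    const key = JSON.stringify([entry.tool, signature, errorClass]);
    const group = groups.get(key);
    if (group === undefined) {
      groups.set(key, {
        tool: entry.tool, signature, errorClass,
        count: 1, firstAt: entry.at, lastAt: entry.at, sampleEntryIds: [entry.id],
      });
    } else {
      group.count += 1;
      group.firstAt = Math.min(group.firstAt, entry.at);
      group.lastAt = Math.max(group.lastAt, entry.at);
      if (group.sampleEntryIds.length < maxSamples) group.sampleEntryIds.push(entry.id);
    }
  }
  // Most frequent first, then most recent, then by tool name for a stable order.
  return Array.from(groups.values())
    .sort((a, b) =>
      b.count - a.count || b.lastAt - a.lastAt || (a.tool < b.tool ? -1 : a.tool > b.tool ? 1 : 0))
    .slice(0, maxGroups);
}
```

Because the function only reads its input and returns new objects, it fits the read-only requirement and is cheap to call repeatedly.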

Acceptance criteria

  • Summary generation is read-only and safe to call frequently.
  • Sensitive args remain redacted using the journal redaction rules.
  • The summary remains small and bounded; large tool args/results are summarized, not embedded.
  • The summary can be scoped by since, checkpointId, and/or sessionId where existing data permits.
  • Empty/no-journal states return a useful empty summary, not an error.
  • The response clearly distinguishes evidence from recommendations.
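
The redaction and "summarized, not embedded" criteria can be sketched as one helper that turns raw tool args into short, safe strings. The key pattern below is a placeholder assumption; a real implementation should call the existing journal redaction rules rather than define its own, and this sketch only inspects top-level keys.

```typescript
// Placeholder pattern for illustration only. The real rules are the journal's
// redaction rules, which this sketch does not reproduce.
const SENSITIVE_KEY = /pass(word)?|token|secret|cookie|authorization/i;

// Produces a bounded, redacted view of tool args. Only top-level keys are
// checked, so nested sensitive fields would need the journal's own redactor.
function summarizeArgs(
  args: Record<string, unknown>,
  maxValueChars = 80,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of Object.keys(args).sort()) { // sorted for deterministic output
    if (SENSITIVE_KEY.test(key)) {
      out[key] = "[REDACTED]";
      continue;
    }
    const raw = args[key];
    const json: string | undefined = JSON.stringify(raw);
    const text = typeof raw === "string" ? raw : (json ?? String(raw));
    out[key] = text.length > maxValueChars
      ? `${text.slice(0, maxValueChars)}… (${text.length} chars)`
      : text;
  }
  return out;
}
```

For example, a 500-character html argument becomes its first 80 characters followed by a length note, while a password argument is replaced outright.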

Tests

Add tests covering:

  • no journal/checkpoint data returns an empty but valid summary.
  • milestones are included and ordered.
  • failures are grouped without leaking redacted args.
  • scoping by checkpoint or timestamp excludes older entries.
  • output remains under a deterministic size cap for a large synthetic journal.
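
The size-cap test needs a deterministic truncation rule to assert against. One possible rule, sketched below under assumed names, drops the oldest milestones until the serialized summary fits and records the omission in limits. The BoundedSummary shape is deliberately reduced to two fields, and the cap is measured in characters of JSON rather than bytes.

```typescript
// Reduced, hypothetical summary shape: just the fields this rule touches.
interface BoundedSummary {
  completedMilestones: string[]; // ordered oldest to newest
  limits: string[];
}

// Drops the oldest milestones one at a time until the JSON fits in maxChars.
// Linear scan with a re-serialization per step; acceptable for small caps.
function capSummary(summary: BoundedSummary, maxChars = 4000): BoundedSummary {
  const all = summary.completedMilestones;
  for (let dropped = 0; dropped <= all.length; dropped++) {
    const candidate: BoundedSummary = {
      completedMilestones: all.slice(dropped),
      limits: dropped === 0
        ? summary.limits.slice()
        : summary.limits.concat(
            `${dropped} older milestone(s) omitted to stay under ${maxChars} characters`),
    };
    if (JSON.stringify(candidate).length <= maxChars) return candidate;
  }
  throw new Error("size cap is too small for even an empty summary");
}
```

Keeping the newest milestones favors the resume use case, and the limits entry tells the host that older history exists in the journal.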

Real OpenChrome verification after implementation

Add an E2E scenario using a real OpenChrome MCP server:

  1. Start OpenChrome with a temp home/profile.
  2. Navigate to a local fixture page.
  3. Perform at least one successful read/action and one intentionally failing call, such as find for a missing selector or an interaction with a stale/missing ref.
  4. Save an oc_checkpoint with completed/pending steps.
  5. Call the handoff summary action.
  6. Assert the summary contains:
    • latest URL/title or tab id evidence,
    • the completed milestone,
    • the failed call grouped under recentFailures,
    • redacted sensitive fields if any were included in tool args,
    • a bounded recommendation list that does not claim automatic recovery occurred.
  7. Restart OpenChrome with the same temp home and assert the handoff summary can still be produced from persisted artifacts.
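
The assertions in step 6 can be collected into one pure checker that the E2E scenario runs before and after the restart in step 7. This is a sketch over a hypothetical, reduced summary shape; it does not start a server or call any OpenChrome API, and the field names are assumptions that must match the final schema.

```typescript
// Hypothetical minimal view of the fields that step 6 inspects.
interface SummaryUnderTest {
  currentState: { tabs?: { tabId: string; url?: string; title?: string }[] };
  completedMilestones: { summary: string }[];
  recentFailures: { tool: string; count: number }[];
  recommendedRecoveryOptions: string[];
}

interface Expectations {
  milestone: string;  // text the completed milestone should contain
  failedTool: string; // tool used for the intentionally failing call
  secret: string;     // sensitive value passed in tool args, must not appear
  maxOptions: number; // upper bound on the recommendation list
}

// Returns a list of problems; an empty list means all checks passed.
function checkHandoffEvidence(s: SummaryUnderTest, expected: Expectations): string[] {
  const problems: string[] = [];
  const tabs = s.currentState.tabs ?? [];
  if (tabs.length === 0) problems.push("no tab id, URL, or title evidence");
  if (!s.completedMilestones.some((m) => m.summary.includes(expected.milestone))) {
    problems.push("completed milestone missing");
  }
  if (!s.recentFailures.some((f) => f.tool === expected.failedTool && f.count >= 1)) {
    problems.push("failed call not grouped under recentFailures");
  }
  if (JSON.stringify(s).includes(expected.secret)) {
    problems.push("sensitive value leaked into summary");
  }
  if (s.recommendedRecoveryOptions.length > expected.maxOptions) {
    problems.push("recommendation list is not bounded");
  }
  return problems;
}
```

Whether a recommendation wrongly claims that recovery already occurred is not checked here; that is easier to enforce through fixed recommendation wording or review than through string matching.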

Suggested command evidence:

  • New targeted journal/handoff unit tests.
  • npm run test:e2e -- --runTestsByPath tests/e2e/scenarios/journal-handoff.e2e.ts.

Direction-fit review notes

This is directionally sound only as a read-only summary layer. It should help host LLMs reduce wandering after compaction but must not become a second task ledger or planner. If #855 or #873 already implements equivalent fields by the time this is picked up, close this as duplicate or narrow it to journal-derived failure/stuck summarization.

Curated scope, overlap handling, and verification checklist

Scope classification

  • Canonical lane: compact caller-facing resume/handoff summary over journal/checkpoint data.
  • Primary deliverable: resume summary surface that helps a host recover after context compaction/restart without OpenChrome owning lifecycle.
  • Open PR: feat(journal): add compact resume handoff summary (#1027) #1113 (feat/1027-journal-handoff). Continue there.
  • Non-goal: new agent lifecycle manager, full task ledger replacement, unbounded journal dump, or storing secrets in summaries.

Overlap and conflict resolution

Implementation checklist

  • Add or extend a journal-facing MCP surface for compact resume/handoff summary with latest goal/context, completed work, current page/tab, blockers, and evidence links.
  • Derive summaries from existing journal/checkpoint data with redaction and strict size bounds.
  • Include source entry IDs/timestamps so the summary remains auditable.
  • Add tests for normal summary, empty journal, long journal truncation, redaction, restart/reload, and checkpoint linkage.
  • Document when hosts should request the summary before/after compaction.

Success criteria

  • A host can resume a long session using a compact summary without replaying the full journal.
  • The summary is bounded, redacted, and source-linked.
  • Missing journal/checkpoint data yields clear partial/empty diagnostics.
  • Existing journal behavior remains backward-compatible.

Post-merge OpenChrome live verification checklist

  • Run a local multi-step session, create checkpoint/journal entries, then call the resume summary surface.
  • Verify the summary contains completed work, current context, blockers if any, and source evidence links within the documented size bound.
  • Restart OpenChrome and verify the same summary can be regenerated or loaded.
  • Include sanitized summary output and source IDs in merge notes.

Metadata

Labels

P1, enhancement, observability, reliability