diff --git a/docs/rfcs/live-to-final-assistant-replies.md b/docs/rfcs/live-to-final-assistant-replies.md
index 1fa138a6f4..68e8093dda 100644
--- a/docs/rfcs/live-to-final-assistant-replies.md
+++ b/docs/rfcs/live-to-final-assistant-replies.md
@@ -1,82 +1,173 @@
# Live-to-Final Assistant Replies for Long-Running Agent Sessions
-- **Status:** Proposed
+- **Status:** Accepted (parent contract; implementation tracked in [#3400](https://github.com/nesquena/hermes-webui/issues/3400))
- **Author:** @franksong2702
- **Created:** 2026-06-03
- **Tracking issue:** [#3400](https://github.com/nesquena/hermes-webui/issues/3400)
-## Background
+## Background: Long-Running Sessions Are The Anchor
-This RFC is anchored on long-running agent sessions.
+This RFC defines the product model for assistant replies in long-running agent
+sessions.
-Short conversations are useful sanity checks, but they do not exercise the
-hardest browser-agent states. A long-running session can spend minutes waiting,
-make many tool calls, produce a long final answer, cross context-pressure
-boundaries, hit tool-call or retry limits, lose network continuity, and still
-need to recover into a readable final transcript.
+Short conversations are still useful sanity checks, but they do not exercise
+the hardest browser-agent states. A long-running session can:
-The product model should therefore be defined against the long-running case. A
+- keep the user waiting for minutes,
+- make many tool calls,
+- produce a long final answer,
+- create or update workspace artifacts,
+- cross Auto Compression boundaries,
+- hit tool-call, retry, or iteration limits,
+- lose browser, network, or SSE continuity,
+- receive a user cancel or interruption request while startup is still racing,
+- switch sessions or reload before the turn settles.
+
+The design should therefore be judged against the long-running case first. A
short conversation should be the same lifecycle with fewer events, not a
separate UI model.
-## Problem
+The goal is neither to add a Worklog widget nor to make Auto Compression or
+duplicate stream ownership the headline. Those are supporting slices and
+edge cases. The headline is one coherent assistant reply lifecycle: live work,
+supporting activity, terminal outcome, and final answer.
+
+## Product Problem
Hermes WebUI currently uses one chat surface to represent several different
-things:
+meanings:
- the assistant's live process text while work is still running,
- tool activity and lifecycle status that support that work,
- recovery or replay state after refresh, reconnect, or session switching,
+- terminal outcomes such as cancel, interruption, no response, or tool limit,
- the final answer after the turn settles.
-Those meanings have repeatedly competed for the same visual space. The result is
-that some long-running sessions feel noisy, some look silent while the agent is
-working, some recover into a different shape after reconnect, and some terminal
-edge cases can appear completed even when no final answer was produced.
+Those meanings have repeatedly competed for the same visual space. Some
+long-running sessions feel noisy, some look silent while the agent is working,
+some recover into a different shape after reconnect, and some terminal edge
+cases can appear completed even when no final answer was produced.
-This RFC defines the product semantics for that lifecycle.
+This RFC defines the product semantics that implementation PRs and follow-up
+RFCs should preserve.
-## Public Issue Signals
+## Scope
-The public issue history already shows the same problem recurring from several
-directions. The table below uses representative examples; the broader inventory
-lives in [#3400](https://github.com/nesquena/hermes-webui/issues/3400).
+### This RFC owns
-| Signal | Examples | Product implication |
+- The visible lifecycle of one assistant reply from live work to final or
+ terminal outcome.
+- The boundary between process prose, tool activity, lifecycle status, and the
+ final answer.
+- Long-running edge-case semantics for Auto Compression, no-final answers,
+ tool/iteration limits, cancel/interruption, replay/reconnect/session switch,
+ produced artifacts/output handoff, and sidebar/session ownership.
+- The classification of work into implemented slices, active PRs, confirmed
+ follow-ups, and child RFCs.
+
+### This RFC does not own
+
+- Pixel-level styling.
+- Provider/model selection.
+- A backend tool-event schema change such as a shared display-title field.
+- A new runtime adapter, runner process, storage format, or SSE protocol.
+- Rich artifact rendering, executable HTML, visualization plugins, or Canvas
+ editing surfaces. This RFC only owns how produced artifacts remain findable
+ from the reply lifecycle.
+- The full command semantics for Queue, Steer, Stop-and-send, and Interrupt.
+ Those belong to the pending-intent control-surface contract tracked by
+ [#3058](https://github.com/nesquena/hermes-webui/issues/3058) and
+ [#3061](https://github.com/nesquena/hermes-webui/pull/3061).
+
+## Public Inventory
+
+This inventory groups representative public issues and PRs by the
+long-running-session concern they expose. It is not a claim that every linked
+item is solved by this RFC. The classification column records durable scope,
+not current open/merged/superseded state: for live status, the tracking issue
+[#3400](https://github.com/nesquena/hermes-webui/issues/3400) is authoritative.
+
+| Concern | Representative signals | Current classification |
| --- | --- | --- |
-| Working output and final answer can blur together | [#536](https://github.com/nesquena/hermes-webui/issues/536) | Running process and final answer need separate semantics. |
-| Compression state is hard to represent cleanly | [#469](https://github.com/nesquena/hermes-webui/issues/469), [#2973](https://github.com/nesquena/hermes-webui/issues/2973), [#3079](https://github.com/nesquena/hermes-webui/issues/3079) | Context compression should be visible while useful, but not become final transcript content. |
-| Replay, reconnect, and session switching can lose active context | [#2283](https://github.com/nesquena/hermes-webui/issues/2283), [#2924](https://github.com/nesquena/hermes-webui/issues/2924), [#3391](https://github.com/nesquena/hermes-webui/issues/3391) | A recovered session should rebuild the same reply lifecycle as the live render. |
-| Tool, activity, thinking, and progress rendering can become noisy or silent | [#1298](https://github.com/nesquena/hermes-webui/issues/1298), [#3014](https://github.com/nesquena/hermes-webui/issues/3014), [#3015](https://github.com/nesquena/hermes-webui/issues/3015) | Process text should stay primary; tool activity should remain supporting detail. |
-| Terminal turns can end without a real final answer | [#3315](https://github.com/nesquena/hermes-webui/issues/3315), [#3316](https://github.com/nesquena/hermes-webui/issues/3316) | No-final, compression-exhausted, and tool-limit outcomes need explicit terminal states. |
-| Stream ownership and cancellation affect what the user sees | [#3344](https://github.com/nesquena/hermes-webui/issues/3344), [#3345](https://github.com/nesquena/hermes-webui/issues/3345) | One visible turn must own its live, terminal, and final events. |
-| Session awareness affects live work visibility | [#856](https://github.com/nesquena/hermes-webui/issues/856), [#1370](https://github.com/nesquena/hermes-webui/issues/1370), [#1436](https://github.com/nesquena/hermes-webui/issues/1436) | Sidebar/session state must not contradict the visible active session. |
-| Busy input changes live-session control | [#720](https://github.com/nesquena/hermes-webui/issues/720), [#965](https://github.com/nesquena/hermes-webui/pull/965), [#1062](https://github.com/nesquena/hermes-webui/pull/1062) | Queue, steer, and interrupt are adjacent controls for long-running sessions, but command-level behavior belongs to a separate control-surface contract. |
-
-## Goals
-
-- Define the product model for assistant replies in long-running sessions.
-- Make live process text, tool activity, lifecycle status, and the final answer
- share one coherent turn lifecycle.
-- Preserve the same lifecycle through replay, reconnect, refresh, and session
- switching.
-- Name terminal outcomes honestly when the run does not produce a normal final
- answer.
-- Define which long-running edge cases belong to the first slice and which
- should be handled by later slices.
-
-## Non-goals
-
-This RFC does not define:
-
-- pixel-level styling,
-- provider/model selection,
-- command-level queue/steer/interrupt behavior,
-- a new runtime adapter, storage format, or SSE protocol,
-- a backend tool-event schema change such as a shared display-title field.
+| Live work vs final answer boundary | [#536](https://github.com/nesquena/hermes-webui/issues/536), [#3400](https://github.com/nesquena/hermes-webui/issues/3400), [#3464](https://github.com/nesquena/hermes-webui/pull/3464) | Main product scope. #3464 landed the first RFC; this document is the parent contract for follow-up slices. |
+| First live-to-final reply implementation | [#3401](https://github.com/nesquena/hermes-webui/pull/3401), [#3014](https://github.com/nesquena/hermes-webui/issues/3014), [#3015](https://github.com/nesquena/hermes-webui/pull/3015) | First implementation slice. It should keep using `Refs #3400`; it does not close the umbrella. |
+| Auto Compression visibility and context pressure | [#469](https://github.com/nesquena/hermes-webui/issues/469), [#2973](https://github.com/nesquena/hermes-webui/issues/2973), [#3079](https://github.com/nesquena/hermes-webui/issues/3079), [#3315](https://github.com/nesquena/hermes-webui/issues/3315), [#3316](https://github.com/nesquena/hermes-webui/pull/3316) | Supporting edge case. Running compression is live lifecycle status; compression-exhausted/no-final finalization is a terminal-state follow-up. |
+| Replay, reconnect, session switch, and reattach | [#2283](https://github.com/nesquena/hermes-webui/pull/2283), [#2924](https://github.com/nesquena/hermes-webui/issues/2924), [#3391](https://github.com/nesquena/hermes-webui/pull/3391) | Supporting recovery infrastructure. The product requirement is same lifecycle after replay, or an explicit degraded/restoring state. |
+| Tool, activity, thinking, and visible progress | [#1298](https://github.com/nesquena/hermes-webui/issues/1298), [#3014](https://github.com/nesquena/hermes-webui/issues/3014), [#3015](https://github.com/nesquena/hermes-webui/pull/3015) | Main reply-rendering concern. Process prose stays primary; tool/reasoning/debug detail stays supporting. |
+| No-final and terminal failure outcomes | [#3315](https://github.com/nesquena/hermes-webui/issues/3315), [#3316](https://github.com/nesquena/hermes-webui/pull/3316) | Confirmed follow-up / active PR scope. A tool-tail or compression-exhausted run must not settle as normal completion without a real final answer. |
+| Cancellation and stream ownership | [#3344](https://github.com/nesquena/hermes-webui/issues/3344), [#3345](https://github.com/nesquena/hermes-webui/pull/3345), [#3475](https://github.com/nesquena/hermes-webui/issues/3475), [#3476](https://github.com/nesquena/hermes-webui/pull/3476) | Supporting cancel/recovery scope. Early-cancel worker reconciliation is addressed by [#3476](https://github.com/nesquena/hermes-webui/pull/3476); frontend cancel owner-guard hardening is the remaining follow-up. |
+| Produced artifacts and output handoff | [#2655](https://github.com/nesquena/hermes-webui/issues/2655), [#2673](https://github.com/nesquena/hermes-webui/pull/2673), [#2881](https://github.com/nesquena/hermes-webui/issues/2881), [#2938](https://github.com/nesquena/hermes-webui/pull/2938), [#3329](https://github.com/nesquena/hermes-webui/pull/3329), [#3348](https://github.com/nesquena/hermes-webui/pull/3348), [#3528](https://github.com/nesquena/hermes-webui/issues/3528) | Supporting session-output concern. Existing Artifacts and `workspace://` surfaces make produced files findable; long-running replay/cancel/terminal paths must not lose the tool metadata needed to recover that handoff. |
+| Sidebar/session ownership and active-session awareness | [#856](https://github.com/nesquena/hermes-webui/issues/856), [#1370](https://github.com/nesquena/hermes-webui/pull/1370), [#1436](https://github.com/nesquena/hermes-webui/issues/1436) | Confirmed follow-up scope when sidebar/session metadata contradicts the visible active turn. |
+| User intervention during live work | [#720](https://github.com/nesquena/hermes-webui/issues/720), [#965](https://github.com/nesquena/hermes-webui/pull/965), [#1062](https://github.com/nesquena/hermes-webui/pull/1062), [#3058](https://github.com/nesquena/hermes-webui/issues/3058), [#3061](https://github.com/nesquena/hermes-webui/pull/3061) | Child RFC scope. This parent RFC only requires that controls preserve ownership, replay, and terminal honesty. |
## Product Model
+### Lifecycle flow
+
+The lifecycle below is a product-state model, not a backend schema or
+wire-event contract. At settle time, the visible reply state should be derived
+from durable transcript truth, available terminal evidence, and reply
+ownership. A turn should not be marked `completed` only because live activity
+or partial assistant prose existed earlier.
+
+```mermaid
+%%{init: {"theme": "neutral"}}%%
+flowchart TD
+ A([User sends message]) --> B["Turn created<br/>reply ownership established"]
+ B --> C["Live phase<br/>process prose + quiet tool activity"]
+ C --> D{Lifecycle event}
+
+ D -- stream continues --> C
+ D -- reload / reconnect / session switch --> E["Recovery and replay<br/>rebuild the same lifecycle from durable state"]
+ E --> F{Same turn recovered?}
+ F -- yes --> C
+ F -- not yet --> G["Restoring or degraded state<br/>do not mark completed from missing live data"]
+ G --> D
+
+ D -- user cancels --> H["Cancel requested<br/>settle only the owned reply"]
+ H --> I["Settle decision<br/>durable transcript truth + terminal evidence + reply ownership"]
+ D -- run ended / terminal evidence --> I
+
+ I --> J{Event belongs to<br/>the current visible reply?}
+ J -- no --> K["Ignore stale event<br/>do not mutate the current visible reply"]
+ J -- yes --> L{Final assistant answer present<br/>and terminal evidence is normal?}
+
+ L -- yes --> M["completed<br/>activity summary above final answer"]
+ L -- no --> N{Specific terminal outcome}
+ N -- cancelled --> O["cancelled<br/>user stopped the turn"]
+ N -- interrupted --> P["interrupted<br/>continuity lost before final answer"]
+ N -- compression_exhausted --> Q["compression_exhausted<br/>compression could not continue safely"]
+ N -- tool_limit_reached --> R["tool_limit_reached<br/>tool / retry / iteration ceiling hit"]
+ N -- no_response --> S["no_response<br/>no usable assistant final content"]
+ N -- other failure --> T["error<br/>fallback for other terminal failures"]
+
+ M --> U["Settled reply visible<br/>supporting activity collapsed;<br/>artifacts and workspace outputs findable"]
+ O --> U
+ P --> U
+ Q --> U
+ R --> U
+ S --> U
+ T --> U
+```
+
+### Reply ownership
+
+One visible assistant reply belongs to exactly one user turn and, while its run
+is active, to exactly one run/stream identity.
+
+Requirements:
+
+- A live event should attach to the assistant reply that owns the run.
+- A later turn in the same session must not inherit stale live events from an
+ older stream.
+- A background session can continue running, but its live stream should not
+ mutate the visible pane for another session.
+- A terminal event should settle the same turn it belongs to, or route through
+ a background/error path if the user is no longer viewing that session.
+- Sidebar state should not contradict the visible owner. If the sidebar says a
+ session is running, opening it should show live work, a restoring/degraded
+ state, or an honest terminal state.
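The ownership requirements above can be illustrated with a small guard. This is a minimal sketch, not the WebUI's actual code: `StreamEvent`, `VisiblePane`, and the `session_id`/`stream_id` fields are hypothetical names chosen for illustration, and it assumes every live event carries both identifiers.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StreamEvent:
    # Hypothetical event shape: assumes each event names its session and run.
    session_id: str
    stream_id: str
    text: str


@dataclass
class VisiblePane:
    # The session and run that currently own the visible assistant reply.
    session_id: str
    stream_id: str
    chunks: list[str] = field(default_factory=list)

    def owns(self, event: StreamEvent) -> bool:
        return (
            event.session_id == self.session_id
            and event.stream_id == self.stream_id
        )

    def apply(self, event: StreamEvent) -> bool:
        """Append the event only if it belongs to the visible reply."""
        if not self.owns(event):
            # Stale or background event: leave the visible reply untouched.
            return False
        self.chunks.append(event.text)
        return True
```

In this sketch a rejected event is simply dropped from the visible pane; a real implementation would presumably still route background-session events to their own session state.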
+
### Live phase
While a turn is running, the assistant reply should read as a live process
@@ -89,6 +180,8 @@ Requirements:
- Tool rows and tool groups are collapsed by default.
- Full commands, arguments, raw output, and large payloads stay behind deeper
disclosure.
+- Thinking/reasoning that is not user-facing progress should not be the only
+ visible signal that work is happening.
- The run timer/status belongs with the active live turn, not as a top
transcript artifact.
- Running-only lifecycle markers are transient.
@@ -104,17 +197,17 @@ Requirements:
- A compact activity summary appears above the final answer.
- The activity summary is collapsed by default.
- Expanding it reveals readable process history and tool history.
-- Raw command/output detail remains behind a deeper disclosure.
+- Raw command/output detail remains behind deeper disclosure.
- The final answer remains ordinary assistant prose below the summary.
-- Running-only markers disappear from the settled transcript unless they explain
- a visible error or recovery outcome.
+- Running-only markers disappear from the settled transcript unless they
+ explain a visible error or recovery outcome.
- Very long final answers remain complete and readable. They should not be
hidden inside the activity summary or replaced by a progress/status artifact.
### Recovery and replay
-Refresh, reconnect, session switching, and replay should preserve the same reply
-model.
+Refresh, reconnect, session switching, and replay should preserve the same
+reply model.
Requirements:
@@ -124,43 +217,46 @@ Requirements:
- If the exact live scene cannot be reconstructed immediately, the UI should
show an explicit restoring or degraded state instead of an empty running
shell.
+- Replay must be idempotent. It should not duplicate tokens, progress prose,
+ reasoning, tool rows, compression rows, or terminal cards.
- Old in-progress browser state must not override durable session truth.
- Recovery/control events stay internal unless they describe a user-visible
terminal outcome.
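One way to read the idempotency requirement is that applying the same event twice must leave the rebuilt reply unchanged. The sketch below assumes each replayed event carries a stable identifier; `ReplyTimeline`, `rebuild`, and the `(event_id, kind, payload)` tuple shape are hypothetical and not part of any existing protocol.

```python
from dataclasses import dataclass, field


@dataclass
class ReplyTimeline:
    # Identifiers of events already applied to this reply.
    seen_ids: set[str] = field(default_factory=set)
    # Ordered (kind, payload) rows: prose, tool rows, compression rows, etc.
    rows: list[tuple[str, str]] = field(default_factory=list)

    def apply(self, event_id: str, kind: str, payload: str) -> bool:
        """Apply an event once; repeated deliveries are ignored."""
        if event_id in self.seen_ids:
            return False
        self.seen_ids.add(event_id)
        self.rows.append((kind, payload))
        return True


def rebuild(events: list[tuple[str, str, str]]) -> ReplyTimeline:
    """Rebuild a reply from replayed events, tolerating duplicates."""
    timeline = ReplyTimeline()
    for event_id, kind, payload in events:
        timeline.apply(event_id, kind, payload)
    return timeline
```

Under this assumption, replaying an overlapping window after reconnect yields the same rows as the original live render.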
### Terminal outcomes
-Every turn needs a terminal outcome. A turn without a final answer must not look
-like a normal completed answer.
+Every turn needs a terminal outcome. A turn without a final answer must not
+look like a normal completed answer.
Required product states:
-- **completed**: the assistant produced a final answer and the turn settled
- normally.
-- **cancelled**: the user stopped the turn.
-- **interrupted**: browser, stream, worker, or network continuity was lost
- before a final answer was produced.
-- **compression exhausted**: context compression could not create enough room to
- continue safely.
-- **tool limit reached**: the run hit a tool-call, retry, or iteration ceiling
- before a final answer was produced.
-- **no response**: the provider or runtime returned no usable assistant content.
-- **error**: fallback for failures that do not fit the above states.
-
-Copy can evolve, but these semantic distinctions should stay stable in live
-rendering, settled rendering, and replay.
+| State | Meaning |
+| --- | --- |
+| `completed` | The assistant produced a final answer and the turn settled normally. |
+| `cancelled` | The user stopped the turn. |
+| `interrupted` | Browser, stream, worker, runtime, or network continuity was lost before a final answer was produced. |
+| `compression_exhausted` | Context compression could not create enough room to continue safely. |
+| `tool_limit_reached` | The run hit a tool-call, retry, or iteration ceiling before a final answer was produced. |
+| `no_response` | The provider or runtime returned no usable assistant final content. |
+| `error` | Fallback for failures that do not fit the above states. |
+
+These identifiers name product states, not a wire enum or a persisted schema
+contract; consistent with Scope, this RFC does not mandate a backend field or
+event shape for them. Copy can evolve, but these semantic distinctions should
+stay stable in live rendering, settled rendering, and replay.
When more than one terminal condition applies, the more specific condition
-should win over the generic fallback. For example, cancelled, compression
-exhausted, tool limit reached, and no response should not be flattened into a
-plain error only because the turn also failed to produce a final answer.
+should win over the generic fallback. For example, `cancelled`,
+`compression_exhausted`, `tool_limit_reached`, and `no_response` should not be
+flattened into a plain `error` only because the turn also failed to produce a
+final answer.
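The precedence rule above can be sketched as a small classification function. This is illustrative only: the RFC does not mandate a schema, so `SettleEvidence` and its flags are hypothetical, and the relative order among the specific outcomes is an assumption rather than something this document defines.

```python
from dataclasses import dataclass
from enum import Enum


class ReplyState(Enum):
    COMPLETED = "completed"
    CANCELLED = "cancelled"
    INTERRUPTED = "interrupted"
    COMPRESSION_EXHAUSTED = "compression_exhausted"
    TOOL_LIMIT_REACHED = "tool_limit_reached"
    NO_RESPONSE = "no_response"
    ERROR = "error"


@dataclass(frozen=True)
class SettleEvidence:
    # Hypothetical evidence flags gathered at settle time.
    has_final_answer: bool = False
    user_cancelled: bool = False
    continuity_lost: bool = False
    compression_exhausted: bool = False
    tool_limit_reached: bool = False
    provider_returned_empty: bool = False
    failed: bool = False


def settle_state(ev: SettleEvidence) -> ReplyState:
    # Specific outcomes are checked first so they are never flattened
    # into the generic error fallback. The order here is an assumption.
    specific = [
        (ev.user_cancelled, ReplyState.CANCELLED),
        (ev.compression_exhausted, ReplyState.COMPRESSION_EXHAUSTED),
        (ev.tool_limit_reached, ReplyState.TOOL_LIMIT_REACHED),
        (ev.continuity_lost, ReplyState.INTERRUPTED),
        (ev.provider_returned_empty, ReplyState.NO_RESPONSE),
    ]
    for applies, state in specific:
        if applies:
            return state
    if ev.failed:
        return ReplyState.ERROR
    if ev.has_final_answer:
        return ReplyState.COMPLETED
    # No final answer and no other evidence: do not claim completion.
    return ReplyState.NO_RESPONSE
```

For example, `settle_state(SettleEvidence(tool_limit_reached=True, failed=True))` yields `ReplyState.TOOL_LIMIT_REACHED` rather than `ReplyState.ERROR`, and a turn with earlier activity but no final answer never settles as `completed`.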
## Long-Running Edge Cases
### Auto Compression
-Auto Compression is a context lifecycle transition, not a tool call and not final
-answer content.
+Auto Compression is a context lifecycle transition, not a tool call and not
+final answer content.
Expected behavior:
@@ -171,26 +267,24 @@ Expected behavior:
remain understandable without turning compression into the main transcript.
- Do not keep compression status text in the settled transcript unless it
explains an error or recovery state.
+- If compression fails to create enough room, surface `compression_exhausted`
+ or another specific terminal outcome instead of normal completion.
- Compression success in the UI does not by itself prove model-facing context
- was pruned; that remains a separate runtime/context invariant.
-
-### Very long final answers
-
-Long-running sessions can end with a final answer that is itself lengthy.
+ was pruned; that remains a runtime/context invariant covered by the run-state
+ consistency contract.
-Expected behavior:
+Confirmed follow-up scope:
-- The final answer remains the primary settled assistant content.
-- Supporting activity stays above it and collapsed by default.
-- Streaming and settle transitions should not jump the user away from the final
- answer or make the answer look like tool output.
-- Any additional collapse, preview, or navigation affordance for very long final
- answers should preserve the full answer as ordinary assistant prose.
+- Add or standardize an explicit per-pass compression completion event if the
+ UI otherwise has to infer completion from later stream events.
+- Keep compression-exhausted/no-final handling aligned with
+ [#3315](https://github.com/nesquena/hermes-webui/issues/3315) and
+ [#3316](https://github.com/nesquena/hermes-webui/pull/3316).
-### Tool-call and retry ceilings
+### Tool-call, retry, and iteration ceilings
-Long-running sessions can exhaust tool-call limits, retry budgets, or iteration
-ceilings before a final answer is available.
+Long-running sessions can exhaust tool-call limits, retry budgets, or
+iteration ceilings before a final answer is available.
Expected behavior:
@@ -198,10 +292,12 @@ Expected behavior:
- Preserve the readable work history that led to the limit.
- Keep the final area honest: show that the run stopped because a limit was
reached rather than inventing a final answer.
-- Do not persist internal control prompts as ordinary user-visible transcript
- content.
+- Internal continuation or control prompts used by the runtime must not persist
+ as ordinary user-authored transcript content.
+- The product state should not depend on whether the limit came from provider
+ policy, Hermes Agent iteration budget, or WebUI adapter/runtime policy.
-### No-final and provider failure
+### No-final answer and provider failure
Tool-heavy turns can end with tool output, provider failure, or no usable final
assistant message.
@@ -209,10 +305,38 @@ assistant message.
Expected behavior:
- Detect the absence of a final assistant answer at settle time.
-- Surface a terminal state such as no response, interrupted, compression
- exhausted, tool limit reached, or error.
+- Surface a terminal state such as `no_response`, `interrupted`,
+ `compression_exhausted`, `tool_limit_reached`, or `error`.
- Do not mark the turn completed only because some assistant/tool activity
occurred earlier.
+- Do not treat internal context-compaction reference material as a final
+ assistant answer.
+
+### Cancel and interruption
+
+Cancel is a user-visible terminal action, not just browser cleanup.
+
+Expected behavior:
+
+- If the user cancels before the run fully starts, the backend should still
+ reconcile against the live worker state where possible.
+- If the user cancels after live text, reasoning, or tools have appeared,
+ already-visible work should not be silently lost.
+- The frontend cancel path should close the SSE source it owns and only clear
+ busy state for the stream it actually cancelled.
+- A cancelled turn should settle as `cancelled`, not as provider `no_response`.
+- A network or worker interruption should settle as `interrupted`, or show a
+ restoring state while recovery is still possible, not as normal completion.
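The owner-aware cancel behavior can be sketched as busy state keyed by the stream that owns it. This is a hypothetical illustration, not the frontend's real cancel path: `BusyTracker` and its methods are invented names, and it assumes one active stream per session.

```python
class BusyTracker:
    """Tracks which stream currently owns busy state for each session."""

    def __init__(self) -> None:
        # session_id -> stream_id of the run that owns busy state.
        self._active: dict[str, str] = {}

    def start(self, session_id: str, stream_id: str) -> None:
        # A newer run in the same session takes over ownership.
        self._active[session_id] = stream_id

    def cancel(self, session_id: str, stream_id: str) -> bool:
        """Clear busy state only if the cancelled stream still owns it."""
        if self._active.get(session_id) != stream_id:
            # Late cancel for an older stream: leave the newer run busy.
            return False
        del self._active[session_id]
        return True

    def is_busy(self, session_id: str) -> bool:
        return session_id in self._active
```

The point of the guard is that a late cancel cleanup for an older stream cannot clear the busy indicator of a newer run in the same session.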
+
+Classification:
+
+- The early startup cancel race tracked by
+ [#3475](https://github.com/nesquena/hermes-webui/issues/3475) is addressed by
+ [#3476](https://github.com/nesquena/hermes-webui/pull/3476).
+- The owner-aware browser cancel cleanup tracked by
+ [#3344](https://github.com/nesquena/hermes-webui/issues/3344) and
+ [#3345](https://github.com/nesquena/hermes-webui/pull/3345) remains a
+ focused follow-up.
### Reconnect and session switch
@@ -225,103 +349,151 @@ Expected behavior:
- Slow rebuild should be visibly restoring or degraded, not blank.
- Sidebar/session metadata should not point the user at a stale or wrong active
session.
+- Replay should use the same visible lifecycle as live rendering rather than a
+ flattened alternate presentation.
-### User intervention
+Confirmed follow-up scope:
+
+- A clearer restoring/degraded state during slow reattach.
+- Native `Last-Event-ID` or equivalent reconnect cursor support when it is
+ ready to replace or complement the current replay cursor path.
+- Additional tests that prove live and replay use the same lifecycle for
+ process prose, tool rows, compression status, and terminal states.
+
+### Tool-only or low-prose runs
-During long-running work, the user may need to queue follow-up input, steer the
-current direction, or interrupt the run.
+Some valid long-running turns may produce little or no visible process prose
+before the final answer, especially when the model runs a dense sequence of
+tools.
Expected behavior:
-- These controls should not corrupt the live-to-final reply lifecycle.
-- Queue/steer/interrupt command semantics should be defined in a separate
- control-surface contract.
-- This RFC only requires that live-session controls preserve clear ownership,
- terminal outcomes, and replayable state.
+- The UI should not fabricate assistant prose.
+- Tool activity should remain readable enough that the turn does not look
+ empty or broken.
+- Empty placeholders should be filtered rather than rendered as blank prose.
+- If no final answer arrives, the terminal state should explain that outcome
+ instead of leaving only a tool list.
-## Delivery Plan
+### Very long final answers
-### Slice 1: live-to-final reply lifecycle
+Long-running sessions can end with a final answer that is itself lengthy.
-The first implementation slice is represented by #3401. It should demonstrate
-the core reply model:
+Expected behavior:
-- live process text is primary,
-- tool activity is quiet and progressively disclosed,
-- running-only compression status is transient,
-- the settled activity summary appears above the final answer,
-- settle-time rendering does not falsely present a no-final turn as completed,
-- replay and reattach rebuild the same visible structure,
-- stream ownership fixes are limited to preserving the visible turn's ownership,
- terminal events, and replay.
+- The final answer remains the primary settled assistant content.
+- Supporting activity stays above it and collapsed by default.
+- Streaming and settle transitions should not jump the user away from the final
+ answer or make the answer look like tool output.
+- Any additional collapse, preview, outline, or navigation affordance for very
+ long final answers must preserve the full answer as ordinary assistant prose.
-This slice should use `Refs #3400`; it should not close the umbrella issue.
+### Produced artifacts and output handoff
-### Slice 2: terminal and recovery stabilization
+Long-running sessions often create or update files in the workspace, such as
+plans, reports, patches, data files, generated markdown, or other artifacts.
+Those artifacts are part of what the user needs from the completed work, even
+when they are not the final answer text itself.
-The next slice should close the edge cases that make long-running sessions look
-misleading after they stop or recover:
+Expected behavior:
-- cancelled and interrupted final rendering,
-- compression-exhausted terminal rendering,
-- tool-limit / max-retry terminal rendering,
-- no-final-answer provider failure classification,
-- explicit restoring/degraded state during slow reattach,
-- empty process placeholders that make tool-only runs look broken,
-- live-vs-settled label clarity for tool activity.
+- Existing artifact surfaces, such as the session Artifacts tab and
+ `workspace://` links, remain supporting navigation surfaces rather than
+ replacing the final answer.
+- If a turn creates or edits workspace artifacts, the settled reply should not
+ hide the fact that those artifacts exist or make them impossible to find.
+- Reconnect, replay, session switching, cancel, interruption, and no-final
+ terminal paths should preserve enough tool/artifact metadata to rebuild the
+ same artifact handoff.
+- A terminal failure should still distinguish between "no final answer" and
+ "some artifacts were produced before the run stopped".
+- Large generated files or rich artifact types should route through the
+ workspace/artifact preview model instead of being expanded into the main chat
+ transcript by default.
+
+Confirmed follow-up scope:
+
+- Keep artifact recoverability aligned with the session-scoped Artifacts tab
+ work in [#2655](https://github.com/nesquena/hermes-webui/issues/2655) and
+ [#2673](https://github.com/nesquena/hermes-webui/pull/2673).
+- Keep final-answer artifact links aligned with the `workspace://` preview
+ path from [#2881](https://github.com/nesquena/hermes-webui/issues/2881) and
+ [#2938](https://github.com/nesquena/hermes-webui/pull/2938).
+- Treat interrupted/cancelled tool-history loss, such as
+ [#3528](https://github.com/nesquena/hermes-webui/issues/3528), as a
+ live-to-final recoverability bug when it prevents artifact reconstruction.
+
+### Sidebar and session ownership
+
+Long-running sessions are not only a chat-pane concern. The sidebar and session
+metadata help users find active work and later terminal outcomes.
-### Slice 3: live-session control surface
+Expected behavior:
-The next adjacent product area is user intervention during live work:
+- A session row's running indicator should reflect a real active run or a
+ clearly restorable state, not stale persisted metadata alone.
+- Background completion, cancellation, or failure should be represented without
+ stealing the visible pane from the user.
+- Session switching should not erase pending live context, in-flight snapshots,
+ tool history, or terminal outcome state.
+- Maintenance writes, stale cleanup, and background repair should not make old
+ sessions look newly active unless meaningful user/assistant activity happened.
-- queue follow-up input while a turn is running,
-- steer a live turn without losing ownership of the current reply,
-- interrupt a live turn and preserve the user's corrective intent,
-- define busy-input defaults and prompt visibility,
-- ensure these controls replay and settle into the same terminal model.
+### User intervention
-This slice should reference the existing busy-input / CLI-parity history, but it
-should be designed as a control-surface contract rather than as a reply-content
-change.
+During long-running work, the user may queue follow-up input, steer the current
+direction, or stop the run and send a replacement.
-### Slice 4: session and protocol integration
+Expected behavior:
-Broader integration work should stay separate from the reply-content model:
+- These controls should not corrupt the live-to-final reply lifecycle.
+- Queue/Steer/Stop-and-send/Interrupt command semantics should be defined in a
+ separate control-surface contract.
+- This RFC only requires that live-session controls preserve clear ownership,
+ terminal outcomes, and replayable state.
-- native `Last-Event-ID` or equivalent reconnect cursor support,
-- sidebar/session awareness for active long-running work,
-- session-list disappearance or stale-session repair,
-- shared tool display-title normalization across legacy live stream, persisted
- tool calls, replay, gateway paths, and future adapter/runner paths.
+The current child contract is tracked by
+[#3058](https://github.com/nesquena/hermes-webui/issues/3058) and
+[#3061](https://github.com/nesquena/hermes-webui/pull/3061). That child RFC
+should own questions such as:
-These are important follow-ups, but they should not be mixed into the first
-reply-lifecycle implementation slice.
+- whether Queue is browser-backed or server-backed in each slice,
+- when Queue can upgrade to Steer,
+- what Stop-and-send means,
+- how delivered vs applied Steer is represented,
+- what happens to leftover Steer after the run ends.
-## Review Checklist
+## Delivery And Follow-Up Map
-Use this checklist when reviewing PRs against this RFC:
+Use this map to keep implementation PRs and child RFCs scoped. The "Current
+vehicle" column names a durable track, not live merge state; the tracking issue
+[#3400](https://github.com/nesquena/hermes-webui/issues/3400) is authoritative
+for current open/merged/superseded status.
-- Does the change preserve long-running session readability?
-- Does live process text stay primary over tool metadata?
-- Are tool details available without becoming the main transcript?
-- Does the final answer remain separate from supporting activity?
-- Are compression, no-final, tool-limit, cancel, and interrupt outcomes
- classified honestly?
-- Does reconnect/session switch rebuild the same reply lifecycle?
-- Do internal recovery or control messages stay out of ordinary chat content?
-- Is the PR's slice clear: lifecycle, terminal/recovery, live controls, or
- session/protocol integration?
+| Track | Scope | Current vehicle |
+| --- | --- | --- |
+| Parent product RFC | Define the long-running live-to-final assistant reply lifecycle and review checklist. | This RFC; tracking issue [#3400](https://github.com/nesquena/hermes-webui/issues/3400). |
+| First reply lifecycle implementation | Live process prose, quiet tool activity, settled activity summary above final answer, replay/reattach consistency, live-only compression status, supporting stream ownership fixes. | [#3401](https://github.com/nesquena/hermes-webui/pull/3401). |
+| Terminal/no-final stabilization | Compression exhausted, tool-tail/no-final transcript shape, context-compaction marker suppression, terminal error routing. | [#3315](https://github.com/nesquena/hermes-webui/issues/3315), [#3316](https://github.com/nesquena/hermes-webui/pull/3316). |
+| Cancel ownership hardening | Frontend cancel should close its own SSE source and clear only its own busy state. | [#3344](https://github.com/nesquena/hermes-webui/issues/3344), [#3345](https://github.com/nesquena/hermes-webui/pull/3345). |
+| Early-cancel startup race | Backend cancel should still interrupt the worker when the SSE registry detached before startup fully settled. | [#3475](https://github.com/nesquena/hermes-webui/issues/3475), [#3476](https://github.com/nesquena/hermes-webui/pull/3476). |
+| Pending-intent control surface | Queue, Steer, Stop-and-send, Interrupt, delivered/applied/leftover semantics. | [#3058](https://github.com/nesquena/hermes-webui/issues/3058), [#3061](https://github.com/nesquena/hermes-webui/pull/3061). |
+| Reattach and replay polish | Slow rebuild degraded state, replay/body timing, native cursor support, same lifecycle through replay. | Follow-up issue/PR or child RFC if protocol semantics expand. |
+| Tool-limit and max-iteration terminal state | Limit-reached state, control prompt visibility, no fake final answer. | Follow-up issue/PR; may involve Hermes Agent if the runtime owns the limit signal. |
+| Artifact handoff and recoverability | Preserve the link between final/terminal replies and workspace artifacts created or edited during the turn. | Existing Artifacts and `workspace://` surfaces; follow-up issue/PR when replay, cancel, or terminal paths lose artifact metadata. |
+| Sidebar/session ownership | Active/terminal state in session rows, stale spinner repair, session-list disappearance, background terminal feedback. | Follow-up issue/PR under session/runtime contracts. |
+| Very long final answer ergonomics | Optional navigation/outline/preview affordances that preserve the final answer as normal prose. | Open product discussion; no implementation vehicle yet. |
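
As an illustration of the "Cancel ownership hardening" row, the sketch below models the rule that a cancel closes only its own source and clears the busy state only if it still owns it. It is a simplified model, not the actual frontend code: `FakeSource`, `PaneState`, and `busy_owner` are hypothetical names, and the real SSE source lives in the browser.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FakeSource:
    """Stand-in for a browser SSE source; it only records being closed."""
    closed: bool = False

    def close(self) -> None:
        self.closed = True


@dataclass
class PaneState:
    # Id of the stream that currently owns the busy indicator, if any.
    busy_owner: Optional[str] = None
    sources: Dict[str, FakeSource] = field(default_factory=dict)

    def start(self, stream_id: str) -> None:
        # A new stream registers its source and takes over the busy state.
        self.sources[stream_id] = FakeSource()
        self.busy_owner = stream_id

    def cancel(self, stream_id: str) -> None:
        # Close only the source that belongs to the cancelled stream.
        source = self.sources.pop(stream_id, None)
        if source is not None:
            source.close()
        # Clear busy only if the cancelled stream still owns it.
        if self.busy_owner == stream_id:
            self.busy_owner = None
```

With this shape, a late cancel for an older stream cannot clear the busy indicator of a newer turn that has already started in the same pane.
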
## Relationship To Existing Contracts
-This RFC sits above the current run-state and adapter contracts:
+This RFC sits above the current runtime, recovery, and adapter contracts:
- [`webui-run-state-consistency-contract.md`](webui-run-state-consistency-contract.md)
defines how transcript, context, stream, replay, compression, and session
metadata stay coherent.
- [`canonical-session-resolution.md`](canonical-session-resolution.md) defines
- how URL, local browser state, sidebar rows, and compression lineage resolve to
- one visible session target.
+ how URL, local browser state, sidebar rows, and compression lineage resolve
+ to one visible session target.
- [`turn-journal.md`](turn-journal.md) defines crash-safe submitted-turn and
interrupted-turn recovery semantics.
- [`hermes-run-adapter-contract.md`](hermes-run-adapter-contract.md) defines
@@ -330,11 +502,48 @@ This RFC sits above the current run-state and adapter contracts:
This RFC defines the product meaning those lower-level contracts need to
preserve for long-running assistant replies.
+The pending-intent control-surface RFC tracked by
+[#3058](https://github.com/nesquena/hermes-webui/issues/3058) and
+[#3061](https://github.com/nesquena/hermes-webui/pull/3061) should be treated
+as a child contract: it can define user intervention semantics without
+redefining the live-to-final reply lifecycle.
+
+## Review Checklist
+
+Use this checklist when reviewing PRs against this RFC:
+
+- Does the change preserve long-running session readability?
+- Does live process text stay primary over tool metadata?
+- Are tool details available without becoming the main transcript?
+- Does the final answer remain separate from supporting activity?
+- Are compression, no-final, tool-limit, cancel, and interrupt outcomes
+ classified honestly?
+- Does reconnect/session switch rebuild the same reply lifecycle or degrade
+ explicitly?
+- If the turn produced workspace artifacts, can the user still find them after
+ settle, replay, reconnect, cancel, or terminal failure?
+- Do internal recovery or control messages stay out of ordinary chat content?
+- Does sidebar/session state agree with the visible active or terminal turn?
+- Is the PR's slice clear: lifecycle, terminal/recovery, cancel ownership,
+ live controls, sidebar/session ownership, or protocol integration?
+- If the change belongs to Queue/Steer/Stop-and-send/Interrupt, is it routed to
+ the child control-surface RFC instead of being hidden inside this parent RFC?
+
## Open Questions
-- Should very long final answers need additional navigation or preview
- affordances beyond the standard chat transcript behavior?
-- Should repeated compression passes in one turn be shown as separate transient
- statuses or summarized into one compression lifecycle marker?
-- Should queue, steer, and interrupt receive a dedicated public control-surface
- RFC, or should that contract live inside the existing adapter/control RFC?
+Open questions are limited to product choices that are not already decided by
+this RFC, an active implementation PR, or a child RFC.
+
+- Should very long final answers gain additional navigation, outline, or
+  preview affordances beyond standard chat transcript behavior? If so, what
+  threshold triggers them, and how do they preserve the answer as ordinary
+ assistant prose?
+- When a turn produces multiple workspace artifacts, should the final answer
+ include an automatic artifact summary or navigation affordance, or should the
+ product rely on the existing Artifacts tab and explicit `workspace://` links?
+- What is the minimum sidebar signal for background long-running sessions that
+  have completed, failed, been cancelled, or need attention while the user was
+  viewing another session?
+- Which terminal outcomes should offer inline recovery actions, such as retry,
+ continue, inspect details, or reopen from checkpoint, and which should remain
+ informational only?