Fix compression-exhausted stream finalization by franksong2702 · Pull Request #3316 · nesquena/hermes-webui

franksong2702 · 2026-06-01T06:38:47Z

Thinking Path

Long single-turn, tool-heavy sessions can exhaust context after Hermes Agent fails to compress effectively.
WebUI must distinguish streamed interim/progress text from a real final assistant answer.
A persisted transcript ending in a tool result, assistant tool-call turn, or internal context-compaction reference marker is not a completed answer.
The UI should surface compression exhaustion as an error and keep internal reference-only summaries out of the settled transcript.

What Changed

Classifies compression exhaustion errors from agent/provider result text.
Treats failed, partial, compression_exhausted, tool-tail transcripts, assistant tool-call tails, and context-compaction marker tails as terminal failures instead of completed turns.
Removes the _assistant_added short-circuit so final-answer validation always checks the persisted transcript.
Keeps [CONTEXT COMPACTION — REFERENCE ONLY] content out of settled transcript rendering while preserving transient running compression status.
Adds regression coverage for terminal failure detection, context-compaction marker filtering, and final-answer semantics.
Updates the changelog for the user-visible behavior change.
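
For readers who want the tail rule in concrete form, here is a minimal sketch of the final-answer check described above. It is not the code in this PR: the function name, the OpenAI-style message dicts (`role`, `content`, `tool_calls`), and the choice to treat an empty transcript or a non-assistant tail as a failure are all assumptions made for illustration. Only the marker text is taken from the description.

```python
from typing import Any

# Marker text quoted from the PR description; everything else is assumed.
COMPACTION_MARKER = "[CONTEXT COMPACTION — REFERENCE ONLY]"


def lacks_final_assistant_answer(messages: list[dict[str, Any]]) -> bool:
    """Return True when the transcript tail is not a real final assistant answer."""
    if not messages:
        return True  # assumption: an empty transcript has no final answer
    tail = messages[-1]
    if tail.get("role") != "assistant":
        return True  # e.g. the transcript ends in a tool result
    if tail.get("tool_calls"):
        return True  # assistant turn that only requests tools
    content = tail.get("content")
    text = content.strip() if isinstance(content, str) else ""
    if not text:
        return True  # empty assistant turn
    # Assumption: a reference-only summary starts with the marker.
    return text.startswith(COMPACTION_MARKER)
```

Under these assumptions a tool result followed by assistant text counts as completed, while a transcript ending in a tool result, a tool-call turn, an empty assistant turn, or a marker summary does not.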

Why It Matters

This prevents long tool-heavy sessions from appearing completed when Hermes Agent stopped before writing a final assistant answer. It also prevents internal context-compaction reference text from being rendered as user-facing final content.

Related to #3315, NousResearch/hermes-agent#36624, and NousResearch/hermes-agent#36626.

Verification

python -m pytest tests/test_auto_compression_terminal_failure.py tests/test_auto_compression_card.py tests/test_issues_373_374_375.py tests/test_issue765_streaming_persistence.py::TestIssue765FollowupHardening::test_silent_failure_path_does_not_reacquire_agent_lock -q
node --check static/ui.js
node --check static/messages.js
git diff --check

Risks / Follow-ups

This is the WebUI companion to the Hermes Agent compression fix; it does not by itself reduce agent-side context size.
The change intentionally avoids rendering internal compaction reference text in settled transcripts, while still allowing transient compression status during active runs.
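
The settled-versus-running distinction above can be sketched as follows. The actual rendering lives in the WebUI's JavaScript (static/ui.js and static/messages.js), so this Python version is only an illustration of the rule; `visible_items`, the `status` role, and the `transient` flag are hypothetical names, not part of the PR.

```python
from typing import Any, Optional

# Marker text quoted from the PR description.
COMPACTION_MARKER = "[CONTEXT COMPACTION — REFERENCE ONLY]"


def is_compaction_reference(message: dict[str, Any]) -> bool:
    """True for messages whose text starts with the reference-only marker."""
    content = message.get("content")
    return isinstance(content, str) and content.lstrip().startswith(COMPACTION_MARKER)


def visible_items(
    messages: list[dict[str, Any]],
    *,
    running: bool,
    compression_status: Optional[str] = None,
) -> list[dict[str, Any]]:
    """Hide reference-only summaries; show compression status only while running."""
    items = [m for m in messages if not is_compaction_reference(m)]
    if running and compression_status:
        # Transient status row (hypothetical shape), never persisted.
        items.append({"role": "status", "content": compression_status, "transient": True})
    return items
```

In this sketch the marker message never reaches the rendered list, and the status row disappears as soon as `running` is false.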

Contract Routing

Contract family: runtime streaming finalization and session transcript visibility.
Evidence: focused regression tests for terminal failure detection, final-answer semantics, and settled transcript rendering.
Contract change: none intended; this restores the invariant that completed UI state requires a real final assistant answer.

Model Used

AI-assisted implementation with OpenAI GPT-5 Codex in a local coding workflow. The assistant inspected repository code, wrote targeted tests, implemented the fix, and ran the verification commands above.

nesquena-hermes · 2026-06-01T20:54:04Z

Triage: hold + changes-requested — thanks @franksong2702. The goal is right (surfacing compression-exhaustion as a real error instead of a falsely-"completed" turn) and the classifier itself is sound — I verified _session_lacks_final_assistant_answer() empirically: it correctly returns HAS-final-answer for a normal completed turn and a tool→final-text turn, and terminal-failure only for tool-result/tool-call/empty-assistant/marker tails. I picked this up to advance toward release and ran it through the full gate (test suite + Opus + the Codex regression gate). Opus cleared it, but the Codex gate caught a state-consistency ordering bug on the compression path that I then confirmed against the code, so I'm holding it rather than shipping.

(CORE) Terminal-failure handling can run BEFORE the compression session-id migration, leaving frontend/backend session state inconsistent when compression exhaustion fires after the agent rotated session_id.

The new terminal-failure check is at api/streaming.py:5439-5443:

_terminal_failure = (
    _agent_result_terminal_failure(result)
    or _session_lacks_final_assistant_answer(_all_result_messages)
)
if _terminal_failure:
    _assistant_added = False
if _terminal_failure or (not _assistant_added and not _token_sent):
    ... # emits the apperror + returns

But the compression session-id migration + snapshot preservation block runs much later, at api/streaming.py:5680+:

_preserve_pre_compression_snapshot(s, old_sid)   # ~5680
... # migrate locks/cache, register continuation, emit `compressed`, save against new sid

Hermes Agent rotates agent.session_id during compression (agent/conversation_compression.py:505-520); WebUI only mirrors that rotation in the 5680+ block. So on a compression-exhausted result that arrives after the rotation, the terminal-failure path at 5443 emits the error + returns before 5680 ever runs — which means:

_preserve_pre_compression_snapshot() is skipped → the pre-compression history may not be archived
continuation/session migration is skipped → the error transcript is saved against the old WebUI session id
the frontend apperror path appends a synthetic error to the old activeSid, not the migrated continuation session
→ frontend/backend session state diverges. (This is the same compression-rotation subsystem where we recently held #3300, "fix: keep gateway context visible in chat transcripts", for a transcript-loss regression, so we're being extra careful here.)

Suggested fix (needs your design call — I didn't want to hot-patch compression-rotation ordering under release pressure):

Factor the compression-rotation side-effect block (_preserve_pre_compression_snapshot + lock/cache migration + continuation registration) into a helper and run it BEFORE any return from the terminal-failure apperror path — OR move the terminal-failure check below the compression migration.
Persist the terminal error on the migrated continuation session, and in the frontend (static/messages.js apperror handler) adopt the settled/migrated session like the done path does, instead of only pushing a local synthetic error into the old active session.
Add a regression test for: compression exhausted AFTER session-id rotation → assert the snapshot is preserved, the continuation session is registered, and the error lands on the migrated session.
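
A rough sketch of the first option (helper extraction, run before any early return) might look like the following. This is a toy model, not api/streaming.py: `StreamState`, `apply_compression_rotation`, `finalize_stream`, and the event strings are hypothetical stand-ins that only record the order of side effects.

```python
from dataclasses import dataclass, field


@dataclass
class StreamState:
    """Hypothetical stand-in for the per-stream state in the WebUI backend."""
    webui_sid: str
    events: list[str] = field(default_factory=list)


def apply_compression_rotation(state: StreamState, agent_sid: str) -> str:
    """Mirror an agent-side session_id rotation; no-op when nothing rotated."""
    if agent_sid == state.webui_sid:
        return state.webui_sid
    old_sid = state.webui_sid
    state.events.append(f"preserve_snapshot:{old_sid}")
    state.events.append(f"migrate:{old_sid}->{agent_sid}")
    state.events.append(f"register_continuation:{agent_sid}")
    state.webui_sid = agent_sid
    return agent_sid


def finalize_stream(state: StreamState, agent_sid: str, terminal_failure: bool) -> str:
    """Run rotation side effects first, then choose the error or done path."""
    sid = apply_compression_rotation(state, agent_sid)
    if terminal_failure:
        state.events.append(f"apperror:{sid}")  # error lands on the migrated session
        return "error"
    state.events.append(f"done:{sid}")
    return "done"
```

The point of the ordering is that both exits see the same session id, so a regression test can assert that the snapshot, the continuation, and the error all reference the migrated session.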

Everything else is good — the compression_exhausted classification, the label cascade, and the frontend clear-compression-UI handling are all correct. One small non-blocking note from Opus: _classify_provider_error only inspects the error string, so if a future agent path sets result['compression_exhausted']=True with an empty/non-matching error message it falls back to the generic "No response from provider" label (your included test sets both fields, so the current path is covered).
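
To illustrate the non-blocking note, one possible way to close that gap is to consult the flag before the error string. This is a sketch under assumptions: `error_label`, the substring match, and the exhausted label text are hypothetical; only the generic "No response from provider" label and the `compression_exhausted` key come from the discussion above.

```python
from typing import Any

GENERIC_LABEL = "No response from provider"        # label quoted in the review
EXHAUSTED_LABEL = "Context compression exhausted"  # hypothetical label text


def error_label(result: dict[str, Any]) -> str:
    """Prefer the explicit flag, then fall back to matching the error text."""
    if result.get("compression_exhausted") is True:
        return EXHAUSTED_LABEL
    message = str(result.get("error") or "").lower()
    # Assumed matching rule; the real classifier's patterns are not shown here.
    if "compression" in message and "exhaust" in message:
        return EXHAUSTED_LABEL
    return GENERIC_LABEL
```

With this shape, a result that sets the flag but carries an empty error message still gets the specific label instead of the generic one.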

The rest of this session's transcript/streaming fixes (#3102 edit-replay, #3321 recovery-control filter) already shipped, so this isn't a wholesale rejection — just this one ordering interaction to sort out. Happy to pair on the helper-extraction if useful. No rush.

Fix compression-exhausted stream finalization

4225d20

This was referenced Jun 1, 2026

Fix auto compression for tool-heavy sessions · NousResearch/hermes-agent#36626 (Open)

Streaming should show compression-exhausted tool-tail runs as errors · #3315 (Open)

Auto compression can exhaust context in tool-heavy sessions · NousResearch/hermes-agent#36624 (Open)

Merge master into branch

c7e1644

nesquena-hermes added the hold and changes-requested labels Jun 1, 2026 (changes-requested: maintainer left detailed feedback requesting changes; PR is waiting on author to address)

franksong2702 mentioned this pull request Jun 2, 2026

Redesign live-to-final assistant replies for running agent sessions · #3400 (Open)

franksong2702 wants to merge 2 commits into nesquena:master from franksong2702:franksong2702/fix-auto-compression-tool-heavy-streams
