Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,10 @@

## [Unreleased]

### Fixed

- **Chat transcript no longer renders blank (only date separators) for a session full of empty-content recovered turns (#3875).** Two compounding bugs. **Render side:** the per-message render extracted a Thinking trace only from inline `<think>`/channel/turn tags inside `content` — it never read a message's separate `reasoning` field (kept that way deliberately by #2565). An assistant turn with *empty visible content* but a populated `reasoning` field (e.g. a run-journal-recovered anchor: empty `content` + `reasoning` + `_recovered_from_run_journal`) therefore rendered no Thinking card and collapsed to an empty hidden anchor; a session made entirely of such rows painted as nothing but `SUNDAY`/`SATURDAY` date dividers. In legacy (Compact-tool-activity-off) mode, `renderMessages()` now falls back to the message's `reasoning`/`reasoning_content` field (via `_assistantReasoningPayloadText`) when the inline pass finds nothing AND the turn has no visible content, so the reasoning surfaces as a Thinking card and the turn is never blank — scoped to empty-content turns so normal answer-bearing messages are unchanged, and legacy-only because the simplified/Worklog path already derives reasoning (with an exact-answer echo-strip) higher up. **Data side:** run-journal recovery created an empty assistant "anchor" to host recovered tool cards for a tool-first stream, but the lazy read-side retry path re-ran recovery on every `get_session()` and a tool-first stream has no text to dedup on — so each retry, and each distinct interrupted stream over a session's life, appended another empty anchor, letting a heavily-interrupted session accumulate thousands of empty rows. `ensure_assistant_anchor` now reuses an existing empty recovered anchor for the same stream instead of appending a fresh one (one anchor per stream). (#3875)

## [v0.51.346] — 2026-06-09 — Release LJ (PWA notification controls)

### Added
Expand Down
24 changes: 24 additions & 0 deletions api/models.py
Original file line number Diff line number Diff line change
Expand Up @@ -1421,6 +1421,30 @@ def ensure_assistant_anchor(created_at: float | None = None) -> int:
# A stream can start with tools before any text. Keep those tools
# visible after restart with an empty recovered assistant anchor instead
# of inventing synthetic progress prose.
#
# Dedup guard (#3875): reuse an existing empty recovered anchor for THIS
# stream instead of appending a fresh one. The lazy read-side retry path
# (_retry_journal_recovery_in_place) re-runs this recovery on repeated
# get_session() calls, and a tool-first stream that never emitted text
# has no content to dedup on (flush_assistant() returns early on empty),
# so without this guard each retry — and each distinct interrupted stream
# over the session's life — appends another empty anchor. A session that
# was interrupted-and-recovered many times then accumulates thousands of
# empty content-less assistant rows, bloating the file and (combined with
# the render path) painting the transcript blank. One anchor per stream
# is all that's needed to host its recovered tool cards.
for _existing_idx in range(len(session.messages) - 1, -1, -1):
_m = session.messages[_existing_idx]
if not isinstance(_m, dict):
continue
if (
_m.get('_recovered_from_run_journal')
and _m.get('_recovered_stream_id') == stream_id
and _m.get('role') == 'assistant'
and not str(_m.get('content') or '').strip()
):
current_assistant_idx = _existing_idx
return _existing_idx
Comment thread
greptile-apps[bot] marked this conversation as resolved.
session.messages.append({
'role': 'assistant',
'content': '',
Expand Down
23 changes: 23 additions & 0 deletions static/ui.js
Original file line number Diff line number Diff line change
Expand Up @@ -8240,6 +8240,29 @@ function renderMessages(options){
seg.setAttribute('data-live-assistant','1');
}
if(_ERR_MSG_RE.test(String(content||'').trim())) seg.dataset.error='1';
// A turn whose visible content is empty but which carries a separate
// `reasoning` field (e.g. a run-journal-recovered anchor: empty content +
// reasoning + `_recovered_from_run_journal`) extracts NO inline thinkingText
// and would render no Thinking Card at all — collapsing to an empty hidden
// anchor. A session made entirely of such rows then paints blank (only date
// separators) — the #3875 reporter's exact case (Compact tool activity OFF,
// i.e. legacy mode). Surface the message's reasoning payload as the Thinking
// Card source for these empty-content turns so the turn is never blank.
//
// LEGACY-MODE ONLY (!isSimplifiedToolCalling()): the simplified/Worklog path
// already derives reasoning above (line ~8149 via
// _worklogReasoningTextFromMessage, which strips an exact visible-answer echo
// so reasoning duplicating a sibling answer is not re-shown). Repopulating the
// raw reasoning here would bypass that echo-strip and re-render the duplicate
// as a Worklog Thinking card (Codex gate catch). In legacy mode there is no
// Worklog folding, so the raw payload is the correct Thinking-card source.
// Stays OUT of the inline-content `thinkingText` extraction block (#2565) and
// only fires for empty-content/no-inline-thinking turns, so answer-bearing
// messages are unchanged.
if(!isUser&&!isSimplifiedToolCalling()&&!thinkingText&&!String(content||'').trim()&&!filesHtml&&!statusHtml){
const _reasoningPayload=_assistantReasoningPayloadText(m);
if(_reasoningPayload) thinkingText=_reasoningPayload;
}
if(thinkingText&&window._showThinking!==false){
if(isSimplifiedToolCalling()&&_assistantThinkingBelongsInWorklog(m, rawIdx, toolCallAssistantIdxs)) assistantThinking.set(rawIdx, thinkingText);
else if(window._showThinking!==false) seg.insertAdjacentHTML('beforeend', _thinkingCardHtml(thinkingText));
Expand Down
57 changes: 57 additions & 0 deletions tests/test_issue3875_blank_transcript_failsafe.py
Original file line number Diff line number Diff line change
Expand Up @@ -87,3 +87,60 @@ def test_failsafe_preserves_collapsed_worklog_when_visible_answer_exists():
assert "assistant-segment-anchor" in failsafe
# The live turn drives its own state and must be excluded from the sweep.
assert "liveAssistantTurn" in failsafe


def test_render_surfaces_reasoning_field_for_empty_content_turn():
"""#3875 (real reporter case): an assistant turn with empty visible content but
a separate `reasoning` field must surface that reasoning as a Thinking card.

The per-segment inline thinkingText extraction only mines <think>/channel/turn
tags out of `content`; it must NOT read `m.reasoning` (that constraint is
enforced by #2565 — reasoning metadata stays low-priority Worklog detail, never
inline-content extraction). So a run-journal-recovered anchor (empty content +
reasoning + `_recovered_from_run_journal`) extracted no thinkingText, rendered no
Thinking card, and collapsed to an empty hidden anchor — a session of such rows
painted blank (only date separators). The fix, at the segment-emission point
(AFTER the #2565-guarded inline extraction block, in LEGACY mode only), falls
back to `_assistantReasoningPayloadText(m)` and reuses `thinkingText` ONLY when
there is no inline thinkingText AND no visible content/files/status — keeping
reasoning out of the forbidden extraction block while ensuring the turn is never
blank. Legacy-only because the simplified/Worklog path already derives reasoning
(with an exact-visible-answer echo-strip) higher up.
"""
body = _function_body(UI_JS, "renderMessages")
# The fallback exists at emission time (after the #2565-guarded inline
# extraction block), reusing thinkingText but sourced from the reasoning payload.
assert "_assistantReasoningPayloadText(m)" in body, (
"the empty-turn path must fall back to the message's reasoning payload"
)
# It is scoped to LEGACY mode (simplified path already derives reasoning with
# echo-strip) AND empty-content turns with no inline thinkingText, so an
# answer-bearing message's rendering is unchanged and a Worklog echo is not
# double-rendered (Codex gate catch).
assert "!isUser&&!isSimplifiedToolCalling()&&!thinkingText&&!String(content||'').trim()&&!filesHtml&&!statusHtml" in body, (
"the reasoning fallback must be scoped to legacy-mode empty-content/no-inline-thinking turns"
)


def test_reasoning_fallback_stays_out_of_inline_extraction_block_2565():
"""Guard against regressing #2565: the reasoning fallback must NOT live in the
inline-content `thinkingText` extraction block (between `let thinkingText='';`
and `const isUser=...`). That block must never reference `m.reasoning`."""
src = UI_JS
extraction = src.split("let thinkingText='';", 1)[1].split("const isUser=m.role==='user';", 1)[0]
assert "m.reasoning" not in extraction
assert "m.reasoning_content" not in extraction
assert "_assistantReasoningPayloadText" not in extraction, (
"the reasoning fallback must live at the segment-emission point, not inside "
"the inline-content extraction block (#2565)"
)


def test_assistant_reasoning_payload_reads_reasoning_fields():
"""The fallback relies on `_assistantReasoningPayloadText` reading the message's
`reasoning` / `reasoning_content` fields (not just inline content tags)."""
payload_fn = _function_body(UI_JS, "_assistantReasoningPayloadText")
# Reads the direct reasoning fields off the message object.
assert "m.reasoning_content||m.reasoning||m.thinking||m._reasoning" in payload_fn


130 changes: 130 additions & 0 deletions tests/test_issue3875_recovery_anchor_dedup.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,130 @@
"""Regression tests for #3875 — blank transcript from accumulated empty recovery anchors.

A session that was interrupted-and-recovered many times accumulated thousands of
empty-content assistant rows tagged ``_recovered_from_run_journal``. Each was an
"anchor" created by ``_append_journaled_partial_output`` / ``ensure_assistant_anchor``
to host recovered tool cards for a tool-first stream (one that emitted tools before
any visible text). Because a tool-first stream has no text to dedup on, the read-side
lazy-retry path re-created a fresh empty anchor on every retry, and every distinct
interrupted stream added its own — so the session filled with empty rows. Combined
with the render path (which dropped empty-content reasoning-only rows), the transcript
painted blank (only date separators).

This file covers the DATA side: the anchor-dedup guard so recovery reuses a single
empty anchor per stream instead of appending an unbounded run of them. The render-side
fix (surface the message's `reasoning` field so an empty-content turn never paints
blank) is covered by tests/test_issue3875_blank_transcript_failsafe.py.
"""
from __future__ import annotations

import pytest

import api.profiles as profiles
from api.models import Session, _append_journaled_partial_output
from api.run_journal import append_run_event


@pytest.fixture
def hermes_home(tmp_path, monkeypatch):
# Mirror tests/test_session_lost_response_regression.py — isolate HERMES_HOME
# so Session.save() + run-journal writes land in a throwaway sandbox.
home = tmp_path / "hermes_home"
home.mkdir()
(home / "sessions").mkdir()
monkeypatch.setenv("HERMES_HOME", str(home))
monkeypatch.setattr(profiles, "_DEFAULT_HERMES_HOME", home)
return home


def _tool_first_journal(session_id: str, stream_id: str) -> None:
"""Write a run journal for a stream that emitted a tool BEFORE any text —
the shape that forces an empty assistant anchor on recovery."""
append_run_event(session_id, stream_id, "tool", {"name": "search_files", "preview": "q=x"})
append_run_event(session_id, stream_id, "tool_complete", {"name": "search_files", "preview": "done"})


def test_tool_first_recovery_creates_single_empty_anchor(hermes_home):
sid = "issue3875_anchor"
stream_id = "stream-A"
_tool_first_journal(sid, stream_id)
s = Session(session_id=sid, title="repro", messages=[{"role": "user", "content": "go"}])

# First recovery: one empty anchor is created to host the recovered tool card.
assert _append_journaled_partial_output(s, stream_id, dedupe_existing=True) is True
anchors = [
m for m in s.messages
if isinstance(m, dict)
and m.get("_recovered_from_run_journal")
and m.get("role") == "assistant"
and not str(m.get("content") or "").strip()
]
assert len(anchors) == 1, "first recovery should create exactly one empty anchor"


def test_repeated_recovery_does_not_accumulate_empty_anchors(hermes_home):
"""The #3875 root cause: re-running recovery for the SAME stream must reuse the
existing empty anchor, not pile up a fresh one each time."""
sid = "issue3875_repeat"
stream_id = "stream-B"
_tool_first_journal(sid, stream_id)
s = Session(session_id=sid, title="repro", messages=[{"role": "user", "content": "go"}])

for _ in range(20):
_append_journaled_partial_output(s, stream_id, dedupe_existing=True)

anchors = [
m for m in s.messages
if isinstance(m, dict)
and m.get("_recovered_from_run_journal")
and m.get("role") == "assistant"
and not str(m.get("content") or "").strip()
]
assert len(anchors) == 1, (
f"repeated recovery for one stream must reuse a single empty anchor, "
f"got {len(anchors)} (the unbounded-accumulation bug)"
)


def test_distinct_streams_get_distinct_anchors(hermes_home):
"""Dedup is scoped per stream — two genuinely different interrupted streams
each keep their own anchor (we are not collapsing unrelated turns)."""
sid = "issue3875_multi"
s = Session(session_id=sid, title="repro", messages=[{"role": "user", "content": "go"}])
for stream_id in ("stream-X", "stream-Y", "stream-Z"):
_tool_first_journal(sid, stream_id)
# run each twice to prove per-stream reuse on top of per-stream distinctness
_append_journaled_partial_output(s, stream_id, dedupe_existing=True)
_append_journaled_partial_output(s, stream_id, dedupe_existing=True)

anchors = [
m for m in s.messages
if isinstance(m, dict)
and m.get("_recovered_from_run_journal")
and m.get("role") == "assistant"
and not str(m.get("content") or "").strip()
]
stream_ids = {m.get("_recovered_stream_id") for m in anchors}
assert len(anchors) == 3, f"expected one anchor per distinct stream, got {len(anchors)}"
assert stream_ids == {"stream-X", "stream-Y", "stream-Z"}


def test_text_bearing_recovery_still_appends_real_content(hermes_home):
"""The dedup guard must not suppress recovery of genuine visible text — a
stream that emitted tokens still produces a content-bearing recovered row."""
sid = "issue3875_text"
stream_id = "stream-T"
append_run_event(sid, stream_id, "token", {"text": "Hello "})
append_run_event(sid, stream_id, "token", {"text": "world"})
append_run_event(sid, stream_id, "done", {})
s = Session(session_id=sid, title="repro", messages=[{"role": "user", "content": "go"}])

assert _append_journaled_partial_output(s, stream_id, dedupe_existing=True) is True
recovered_text = [
m for m in s.messages
if isinstance(m, dict)
and m.get("_recovered_from_run_journal")
and str(m.get("content") or "").strip()
]
assert any("Hello world" in m["content"] for m in recovered_text), (
"token-bearing recovery must still append the visible assistant text"
)
Loading