fix(parser): deduplicate duplicate summary entries from pre-v2.1.128 sessions #79

Merged: delexw merged 2 commits into main from fix-issue-78 on May 11, 2026

delexw (Owner) commented on May 10, 2026

Summary

Fixes #78 — pre-v2.1.128 Claude Code sessions can contain duplicate summary-type JSONL entries. When a sub-agent was idle, Claude Code re-emitted the same summary entry on every tick of the sub-agent loop, resulting in:

  • Duplicate CompactMsg blocks rendered in the UI
  • Inflated token counts in session metadata
  • Potential conversation tree corruption if duplicate entries disrupted the UUID parent-chain

Root Cause

Claude Code v2.1.128 fixed the server-side emission, but JSONL files written before that version remain on disk with the duplicate entries. The parser needs to handle them gracefully.

Fix

In read_session_incremental (src-tauri/src/parser/session.rs), a HashSet<(String, String, String)> now tracks seen (agentName, teamName, summary_text) triples. When a summary entry's key is already in the set, it is skipped — only the first occurrence is kept.
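The keying scheme described above can be sketched as follows. This is a minimal illustration, not the code from session.rs: the `Entry` struct, the `summary_entry` helper, and the `dedup_summaries` function are hypothetical names, and treating a missing agentName or teamName as an empty string is an assumption.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for one parsed JSONL entry.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    entry_type: String,
    agent_name: Option<String>,
    team_name: Option<String>,
    summary: Option<String>,
}

/// Hypothetical helper that builds a summary-type entry for the examples.
fn summary_entry(agent: &str, team: &str, text: &str) -> Entry {
    Entry {
        entry_type: "summary".to_string(),
        agent_name: Some(agent.to_string()),
        team_name: Some(team.to_string()),
        summary: Some(text.to_string()),
    }
}

/// Keeps the first summary entry per (agentName, teamName, summary_text) key
/// and passes every non-summary entry through unchanged.
fn dedup_summaries(entries: &[Entry]) -> Vec<&Entry> {
    let mut seen: HashSet<(String, String, String)> = HashSet::new();
    let mut kept = Vec::new();
    for entry in entries {
        if entry.entry_type == "summary" {
            // Assumption: absent fields are treated as empty strings in the key.
            let key = (
                entry.agent_name.clone().unwrap_or_default(),
                entry.team_name.clone().unwrap_or_default(),
                entry.summary.clone().unwrap_or_default(),
            );
            // `insert` returns false when the key was already in the set.
            if !seen.insert(key) {
                continue;
            }
        }
        kept.push(entry);
    }
    kept
}
```

Because the agent and team names are part of the key, the same summary text emitted by two different agents is kept twice, while a repeat from the same agent is dropped.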

Deduplication happens after the live-chain UUID resolution step and operates only on the classification pass, so raw_entries is left intact. This preserves the UUID parent-chain walk (resolve_live_chain_uuids) which must see every entry to correctly build the live-branch set.

All other entry types (user, assistant, system, etc.) are unaffected.

Tests Added

Four new unit tests in src-tauri/src/parser/session.rs:

| Test | Assertion |
| --- | --- |
| duplicate_summary_entries_produce_single_compact_msg | 3 identical entries → 1 CompactMsg |
| distinct_summary_entries_are_all_kept | 3 distinct summaries → 3 CompactMsgs |
| duplicate_summary_entries_from_different_agents_are_kept_separately | Same text from agent1 + agent2 → 2 CompactMsgs; 3rd entry (dup of agent1) skipped |
| summary_dedup_does_not_affect_non_summary_entries | user entry and non-summary types pass through unchanged |

Checklist

  • Root cause fixed (not silenced)
  • Existing 381 Rust tests pass
  • 351 vitest frontend tests pass
  • cargo fmt clean
  • cargo clippy clean

delexw added 2 commits May 10, 2026 12:30
Pre-v2.1.128 Claude Code wrote duplicate `summary`-type JSONL entries
when sub-agents were idle, causing the same summary state to be re-emitted
on every tick of the sub-agent loop. This produced duplicate CompactMsg
blocks in the UI, inflated token counts, and potential conversation tree
corruption.

Fix: in read_session_incremental, track a HashSet of seen
(agentName, teamName, summary_text) triples. When a summary entry's key
is already in the set, skip it and keep the first occurrence. All other
entry types are unaffected. The raw_entries slice is left intact so the
UUID parent-chain walk (resolve_live_chain_uuids) continues to work
correctly.

Adds four targeted tests covering: identical duplicates collapse to one
CompactMsg, distinct summaries all kept, cross-agent dedup independence,
and non-summary entries unaffected.

Fixes #78
delexw merged commit aa22a61 into main on May 11, 2026
1 check failed
delexw deleted the fix-issue-78 branch on May 11, 2026 00:32
Development

Successfully merging this pull request may close these issues.

[Compat] Claude Code v2.1.128: Duplicate sub-agent summary entries in pre-fix JSONL sessions
