fix(parser): deduplicate duplicate summary entries from pre-v2.1.128 sessions #79
Merged
Pre-v2.1.128 Claude Code wrote duplicate `summary`-type JSONL entries when sub-agents were idle, causing the same summary state to be re-emitted on every tick of the sub-agent loop. This produced duplicate `CompactMsg` blocks in the UI, inflated token counts, and potential conversation tree corruption.

Fix: in `read_session_incremental`, track a `HashSet` of seen `(agentName, teamName, summary_text)` triples. When a summary entry's key is already in the set, skip it and keep the first occurrence. All other entry types are unaffected. The `raw_entries` slice is left intact so the UUID parent-chain walk (`resolve_live_chain_uuids`) continues to work correctly.

Adds four targeted tests covering: identical duplicates collapse to one `CompactMsg`, distinct summaries all kept, cross-agent dedup independence, and non-summary entries unaffected.

Fixes #78
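The first-occurrence-wins dedup described above can be sketched roughly as follows. This is a minimal illustration, not the code from `read_session_incremental`: the `Entry` struct, its field names, `Entry::new`, and the `dedup_summaries` helper are all hypothetical stand-ins, and the real parser's types will differ. The sketch only borrows the raw slice, mirroring the point that `raw_entries` stays intact for the parent-chain walk.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for a parsed JSONL entry.
/// The real entry type in the parser is not shown in this PR text.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    entry_type: String,
    agent_name: String,
    team_name: String,
    summary_text: String,
}

impl Entry {
    /// Convenience constructor used only for illustration.
    fn new(entry_type: &str, agent_name: &str, team_name: &str, summary_text: &str) -> Self {
        Entry {
            entry_type: entry_type.to_string(),
            agent_name: agent_name.to_string(),
            team_name: team_name.to_string(),
            summary_text: summary_text.to_string(),
        }
    }
}

/// Returns the entries that survive dedup, in their original order.
/// `raw_entries` is only borrowed, so the caller still sees every entry.
fn dedup_summaries(raw_entries: &[Entry]) -> Vec<&Entry> {
    // Keys already seen: (agent name, team name, summary text).
    let mut seen: HashSet<(String, String, String)> = HashSet::new();

    raw_entries
        .iter()
        .filter(|e| {
            // Non-summary entries always pass through unchanged.
            if e.entry_type != "summary" {
                return true;
            }
            let key = (
                e.agent_name.clone(),
                e.team_name.clone(),
                e.summary_text.clone(),
            );
            // `HashSet::insert` returns false when the key was already
            // present, which drops every repeat after the first occurrence.
            seen.insert(key)
        })
        .collect()
}
```

Keying on the full triple rather than the summary text alone means two agents that emit the same text are not collapsed into one entry.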
Summary
Fixes #78. Pre-v2.1.128 Claude Code sessions can contain duplicate `summary`-type JSONL entries. When a sub-agent was idle, Claude Code re-emitted the same summary entry on every tick of the sub-agent loop, resulting in:

- Duplicate `CompactMsg` blocks rendered in the UI
- Inflated token counts
- Potential conversation tree corruption

Root Cause
Claude Code v2.1.128 fixed the server-side emission, but JSONL files written before that version remain on disk with the duplicate entries. The parser needs to handle them gracefully.
Fix
In `read_session_incremental` (`src-tauri/src/parser/session.rs`), a `HashSet<(String, String, String)>` now tracks seen `(agentName, teamName, summary_text)` triples. When a `summary` entry's key is already in the set, it is skipped; only the first occurrence is kept.

Deduplication happens after the live-chain UUID resolution step and operates only on the classification pass, so `raw_entries` is left intact. This preserves the UUID parent-chain walk (`resolve_live_chain_uuids`), which must see every entry to correctly build the live-branch set.

All other entry types (`user`, `assistant`, `system`, etc.) are unaffected.

Tests Added
Four new unit tests in `src-tauri/src/parser/session.rs`:

- `duplicate_summary_entries_produce_single_compact_msg`: identical duplicates collapse to a single `CompactMsg`
- `distinct_summary_entries_are_all_kept`: distinct summaries are all kept as separate `CompactMsg`s
- `duplicate_summary_entries_from_different_agents_are_kept_separately`: each agent keeps its own `CompactMsg`s; 3rd entry (dup of agent1) skipped
- `summary_dedup_does_not_affect_non_summary_entries`: `user` entry and non-summary types pass through unchanged

Checklist
- `cargo fmt` clean
- `cargo clippy` clean