Problem
SegmentWriter::seal compresses sealed segments by spawning a background thread, but the writer does not retain or join the JoinHandle.
Drop also calls seal, so process shutdown can return while gzip compression is still in progress. If the process exits at that point, Pensieve can leave a partially written .notepack.gz, an orphaned uncompressed .notepack, or miss the sealed-segment notification that the ClickHouse indexer expects after compression completes.
Why this matters
Segment files are the canonical archive. Compression must complete deterministically before shutdown is considered clean, especially before publishing or syncing segments for research access.
Relevant code
crates/pensieve-ingest/src/pipeline/segment.rs
seal
compress_file_static
Drop for SegmentWriter
docs/stability_fixes_plan.md already describes this class of risk under segment compression shutdown behavior.
Suggested fix
Track outstanding compression workers in SegmentWriter and join them during explicit shutdown and Drop.
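One way the tracking could look, as a minimal sketch rather than the actual Pensieve code: the `SegmentWriter` below is a hypothetical stand-in that only models worker bookkeeping, `seal` takes the compression job as a closure instead of calling `compress_file_static`, and `finish` is an assumed name for the explicit shutdown method. The real `Drop` also seals the open segment, which would happen before the join.

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for the real SegmentWriter; only the worker
/// tracking is modelled here.
struct SegmentWriter {
    workers: Vec<JoinHandle<()>>,
}

impl SegmentWriter {
    fn new() -> Self {
        Self { workers: Vec::new() }
    }

    /// Spawn the compression job and retain its handle instead of
    /// letting the thread detach.
    fn seal<F>(&mut self, compress: F)
    where
        F: FnOnce() + Send + 'static,
    {
        self.workers.push(thread::spawn(compress));
    }

    /// Explicit shutdown (assumed name): block until every outstanding
    /// worker has finished. Returns how many workers panicked.
    fn finish(&mut self) -> usize {
        self.workers
            .drain(..)
            .map(|handle| handle.join())
            .filter(|result| result.is_err())
            .count()
    }
}

impl Drop for SegmentWriter {
    fn drop(&mut self) {
        // Safety net: join anything an explicit finish() did not cover.
        // After finish() the list is empty, so this is a no-op.
        let _ = self.finish();
    }
}
```

Because `finish` drains the handle list, calling it explicitly and then dropping the writer does not join anything twice. Joining in `Drop` can block shutdown for as long as the slowest compression takes, which is the intended trade-off here.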
Consider writing compressed output to a temporary path first, then atomically renaming to .notepack.gz only after GzEncoder::finish succeeds. This would prevent incomplete gzip files from looking like finished archive objects.
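The temporary-path idea can be sketched with the standard library alone. This is an illustration under assumptions, not the existing implementation: `write_then_rename` and the `.tmp` suffix are hypothetical, and the `encode` closure stands in for the gzip step (for example, wrapping the file in a `GzEncoder` and calling `finish` before returning).

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Produce `final_path` by way of a sibling temporary file.
/// `encode` is a hypothetical stand-in for the compression step.
fn write_then_rename<F>(final_path: &Path, encode: F) -> io::Result<()>
where
    F: FnOnce(&mut File) -> io::Result<()>,
{
    // Keep the temporary file in the same directory as the target so
    // the rename stays on one filesystem.
    let mut tmp_name = final_path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);

    let result = File::create(&tmp_path).and_then(|mut file| {
        encode(&mut file)?;
        file.flush()?;
        // Push the data to disk before the file becomes visible
        // under its final name.
        file.sync_all()
    });

    match result {
        // Only a fully written file is exposed at the final path.
        Ok(()) => fs::rename(&tmp_path, final_path),
        Err(err) => {
            // Best-effort cleanup; the original error is what matters.
            let _ = fs::remove_file(&tmp_path);
            Err(err)
        }
    }
}
```

On POSIX systems a rename within one filesystem replaces the destination atomically, so a reader sees either no `.notepack.gz` or a complete one. A crash mid-compression leaves only the `.tmp` file, which a startup sweep could delete.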
Acceptance criteria
- Explicit finalization waits for all segment compression to complete.
- Dropping SegmentWriter does not leave compression threads detached.
- A compressed segment is only exposed at the final .notepack.gz path after successful compression.
- Existing indexer notifications still fire after compression completes.
- Tests cover shutdown/finalization with compression enabled.
- just precommit passes before merging.