Join segment compression threads before shutdown #13

@erskingardner

Problem

SegmentWriter::seal compresses sealed segments by spawning a background thread, but the writer does not retain or join the JoinHandle.

Drop also calls seal, so process shutdown can return while gzip compression is still in progress. If the process exits at that point, Pensieve can leave a partially written .notepack.gz, an orphaned uncompressed .notepack, or miss the sealed-segment notification that the ClickHouse indexer expects after compression completes.

Why this matters

Segment files are the canonical archive. Compression must complete deterministically before shutdown is considered clean, especially before publishing or syncing segments for research access.

Relevant code

  • crates/pensieve-ingest/src/pipeline/segment.rs
    • seal
    • compress_file_static
    • Drop for SegmentWriter
  • docs/stability_fixes_plan.md already describes this class of risk under segment compression shutdown behavior.

Suggested fix

Track outstanding compression workers in SegmentWriter and join them during explicit shutdown and Drop.
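A minimal sketch of that tracking, using only the standard library. The struct below is a hypothetical stand-in, not the real SegmentWriter: the `workers` field and the `spawn_compression`, `finish`, and `pending` names are assumptions made for illustration, and the closure stands in for whatever compress_file_static actually does.

```rust
use std::io;
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for SegmentWriter; only worker tracking is shown.
pub struct SegmentWriter {
    /// Handles of compression threads that have not been joined yet.
    workers: Vec<JoinHandle<io::Result<()>>>,
}

impl SegmentWriter {
    pub fn new() -> Self {
        Self { workers: Vec::new() }
    }

    /// Spawn a compression job and retain its handle instead of detaching it.
    pub fn spawn_compression<F>(&mut self, job: F)
    where
        F: FnOnce() -> io::Result<()> + Send + 'static,
    {
        self.workers.push(thread::spawn(job));
    }

    /// Number of workers that have been spawned but not yet joined.
    pub fn pending(&self) -> usize {
        self.workers.len()
    }

    /// Explicit finalization: block until every worker has finished.
    /// All workers are joined even if one fails; the first error is returned.
    pub fn finish(&mut self) -> io::Result<()> {
        let mut first_err: Option<io::Error> = None;
        for handle in self.workers.drain(..) {
            // A panicked worker is reported as an I/O error rather than re-panicking.
            let result = handle.join().unwrap_or_else(|_| {
                Err(io::Error::new(
                    io::ErrorKind::Other,
                    "compression worker panicked",
                ))
            });
            if let Err(e) = result {
                if first_err.is_none() {
                    first_err = Some(e);
                }
            }
        }
        match first_err {
            Some(e) => Err(e),
            None => Ok(()),
        }
    }
}

impl Drop for SegmentWriter {
    fn drop(&mut self) {
        // Drop cannot return an error, so the result is discarded here.
        // Callers that care about failures should call finish() explicitly.
        let _ = self.finish();
    }
}
```

Because `finish` drains the handle list, calling it explicitly and then dropping the writer joins each worker exactly once. A real implementation would probably log the error in Drop instead of discarding it.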

Consider writing compressed output to a temporary path first, then atomically renaming to .notepack.gz only after GzEncoder::finish succeeds. This would prevent incomplete gzip files from looking like finished archive objects.
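The write-then-rename pattern could look roughly like the sketch below. It uses only the standard library so it stays self-contained: `publish_atomically` and `tmp_path_for` are hypothetical helper names, the `.tmp` suffix is an assumption, and the `write_body` closure stands in for the real gzip step (in the actual code it would wrap the writer in a GzEncoder and call finish before returning).

```rust
use std::fs::{self, File};
use std::io::{self, BufWriter, Write};
use std::path::{Path, PathBuf};

/// Hypothetical helper: derive the temporary path by appending ".tmp",
/// so the temp file lives in the same directory as the final path.
pub fn tmp_path_for(final_path: &Path) -> PathBuf {
    let mut name = final_path.as_os_str().to_owned();
    name.push(".tmp");
    PathBuf::from(name)
}

/// Hypothetical helper: write to a temporary file, and rename it to
/// `final_path` only after `write_body` and the flush/sync have succeeded.
/// On any failure the temporary file is removed and `final_path` is untouched.
pub fn publish_atomically<F>(final_path: &Path, write_body: F) -> io::Result<()>
where
    F: FnOnce(&mut dyn Write) -> io::Result<()>,
{
    let tmp_path = tmp_path_for(final_path);

    let result = (|| -> io::Result<()> {
        let mut out = BufWriter::new(File::create(&tmp_path)?);
        // In the real code this is where compression would run to completion.
        write_body(&mut out)?;
        out.flush()?;
        // Ask the OS to persist the data before the file becomes visible.
        out.get_ref().sync_all()?;
        drop(out); // close the file before renaming it
        // Rename within one directory replaces the destination in a single
        // step on POSIX filesystems.
        fs::rename(&tmp_path, final_path)
    })();

    if result.is_err() {
        // Best-effort cleanup; the original error is what gets reported.
        let _ = fs::remove_file(&tmp_path);
    }
    result
}
```

With this shape, a reader of the archive directory sees either no .notepack.gz or a complete one. Keeping the temporary file in the same directory matters because a rename across filesystems is not atomic. Any indexer notification would be sent only after `publish_atomically` returns `Ok`.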

Acceptance criteria

  • Explicit finalization waits for all segment compression to complete.
  • Dropping SegmentWriter does not leave compression threads detached.
  • A compressed segment is only exposed at the final .notepack.gz path after successful compression.
  • Existing indexer notifications still fire after compression completes.
  • Tests cover shutdown/finalization with compression enabled.
  • just precommit passes before merging.
