Stream ClickHouse segment indexing in bounded batches #21

## Problem

`ClickHouseIndexer::read_segment` currently reads and parses an entire segment into a `Vec<EventRow>` before insertion. Large compressed segments can decompress to hundreds of megabytes, and materializing every event row on top of that can push peak memory much higher still.

The `ClickHouseConfig::batch_size` field exists, but indexing does not use it to bound memory: all parsed rows are written through a single insert flow, and only after the full segment has been loaded.

## Why this matters

The archive is intended to scale to very large segment files and research-sized datasets. Segment processing tools and indexing should share a bounded-memory reader so large archives can be processed safely.

## Relevant code

- `crates/pensieve-ingest/src/pipeline/clickhouse.rs`
  - `index_segment`
  - `read_segment`
  - `index_segment_file`
- `docs/stability_fixes_plan.md` already calls out this memory issue.

## Suggested implementation

Introduce a streaming segment reader that yields records/events one at a time or in configured batches.

Then update ClickHouse indexing to:

- read records incrementally
- insert rows in batches of at most `batch_size`
- avoid retaining the entire segment in memory
- reuse the same reader for research export tools where practical
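
The steps above could look roughly like the following. This is a minimal sketch, not the crate's actual API: `Batches`, `index_in_batches`, and the `insert` callback are hypothetical names, and newline-delimited text records stand in for the real segment format and `EventRow` parsing, neither of which is specified here. The point is only that at most `batch_size` records are held in memory at a time.

```rust
use std::io::{self, BufRead};

/// Hypothetical adapter: groups items from a fallible iterator into
/// batches of at most `batch_size` items.
pub struct Batches<I> {
    inner: I,
    batch_size: usize,
    done: bool,
}

impl<I> Batches<I> {
    pub fn new(inner: I, batch_size: usize) -> Self {
        assert!(batch_size > 0, "batch_size must be non-zero");
        Self { inner, batch_size, done: false }
    }
}

impl<T, E, I> Iterator for Batches<I>
where
    I: Iterator<Item = Result<T, E>>,
{
    type Item = Result<Vec<T>, E>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.done {
            return None;
        }
        let mut batch = Vec::with_capacity(self.batch_size);
        while batch.len() < self.batch_size {
            match self.inner.next() {
                Some(Ok(item)) => batch.push(item),
                // On a read/parse error, stop and surface the error.
                // The partially filled batch is dropped in this sketch.
                Some(Err(e)) => {
                    self.done = true;
                    return Some(Err(e));
                }
                // Input exhausted: emit whatever is left, then finish.
                None => {
                    self.done = true;
                    break;
                }
            }
        }
        if batch.is_empty() { None } else { Some(Ok(batch)) }
    }
}

/// Reads records incrementally and hands each bounded batch to `insert`
/// (a stand-in for the ClickHouse insert). Returns the number of records
/// processed. Assumption: one record per line.
pub fn index_in_batches<R: BufRead>(
    reader: R,
    batch_size: usize,
    mut insert: impl FnMut(&[String]) -> io::Result<()>,
) -> io::Result<usize> {
    let mut total = 0;
    for batch in Batches::new(reader.lines(), batch_size) {
        let batch = batch?;
        insert(&batch)?;
        total += batch.len();
        // `batch` is dropped here, so memory stays bounded by `batch_size`.
    }
    Ok(total)
}
```

Because `Batches` is generic over any iterator of `Result` items, the same adapter could wrap a decompressing segment reader and be shared with export or inspect tools, which only need to supply a different callback in place of `insert`.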

## Acceptance criteria

- ClickHouse indexing respects `ClickHouseConfig::batch_size`.
- Indexing does not materialize the full segment as one `Vec<EventRow>`.
- Existing indexing behavior is preserved for valid segments.
- Tests cover multi-batch segment indexing or the streaming reader directly.
- Shared segment-reading code can be reused by export/inspect tools.
- `just precommit` passes before merging.
