storage: expose reader/writer persistence APIs for index types #197

@Fieldnote-Echo

Context

The public persistence API is currently path-based: Rank::write/load, RankQuant::write/load, Bitmap::write/load, and SignBitmap::write/load. That is sufficient for simple local files, and OrdinalDB can use it for a directory-backed .odb bundle.

Longer-term, downstream wrappers may need to embed ordvec index bytes inside another container, write to temp files atomically, stream from object storage, or verify bytes before choosing a final path. Public write_to<W: Write> / read_from<R: Read + Seek> style APIs would let those wrappers avoid temp-file shims while keeping ordvec as the owner of the persisted format and validation logic.
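To make the "embed index bytes inside another container" case concrete, here is a minimal sketch of what a downstream wrapper could do once a stream API exists. The container layout (a little-endian u64 length prefix per section) and the helper names append_section and section_at are assumptions made up for this illustration. They are not part of ordvec or OrdinalDB.

```rust
use std::io::{self, Cursor};

/// Append one length-prefixed section to an in-memory container.
/// `write_body` stands in for a call such as `index.write_to(w)`
/// (hypothetical until the proposed API exists).
pub fn append_section<F>(container: &mut Vec<u8>, write_body: F) -> io::Result<()>
where
    F: FnOnce(&mut Vec<u8>) -> io::Result<()>,
{
    let mut body = Vec::new();
    write_body(&mut body)?;
    // Assumed layout: u64 little-endian byte length, then the raw bytes.
    container.extend_from_slice(&(body.len() as u64).to_le_bytes());
    container.extend_from_slice(&body);
    Ok(())
}

/// Return a `Read + Seek` view over the section that starts at `offset`,
/// together with the offset of the next section.
pub fn section_at(container: &[u8], offset: usize) -> io::Result<(Cursor<&[u8]>, usize)> {
    let eof = || io::Error::new(io::ErrorKind::UnexpectedEof, "section out of bounds");
    let body_start = offset.checked_add(8).ok_or_else(eof)?;
    let prefix = container.get(offset..body_start).ok_or_else(eof)?;
    let mut word = [0u8; 8];
    word.copy_from_slice(prefix);
    let len = u64::from_le_bytes(word);
    // Reject a declared length that runs past the end of the container.
    if len > (container.len() - body_start) as u64 {
        return Err(eof());
    }
    let body_end = body_start + len as usize;
    Ok((Cursor::new(&container[body_start..body_end]), body_end))
}
```

The Cursor over a sub-slice satisfies Read + Seek, so a read_from-style loader would see exactly the section bytes and could apply its usual trailing-garbage check without a temp file.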

Evidence

  • Module docs state the supported persistence API is path-based write() / load() on the index types: src/rank_io.rs:38-44.
  • Internal serializers/loaders already operate through BufWriter, BufReader, Read, Seek, and Write, but are pub(crate) and path-oriented: src/rank_io.rs:58-60, src/rank_io.rs:484-508, src/rank_io.rs:576-600, src/rank_io.rs:686-712, src/rank_io.rs:778-817.
  • Public index methods expose only path-based persistence: src/rank.rs:515-526, src/quant.rs:452-462, src/bitmap.rs:468-478, src/sign_bitmap.rs:307-317.

Proposed Shape

Sketch:

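use std::io::{Read, Seek, Write};

impl RankQuant {
    /// Serialize the index in the existing persisted format to any writer.
    pub fn write_to<W: Write>(&self, writer: W) -> std::io::Result<()>;
    /// Parse and validate an index from any seekable reader.
    pub fn read_from<R: Read + Seek>(reader: R) -> std::io::Result<Self>;
}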

Repeat for Rank, Bitmap, and SignBitmap. Path-based methods can delegate to stream methods.
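As a rough illustration of the delegation, the sketch below uses a hypothetical ToyIndex type with a made-up byte layout ("TOY1" magic, u32 count, u32 values). It is not ordvec's real format or implementation. It only shows path-based write/load becoming thin wrappers around the stream methods.

```rust
use std::fs::File;
use std::io::{self, BufReader, BufWriter, Read, Seek, Write};
use std::path::Path;

/// Hypothetical stand-in for an index type; not ordvec's real layout.
#[derive(Debug, PartialEq)]
pub struct ToyIndex {
    pub values: Vec<u32>,
}

const MAGIC: &[u8; 4] = b"TOY1"; // made-up magic, for illustration only

impl ToyIndex {
    /// Stream writer: owns the byte format and knows nothing about paths.
    pub fn write_to<W: Write>(&self, mut writer: W) -> io::Result<()> {
        if self.values.len() > u32::MAX as usize {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "too many values"));
        }
        writer.write_all(MAGIC)?;
        writer.write_all(&(self.values.len() as u32).to_le_bytes())?;
        for v in &self.values {
            writer.write_all(&v.to_le_bytes())?;
        }
        writer.flush()
    }

    /// Stream reader: all parsing and validation lives here.
    /// The Seek bound mirrors the proposed signature; this toy does not use it.
    pub fn read_from<R: Read + Seek>(mut reader: R) -> io::Result<Self> {
        let mut magic = [0u8; 4];
        reader.read_exact(&mut magic)?;
        if &magic != MAGIC {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "bad magic"));
        }
        let mut word = [0u8; 4];
        reader.read_exact(&mut word)?;
        let count = u32::from_le_bytes(word);
        // Grow incrementally so a bogus count cannot force a huge allocation.
        let mut values = Vec::new();
        for _ in 0..count {
            reader.read_exact(&mut word)?;
            values.push(u32::from_le_bytes(word));
        }
        Ok(Self { values })
    }

    /// Path-based API delegates to the stream writer.
    pub fn write(&self, path: impl AsRef<Path>) -> io::Result<()> {
        self.write_to(BufWriter::new(File::create(path)?))
    }

    /// Path-based API delegates to the stream reader.
    pub fn load(path: impl AsRef<Path>) -> io::Result<Self> {
        Self::read_from(BufReader::new(File::open(path)?))
    }
}
```

With this shape the file and in-memory paths produce identical bytes by construction, since there is only one serializer.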

Acceptance Criteria

  • Public stream APIs exist for all persisted index types.
  • Path-based write/load preserve their current behavior by delegating to stream APIs or sharing implementation.
  • Load validation remains identical: magic/version/dim/shape/payload/trailing-garbage checks and row invariants still return io::Error and never panic.
  • Tests roundtrip via Cursor<Vec<u8>> and compare to file roundtrip.
  • Tests cover malformed/truncated/trailing-garbage inputs through stream APIs.
  • Docs make clear that stream APIs do not change the on-disk format.
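The validation and test criteria above could be exercised along these lines. This is a sketch against a hypothetical ToyIndex with an invented header (magic, version byte, u32 row count), not ordvec's actual format. Dim/shape checks and row invariants are omitted. It shows one way the Seek bound can be used to compare the declared payload size with the bytes actually present, so truncated and trailing-garbage inputs both become io::Error.

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

const MAGIC: [u8; 4] = *b"TOY1"; // hypothetical magic
const VERSION: u8 = 1; // hypothetical format version

#[derive(Debug, PartialEq)]
pub struct ToyIndex {
    pub rows: Vec<u16>,
}

fn invalid(msg: &'static str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg)
}

impl ToyIndex {
    pub fn write_to<W: Write>(&self, mut w: W) -> io::Result<()> {
        w.write_all(&MAGIC)?;
        w.write_all(&[VERSION])?;
        // The toy assumes the row count fits in u32.
        w.write_all(&(self.rows.len() as u32).to_le_bytes())?;
        for r in &self.rows {
            w.write_all(&r.to_le_bytes())?;
        }
        w.flush()
    }

    pub fn read_from<R: Read + Seek>(mut r: R) -> io::Result<Self> {
        let mut header = [0u8; 9];
        r.read_exact(&mut header)?; // a short header yields UnexpectedEof
        if header[..4] != MAGIC[..] {
            return Err(invalid("bad magic"));
        }
        if header[4] != VERSION {
            return Err(invalid("unsupported version"));
        }
        let count = u32::from_le_bytes([header[5], header[6], header[7], header[8]]) as u64;

        // Compare declared payload size with what is really there, before allocating.
        let start = r.stream_position()?;
        let end = r.seek(SeekFrom::End(0))?;
        let remaining = end.saturating_sub(start);
        let expected = count * 2; // two bytes per u16 row; cannot overflow u64
        if remaining < expected {
            return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "truncated payload"));
        }
        if remaining > expected {
            return Err(invalid("trailing garbage"));
        }
        r.seek(SeekFrom::Start(start))?;

        let mut rows = Vec::with_capacity(count as usize);
        let mut word = [0u8; 2];
        for _ in 0..count {
            r.read_exact(&mut word)?;
            rows.push(u16::from_le_bytes(word));
        }
        Ok(Self { rows })
    }
}

/// Serialize into memory; a test would feed this to `Cursor::new`.
pub fn to_bytes(index: &ToyIndex) -> Vec<u8> {
    let mut buf = Vec::new();
    index.write_to(&mut buf).expect("writing to a Vec cannot fail");
    buf
}
```

A test suite would then mutate the output of to_bytes (flip a magic byte, drop the last byte, append a byte) and assert that read_from over a Cursor returns Err for each variant, alongside a file roundtrip that yields the same bytes.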

Non-goals

  • No new file/container format.
  • No OrdinalDB .odb bundle support inside ordvec.
  • No async I/O requirement.

Labels: enhancement (New feature or request)