feat: caller-owned serial CSR batched/buffered two-stage primitives (ordvec 0.5.0)#208
feat: caller-owned serial CSR batched/buffered two-stage primitives (ordvec 0.5.0)#208Fieldnote-Echo wants to merge 12 commits into
Conversation
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…for buffer reuse Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
… stage-1) Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…havior-preserving) Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…ee serial batched rerank) Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
|
/agentic_review |
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Code Review by Qodo
Context used 1. Unsorted candidates behavior undocumented
|
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
PR Summary by QodoAdd caller-owned serial CSR batched two-stage search primitives (ordvec 0.5.0) WalkthroughsDescription• Add serial CSR candidate generation and buffered batched subset rerank APIs. • Introduce reusable scratch/TopK reuse to eliminate rerank allocations after warmup. • Expand determinism/edge-case tests and bump release/docs to ordvec 0.5.0. Diagramgraph TD
Caller(["Caller runtime"]) -->|"queries,m"| Sign["SignBitmap"] -->|"CSR candidates"| CSR[("CandidateBatch") ] -->|"offsets+candidates"| Rank["RankQuant"] -->|"k"| Out["Scores/indices"]
Caller -->|"reuse"| Scratch[["SubsetScratch"]] --> Rank
subgraph Legend
direction LR
_caller(["Caller"]) ~~~ _mod["Module"] ~~~ _buf[("Buffer") ] ~~~ _scratch[["Scratch"]]
end
High-Level AssessmentThe following are alternative approaches to this PR: 1. Provide a bundled two-stage helper API
2. Use Vec> for candidates instead of CSR
3. Expose an iterator/streaming rerank interface
Recommendation: Keep the PR’s CSR + allocation-free File ChangesEnhancement (4)
Tests (2)
Documentation (3)
Other (9)
|
There was a problem hiding this comment.
Code Review
This pull request bumps the version of the ordvec crate and its bindings to 0.5.0 and introduces caller-owned serial batched/buffered two-stage primitives. These include SignBitmap::top_m_candidates_batched_serial_csr for serial CSR candidate generation, and RankQuant::search_asymmetric_subset_batched_serial (along with its allocation-free _into variant) for serial batched subset reranking, supported by the new CandidateBatch and SubsetScratch types. The review feedback focuses on two performance optimizations in src/quant.rs: first, avoiding redundant zero-initialization of the sub_packed scratch buffer by using reserve and extend_from_slice instead of resize and copy_from_slice; second, eliminating the redundant full-buffer preclear of output scores and indices by clearing them only for empty rows within the query loop.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
Address gemini review on #208 (two medium findings): - subset_rerank_row_into gather: clear + reserve + extend_from_slice instead of resize(_, 0) + indexed copy, avoiding zero-init of bytes immediately overwritten (still allocation-free after warmup). - search_asymmetric_subset_batched_serial_into: drop the redundant whole-buffer preclear; each row is written once (non-empty via finalize_into_with_scratch, empty rows sentinel-padded in-loop). Validation still precedes all writes, so the validation-panic-leaves-buffers-intact guarantee holds. No behavior change; full suite + cross-tier parity + allocation-free tests green. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
|
/agentic_review |
|
Code review by qodo was updated up to the latest commit 5ea2804 |
Address qodo review on #208 (requirement gap): the CSR per-row length bound (each row, duplicates included, must be <= self.len()) and the full panic contract were undocumented on search_asymmetric_subset_batched_serial_into — surprising for callers supplying their own candidate sets. Document the cap and expand the panic list; cross-reference from the allocating wrapper. Docs-only. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
|
/agentic_review |
|
Code review by qodo was updated up to the latest commit 22551af |
|
@codex review |
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
ordvec 0.5.0 — Track 1: caller-owned serial CSR batched/buffered two-stage primitives
Adds the additive, caller-owned (no internal rayon) serial CSR primitives that downstream engines need to run a fully self-scheduled two-stage
SignBitmap → RankQuantsearch, removing per-query allocation/orchestration overhead from the rerank stage.Added (all additive / semver-minor)
SignBitmap::top_m_candidates_batched_serial_csr(queries, m) -> CandidateBatch— serial CSR candidate generation (stage 1).CandidateBatch { candidates: Vec<u32>, offsets: Vec<usize> }with documented CSR invariants +query_count/candidates_for_query/is_empty/has_no_candidates.RankQuant::search_asymmetric_subset_batched_serial(queries, offsets, candidates, k) -> SearchResults— allocating ergonomic batched rerank (stage 2).RankQuant::search_asymmetric_subset_batched_serial_into(.., &mut SubsetScratch, &mut out_scores, &mut out_indices)— allocation-free after warmup; the integration contract for runtimes that own their own thread pool (call from a bounded pool, GIL released).SubsetScratch(opaque reusable scratch).TopK::{reset_with_tie_keys, finalize_into_with_scratch},l2_normalise_into.Design
top_m_candidates_batched/search_asymmetric*are unchanged. No bundled two-stage helper — the two CSR primitives stay composable.offsets+ flatcandidates), notVec<Vec<u32>>.out_k = k.min(len), underfull rows sentinel-padded (NEG_INFINITY/-1); duplicate candidate ids scored independently; tie policy(score desc, global id asc). Hard asserts on contract violation, validated before any output write.search_asymmetric_subsetrefactored to share the single-row helper — behavior-preserving (full existing suite green).Tests / gate
*_into == *_serial, full-candidate-set ==search_asymmetric, duplicate[7,7,3,7]parity, malformed-CSR / OOB / non-finite panics, edges (k=0/nq=0/n=0/k>n/m>n/m=0), allocation-free (scratch capacity stable across calls), cross-tier scalar-reference parity, and adeterminism_contracttie-policy case for the new path.fmt,clippy --all-targets --all-features -D warnings,test(+experimental, +--no-default-features), MSRV 1.89,--locked,RUSTDOCFLAGS=-D warnings doc, doc-tests,+nightly fuzz build.Release
RELEASING.md) + CHANGELOG. Publish remains gated — nocargo publish/maturin publishin this PR.Closes #115. Addresses #116 (caller-owned candidate/filter surface) and #125 (host-controlled parallelism) — maintainer to confirm full closure.
🤖 Generated with Claude Code