feat(experimental): stateless dense bucket-overlap contingency + projection API (#219)#224
Conversation
…ection API (#219) New experimental-gated `Contingency` (src/contingency.rs): the full nb×nb bucket-overlap table C[a][b] = |{coords: query∈a ∧ doc∈b}| for two equal-length &[u8] bucket-code slices, built in one O(dim) histogram pass. Projections mirror ordgraph::edge: top_overlap / diagonal_agreement / band_agreement / top_group_overlap / bucket_l1_distance / coarsened_counts / rankquant_symmetric_score + a general project(&weights) for learned/custom nb×nb matrices, and a Projection enum. Stateless — no index, no persistence, never wired into a search path; this is the pairwise-evidence container ordgraph migrates to. Kernel: scalar reference + AVX-512 (avx512f/bw/vpopcntdq) histogram with a masked 64-byte tail (live-lane masking avoids tail bucket-0 false matches), runtime is_x86_feature_detected dispatch matching #[target_feature], portable scalar fallback. #![deny(unsafe_op_in_unsafe_fn)] honored; no new deps. bucket_l1_distance returns u64 (the distance-weighted sum (nb-1)·dim overflows u32 for accepted dim up to u32::MAX). Counts are u32 (a cell <= dim). Tests reproduce ordgraph::edge's exact projection values + simd==scalar parity + constructor validation + the u32-overflow regression. Behind `experimental`, absent from the default build. Verified: fmt/clippy(-D warnings)/test green. Refs #219. Targets 0.6 (may land in 0.5.0 — maintainer decides). Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Code Review
This pull request introduces the Contingency struct and Projection enum under the experimental feature gate, providing a stateless dense bucket-overlap contingency table for two equal-length bucket codes with both scalar and AVX-512 SIMD implementations. The reviewer suggested several performance optimizations: pre-broadcasting bucket values into registers before the loop in the AVX-512 kernel to avoid redundant splat operations, directly computing the valid range of doc buckets in band_agreement to avoid branching, and factoring out the constant row term in rankquant_symmetric_score to reduce multiplications.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
- AVX-512 histogram kernel: precompute the nb bucket-value broadcast vectors once before the 64-byte block loop instead of recomputing set1_epi8 per block per bucket (gemini high). - band_agreement: iterate only the in-band columns instead of scanning all columns with an abs_diff filter. - rankquant_symmetric_score: factor the query weight out of the inner loop (nb multiplies instead of nb²). All exact: simd==scalar parity, rankquant-direct-sum parity, and projection parity tests unchanged and green. fmt/clippy clean. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
|
/agentic_review |
Code Review by Qodo
Context used 1.
|
…flow) (#224) radius is an uncapped public parameter (Projection::BandAgreement{radius}); a near-usize::MAX value overflowed qb + radius (panic in debug, silent wrap to a wrong/zero total in release). Use qb.saturating_add(radius). Regression test asserts band_agreement(usize::MAX) == total_count. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
|
/agentic_review |
|
Code review by qodo was updated up to the latest commit f965bf5 |
|
/agentic_review |
|
Code review by qodo was updated up to the latest commit f965bf5 |
qodo correctly flagged the infallible nb*nb allocation: nb is a caller-supplied usize (codes are u8, but nb is independent), so a large nb (e.g. 1<<20) would allocate a terabyte-scale table and abort the host. Cap nb at the u8 code domain (<=256) before the vec! — >256 buckets is also meaningless for u8 codes. (I had wrongly dismissed this as a false-positive; it is real.) Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…(Codex) The nb<=256 bound (36d0a0b) rejected the large_nb_uses_scalar_and_skips_range_scan test's nb=300. nb=256 is the new max and is still > 255, so it exercises the same behavior: find_out_of_range is skipped (every u8 code is < 256) and the scalar path is taken (nb > 16). Full suite green (I had filtered the prior run to the new test and missed this — fixed). Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Per qodo finding #2 (PR #224): the comments in lib.rs for MultiBucketBitmap and Contingency/Projection did not make explicit that: - MultiBucketBitmap is NOT the default single-score retrieval path - MultiBucketBitmap is gated behind the non-default `experimental` feature, is unstable, and is excluded from semver guarantees - Contingency/Projection are the stable side of the `experimental` gate and ARE covered by semver guarantees Expanded the lib.rs comments to include an explicit warning on MultiBucketBitmap's niche (bilinear decomposition research, not production retrieval), its storage overhead (2-4x vs RankQuant), and its instability. Clarified that Contingency/Projection are the stable surface. Added matching feature-gate and stability notes to the contingency.rs module-level docs. Doc-only change; no logic touched. Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…e references) Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
|
/agentic_review |
|
Code review by qodo was updated up to the latest commit af3aab1 |
Signed-off-by: Nelson Spence <nelson@projectnavi.ai> # Conflicts: # src/lib.rs
…atched projections (#219) (#225) * feat(experimental): stateless dense bucket-overlap contingency + projection API (#219) New experimental-gated `Contingency` (src/contingency.rs): the full nb×nb bucket-overlap table C[a][b] = |{coords: query∈a ∧ doc∈b}| for two equal-length &[u8] bucket-code slices, built in one O(dim) histogram pass. Projections mirror ordgraph::edge: top_overlap / diagonal_agreement / band_agreement / top_group_overlap / bucket_l1_distance / coarsened_counts / rankquant_symmetric_score + a general project(&weights) for learned/custom nb×nb matrices, and a Projection enum. Stateless — no index, no persistence, never wired into a search path; this is the pairwise-evidence container ordgraph migrates to. Kernel: scalar reference + AVX-512 (avx512f/bw/vpopcntdq) histogram with a masked 64-byte tail (live-lane masking avoids tail bucket-0 false matches), runtime is_x86_feature_detected dispatch matching #[target_feature], portable scalar fallback. #![deny(unsafe_op_in_unsafe_fn)] honored; no new deps. bucket_l1_distance returns u64 (the distance-weighted sum (nb-1)·dim overflows u32 for accepted dim up to u32::MAX). Counts are u32 (a cell <= dim). Tests reproduce ordgraph::edge's exact projection values + simd==scalar parity + constructor validation + the u32-overflow regression. Behind `experimental`, absent from the default build. Verified: fmt/clippy(-D warnings)/test green. Refs #219. Targets 0.6 (may land in 0.5.0 — maintainer decides). Signed-off-by: Nelson Spence <nelson@projectnavi.ai> * feat(experimental): indexed MultiBucketBitmap contingency kernels + batched projections (#219) Stacked on feat/contingency-dense (parity test cross-checks the dense Contingency). Indexed counterpart to the stateless API — one query bucket code vs many doc bitmaps, emitting tables/projections per doc: - contingency_row(q_bitmaps, doc_idx) -> full nb×nb table |Q_a ∩ D_b|, single pass. - diagonal_overlap_row(..) -> the nb diagonal cells (nb popcount-AND passes, not nb²). - project_all_batched(q_bitmaps, &[&weights]) -> docs×projections: builds each doc's table ONCE then applies every weight matrix to the cached integers (no per-projection rescan, unlike bilinear_score). rayon over docs. Plus diagonal_weights()/banded_weights() weight-matrix constructors. Kernels: scalar + AVX-512 VPOPCNTDQ (contingency/diagonal_accumulate_avx512vpop) mirroring bitmap.rs masked-tail popcount-AND, runtime-dispatched, portable fallback; #![deny(unsafe_op_in_unsafe_fn)] honored; no new deps; not wired into any search path. Correctness gate: scalar == SIMD == diagonal == dense Contingency, EXACT integer equality across dims 384/768/1024 (full + masked tail) × bits {2,4}. bench_contingency.rs (SYNTHETIC, 3 regimes): build-once/project-many 5.7-6.5×, no-rescan 4.4-5.5×, SIMD 2.3-3.4×. Verified: fmt/clippy(-D warnings)/test(experimental + default) green. Behind `experimental`. Refs #219. Targets 0.6 (may land in 0.5.0 — maintainer decides). Signed-off-by: Nelson Spence <nelson@projectnavi.ai> * perf(contingency): apply gemini review optimizations (#224) - AVX-512 histogram kernel: precompute the nb bucket-value broadcast vectors once before the 64-byte block loop instead of recomputing set1_epi8 per block per bucket (gemini high). - band_agreement: iterate only the in-band columns instead of scanning all columns with an abs_diff filter. - rankquant_symmetric_score: factor the query weight out of the inner loop (nb multiplies instead of nb²). All exact: simd==scalar parity, rankquant-direct-sum parity, and projection parity tests unchanged and green. fmt/clippy clean. Signed-off-by: Nelson Spence <nelson@projectnavi.ai> * perf(multi_bucket): stack-allocate the per-doc projection table (#225) gemini: project_all_batched and its scalar twin allocated a Vec<u32> per document inside the rayon parallel map. nb is 2/4/16 (bits 1/2/4), so nb*nb <= 256 — use a stack [0u32; 256] sliced to nb*nb, removing the per-doc heap allocation and allocator contention from the batched hot loop. Exact parity unchanged (scalar==SIMD==diagonal==dense green). fmt/clippy clean. Signed-off-by: Nelson Spence <nelson@projectnavi.ai> * fix(contingency): saturating_add in band_agreement (qodo: radius overflow) (#224) radius is an uncapped public parameter (Projection::BandAgreement{radius}); a near-usize::MAX value overflowed qb + radius (panic in debug, silent wrap to a wrong/zero total in release). Use qb.saturating_add(radius). Regression test asserts band_agreement(usize::MAX) == total_count. Signed-off-by: Nelson Spence <nelson@projectnavi.ai> * fix(contingency): bound nb<=256 in Contingency::new (qodo) (#224) qodo correctly flagged the infallible nb*nb allocation: nb is a caller-supplied usize (codes are u8, but nb is independent), so a large nb (e.g. 1<<20) would allocate a terabyte-scale table and abort the host. Cap nb at the u8 code domain (<=256) before the vec! — >256 buckets is also meaningless for u8 codes. (I had wrongly dismissed this as a false-positive; it is real.) Signed-off-by: Nelson Spence <nelson@projectnavi.ai> * test(contingency): use nb=256 in large_nb test after the nb<=256 cap (Codex) The nb<=256 bound (36d0a0b) rejected the large_nb_uses_scalar_and_skips_range_scan test's nb=300. nb=256 is the new max and is still > 255, so it exercises the same behavior: find_out_of_range is skipped (every u8 code is < 256) and the scalar path is taken (nb > 16). Full suite green (I had filtered the prior run to the new test and missed this — fixed). Signed-off-by: Nelson Spence <nelson@projectnavi.ai> * docs: make MultiBucketBitmap experimental/unstable status explicit Per qodo finding #2 (PR #224): the comments in lib.rs for MultiBucketBitmap and Contingency/Projection did not make explicit that: - MultiBucketBitmap is NOT the default single-score retrieval path - MultiBucketBitmap is gated behind the non-default `experimental` feature, is unstable, and is excluded from semver guarantees - Contingency/Projection are the stable side of the `experimental` gate and ARE covered by semver guarantees Expanded the lib.rs comments to include an explicit warning on MultiBucketBitmap's niche (bilinear decomposition research, not production retrieval), its storage overhead (2-4x vs RankQuant), and its instability. Clarified that Contingency/Projection are the stable surface. Added matching feature-gate and stability notes to the contingency.rs module-level docs. Doc-only change; no logic touched. Signed-off-by: Nelson Spence <nelson@projectnavi.ai> * fix: remediate qodo findings on PR #225 (nb<=16 assert, experimental doc, speedup consistency) - Add release-active assert!(nb * nb <= 256, ...) before the stack-table slice in both project_all_batched and project_all_batched_scalar in src/multi_bucket.rs, making the nb<=16 invariant fail-loud locally instead of relying solely on MultiBucketBitmap::new's bits restriction. - Add a doc note to the indexed contingency/projection section header explaining that this surface is gated behind the non-default `experimental` feature, excluded from semver, and is not the default single-score retrieval path (RankQuant/Bitmap are preferred). - Fix inverted speedup output in examples/bench_contingency.rs: regime_a printed 1.0/speedup in the human table but speedup in DATA; regime_b printed 1.0/fastpath_speedup in the table column but fastpath_speedup in DATA. Both now print the non-inverted ratio consistently. Signed-off-by: Nelson Spence <nelson@projectnavi.ai> * docs: add MultiBucketBitmap non-default/experimental stability banner to module rustdoc qodo's 'Missing MultiBucketBitmap non-default docs' finding wants the experimental/non-default/semver-excluded note on the user-visible rustdoc, not an internal comment. Add a prominent stability banner to the module-level //! docs (what docs.rs renders): MultiBucketBitmap and the indexed contingency / projection kernels are gated behind the non-default `experimental` feature, are not stable public API, and are excluded from semver guarantees; the stable surface is the stateless dense Contingency / Projection API. Signed-off-by: Nelson Spence <nelson@projectnavi.ai> * docs: product-clean the contingency surface (remove internal-prototype references) Signed-off-by: Nelson Spence <nelson@projectnavi.ai> * docs: state MultiBucketBitmap is not a primary retrieval path (positioning) qodo's 'Missing MultiBucketBitmap non-default docs' finding asks the docs to explicitly position MultiBucketBitmap as niche — not the default single-score retrieval path — and name the alternatives that dominate. Extend the module banner with a 'Positioning' note: MultiBucketBitmap is a research/analysis substrate (never kernel-optimized for retrieval); for primary retrieval use RankQuant (symmetric/asymmetric), Bitmap (top-bucket popcount), and the two-stage candidate-gen -> rerank flow. The indexed contingency/projection surface is for analyzing bucket-overlap structure, not primary retrieval. Signed-off-by: Nelson Spence <nelson@projectnavi.ai> * docs: position MultiBucketBitmap in crate-root docs (project-doc surface) The retrieval-positioning note belongs on the crate-root docs.rs front page that every user sees, not only in the experimental-gated multi_bucket module (which a caller who never enables `experimental` never reads — yet they are exactly the audience choosing a retrieval path). Add the 'not a primary retrieval path; use RankQuant / Bitmap / two-stage' note to the lib.rs crate docs, right after the four headline substrate families. Uses a plain `MultiBucketBitmap` reference so the default (no-experimental) doc build stays link-clean; verified under RUSTDOCFLAGS=-D warnings both with and without the feature. Signed-off-by: Nelson Spence <nelson@projectnavi.ai> --------- Signed-off-by: Nelson Spence <nelson@projectnavi.ai> Co-authored-by: Navi Bot <267427491+project-navi-bot@users.noreply.github.com>
What
The stateless dense bucket-overlap contingency + projection API (#219, API 1 of 2),
experimental-gated.Contingency::new(&query_codes, &doc_codes, nb)builds the fullnb×nbtableC[a][b] = |{coords: query∈a ∧ doc∈b}|in one O(dim) histogram pass, then exposes projections mirroringordgraph::edge::Projection:top_overlap,diagonal_agreement,band_agreement(r),top_group_overlap(w),bucket_l1_distance,coarsened_counts(g),rankquant_symmetric_scoreproject(&weights)for learned/customnb×nbmatrices, + aProjectionenum.This is the pairwise-evidence container ordgraph migrates to (its
EdgeEvidence"this is evidence for X" semantics stay in ordgraph). Stateless — no index, no persistence, never wired into a search path.Kernel
Scalar reference + AVX-512 (avx512f/bw/vpopcntdq) histogram with a masked 64-byte tail (live-lane masking prevents tail bucket-0 false matches), runtime
is_x86_feature_detected!dispatch matching#[target_feature], portable scalar fallback.#![deny(unsafe_op_in_unsafe_fn)]honored, no new deps.Notes
bucket_l1_distancereturnsu64(the distance-weighted sum can exceedu32for accepteddim).experimental, absent from the default build.Background profiling (why this shape, and why
MultiBucketBitmapis not a retriever) lives on theexperiment/multibucket-profilebranch.