
feat(experimental): indexed MultiBucketBitmap contingency kernels + batched projections (#219) #225

Merged
project-navi-bot merged 17 commits into main from feat/contingency-indexed on Jun 15, 2026

Conversation

@Fieldnote-Echo

Owner

What

The indexed counterpart (#219, API 2 of 2), experimental-gated. Stacked on #224 — its parity test cross-checks the dense Contingency, so merge #224 first (this PR will retarget to main automatically).

Indexed methods on MultiBucketBitmap — one query bucket code vs many doc bitmaps:

  • contingency_row(q, doc_idx) — full nb×nb table per doc, single pass.
  • diagonal_overlap_row(..) — the nb diagonal cells (nb popcount-AND passes, not nb²).
  • project_all_batched(q, &[&weights]) — docs×projections, builds each doc's table once then applies all weight matrices (no per-projection rescan). rayon over docs.
  • diagonal_weights() / banded_weights() weight-matrix constructors.
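
To make the cell definitions above concrete, here is a minimal sketch that assumes each bucket is stored as a `Vec<u64>` bitmap. The names `popcount_and`, `contingency_row_sketch`, and `diagonal_row_sketch` are hypothetical and not the PR's API. This version loops over every bucket pair for clarity, whereas the PR describes a fused single-pass kernel.

```rust
/// Popcount of the bitwise AND of two equal-length bitmaps.
fn popcount_and(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x & y).count_ones()).sum()
}

/// Full nb x nb table, row-major: cell (a, b) = |Q_a AND D_b|.
/// `q` and `d` each hold one bitmap per bucket (illustrative layout).
fn contingency_row_sketch(q: &[Vec<u64>], d: &[Vec<u64>]) -> Vec<u32> {
    let nb = q.len();
    assert_eq!(d.len(), nb, "query and doc need the same bucket count");
    let mut table = vec![0u32; nb * nb];
    for a in 0..nb {
        for b in 0..nb {
            table[a * nb + b] = popcount_and(&q[a], &d[b]);
        }
    }
    table
}

/// Diagonal cells only: nb popcount-AND passes instead of nb * nb.
fn diagonal_row_sketch(q: &[Vec<u64>], d: &[Vec<u64>]) -> Vec<u32> {
    assert_eq!(q.len(), d.len());
    q.iter().zip(d).map(|(qa, da)| popcount_and(qa, da)).collect()
}
```

The diagonal of the full table and the output of the diagonal-only path should agree cell for cell, which is the kind of parity the Correctness section below checks.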

Kernel

Scalar + AVX-512 VPOPCNTDQ mirroring bitmap.rs masked-tail popcount-AND, runtime-dispatched, portable fallback. #![deny(unsafe_op_in_unsafe_fn)] honored, no new deps, not wired into any search path.
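
The runtime-dispatch pattern described here can be sketched as follows. This is an illustration only: it swaps in the long-stable `popcnt` feature instead of AVX-512 VPOPCNTDQ so that it compiles on older toolchains, and the function names are hypothetical.

```rust
#![deny(unsafe_op_in_unsafe_fn)]

/// Portable reference: popcount of the AND of two word slices.
fn popcount_and_scalar(a: &[u64], b: &[u64]) -> u64 {
    a.iter().zip(b).map(|(x, y)| u64::from((x & y).count_ones())).sum()
}

/// Same body, compiled with POPCNT enabled (stand-in for the SIMD kernel).
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "popcnt")]
unsafe fn popcount_and_popcnt(a: &[u64], b: &[u64]) -> u64 {
    a.iter().zip(b).map(|(x, y)| u64::from((x & y).count_ones())).sum()
}

/// Picks the accelerated path only when the CPU reports the feature.
fn popcount_and_dispatched(a: &[u64], b: &[u64]) -> u64 {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("popcnt") {
            // SAFETY: the runtime check matches the callee's #[target_feature].
            return unsafe { popcount_and_popcnt(a, b) };
        }
    }
    // Portable fallback for other CPUs and architectures.
    popcount_and_scalar(a, b)
}
```

Keeping the detected feature string identical to the `#[target_feature]` list is what makes the `unsafe` call sound; both paths must return the same integers.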

Correctness

scalar == SIMD == diagonal == dense Contingency, exact integer equality, across dims 384/768/1024 (full + masked tail) × bits {2,4}.

Benchmarks (bench_contingency.rs, SYNTHETIC, 3 regimes)

Measured speedups: build-once/project-many 5.7–6.5×; no-rescan 4.4–5.5×; SIMD 2.3–3.4×. The algorithmic wins (build-once, no-rescan) dominate; SIMD is a constant factor on top.

Refs #219. Targets 0.6; may land in 0.5.0 — maintainer decides.

…ection API (#219)

New experimental-gated `Contingency` (src/contingency.rs): the full nb×nb
bucket-overlap table C[a][b] = |{coords: query∈a ∧ doc∈b}| for two equal-length
&[u8] bucket-code slices, built in one O(dim) histogram pass. Projections mirror
ordgraph::edge: top_overlap / diagonal_agreement / band_agreement /
top_group_overlap / bucket_l1_distance / coarsened_counts / rankquant_symmetric_score
+ a general project(&weights) for learned/custom nb×nb matrices, and a Projection
enum. Stateless — no index, no persistence, never wired into a search path; this is
the pairwise-evidence container ordgraph migrates to.
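
A scalar sketch of the one-pass histogram and two of the projections described above. The free functions `build_table`, `project`, and `diagonal_agreement` are illustrative stand-ins, not the crate's actual signatures.

```rust
/// Build the nb x nb table C[a][b] (row-major) in one O(dim) pass.
fn build_table(query: &[u8], doc: &[u8], nb: usize) -> Vec<u32> {
    assert_eq!(query.len(), doc.len(), "code slices must be equal length");
    let mut table = vec![0u32; nb * nb];
    for (&q, &d) in query.iter().zip(doc) {
        let (a, b) = (q as usize, d as usize);
        assert!(a < nb && b < nb, "bucket code out of range");
        table[a * nb + b] += 1;
    }
    table
}

/// General projection: sum over cells of weights[a][b] * C[a][b].
fn project(table: &[u32], weights: &[f32]) -> f32 {
    assert_eq!(table.len(), weights.len());
    table.iter().zip(weights).map(|(&c, &w)| c as f32 * w).sum()
}

/// Diagonal agreement: coordinates where query and doc share a bucket.
fn diagonal_agreement(table: &[u32], nb: usize) -> u32 {
    (0..nb).map(|a| table[a * nb + a]).sum()
}
```

With identity weights, `project` returns the same value as `diagonal_agreement`, which shows how the named projections are special cases of the general weighted sum.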

Kernel: scalar reference + AVX-512 (avx512f/bw/vpopcntdq) histogram with a masked
64-byte tail (live-lane masking avoids tail bucket-0 false matches), runtime
is_x86_feature_detected dispatch matching #[target_feature], portable scalar
fallback. #![deny(unsafe_op_in_unsafe_fn)] honored; no new deps.
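
The tail problem can be shown without intrinsics. The sketch below emulates 64-lane byte-compare masks with plain `u64` bitmasks and zero-pads the final block, which is an assumption about how inactive lanes behave in a zeroing masked load; `eq_mask` and `histogram_blocks` are hypothetical names. Without the live-lane mask, the padding bytes look like bucket 0 on both sides and inflate cell (0, 0).

```rust
/// Bit i is set when block[i] == bucket (emulates a byte-compare mask).
fn eq_mask(block: &[u8; 64], bucket: u8) -> u64 {
    block
        .iter()
        .enumerate()
        .fold(0u64, |m, (i, &c)| m | (u64::from(c == bucket) << i))
}

/// Histogram over 64-byte blocks; `mask_tail` toggles live-lane masking.
fn histogram_blocks(query: &[u8], doc: &[u8], nb: usize, mask_tail: bool) -> Vec<u32> {
    assert_eq!(query.len(), doc.len());
    assert!(nb <= 256, "u8 codes cannot address more than 256 buckets");
    let mut table = vec![0u32; nb * nb];
    for start in (0..query.len()).step_by(64) {
        let len = (query.len() - start).min(64);
        // Zero-padded blocks: padding bytes are indistinguishable from bucket 0.
        let (mut qb, mut db) = ([0u8; 64], [0u8; 64]);
        qb[..len].copy_from_slice(&query[start..start + len]);
        db[..len].copy_from_slice(&doc[start..start + len]);
        // Only the first `len` lanes carry real coordinates.
        let live = if mask_tail && len < 64 { (1u64 << len) - 1 } else { u64::MAX };
        for a in 0..nb {
            let qm = eq_mask(&qb, a as u8) & live;
            for b in 0..nb {
                table[a * nb + b] += (qm & eq_mask(&db, b as u8)).count_ones();
            }
        }
    }
    table
}
```

For 70 coordinates that all sit in bucket 1, the masked run counts 70 in cell (1, 1) and nothing elsewhere, while the unmasked run also reports 58 phantom matches in cell (0, 0).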

bucket_l1_distance returns u64 (the distance-weighted sum (nb-1)·dim overflows
u32 for accepted dim up to u32::MAX). Counts are u32 (a cell <= dim).
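
A small sketch of why the accumulator is `u64`, assuming the distance weight of a cell is `|a - b|` (the function below is illustrative, not the crate's code). A single cell holding `u32::MAX` coordinates at distance 15 already exceeds `u32::MAX`.

```rust
/// Distance-weighted sum over a row-major nb x nb table of u32 counts.
/// The total is accumulated in u64 because (nb - 1) * dim can exceed u32::MAX.
fn bucket_l1_distance(table: &[u32], nb: usize) -> u64 {
    assert_eq!(table.len(), nb * nb);
    let mut total = 0u64;
    for a in 0..nb {
        for b in 0..nb {
            let dist = a.abs_diff(b) as u64;
            total += dist * u64::from(table[a * nb + b]);
        }
    }
    total
}
```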

Tests reproduce ordgraph::edge's exact projection values + simd==scalar parity +
constructor validation + the u32-overflow regression. Behind `experimental`,
absent from the default build. Verified: fmt/clippy(-D warnings)/test green.

Refs #219. Targets 0.6 (may land in 0.5.0 — maintainer decides).

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…atched projections (#219)

Stacked on feat/contingency-dense (parity test cross-checks the dense Contingency).

Indexed counterpart to the stateless API — one query bucket code vs many doc
bitmaps, emitting tables/projections per doc:
- contingency_row(q_bitmaps, doc_idx) -> full nb×nb table |Q_a ∩ D_b|, single pass.
- diagonal_overlap_row(..) -> the nb diagonal cells (nb popcount-AND passes, not nb²).
- project_all_batched(q_bitmaps, &[&weights]) -> docs×projections: builds each doc's
  table ONCE then applies every weight matrix to the cached integers (no per-projection
  rescan, unlike bilinear_score). rayon over docs.
Plus diagonal_weights()/banded_weights() weight-matrix constructors.
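
The build-once, project-many idea can be sketched as below. This is a sequential illustration under assumed types (one `Vec<u64>` bitmap per bucket); the PR parallelizes over docs with rayon, which is omitted here to keep the example dependency-free. `table_for` and `project_all_batched_sketch` are hypothetical names.

```rust
/// Row-major nb x nb table for one doc: cell (a, b) = popcount(Q_a AND D_b).
fn table_for(q: &[Vec<u64>], d: &[Vec<u64>]) -> Vec<u32> {
    let nb = q.len();
    assert_eq!(d.len(), nb);
    let mut table = vec![0u32; nb * nb];
    for a in 0..nb {
        for b in 0..nb {
            table[a * nb + b] = q[a]
                .iter()
                .zip(&d[b])
                .map(|(x, y)| (x & y).count_ones())
                .sum();
        }
    }
    table
}

/// docs x projections scores: each doc's bitmaps are scanned once,
/// then every weight matrix is applied to the cached integer table.
fn project_all_batched_sketch(
    q: &[Vec<u64>],
    docs: &[Vec<Vec<u64>>],
    weights: &[&[f32]],
) -> Vec<Vec<f32>> {
    docs.iter()
        .map(|d| {
            let table = table_for(q, d); // built once per doc
            weights
                .iter()
                .map(|w| {
                    assert_eq!(w.len(), table.len());
                    table
                        .iter()
                        .zip(w.iter())
                        .map(|(&c, &wt)| c as f32 * wt)
                        .sum::<f32>()
                })
                .collect()
        })
        .collect()
}
```

Adding another weight matrix costs only one more pass over nb × nb cached integers per doc, not another scan of the bitmaps.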

Kernels: scalar + AVX-512 VPOPCNTDQ (contingency/diagonal_accumulate_avx512vpop)
mirroring bitmap.rs masked-tail popcount-AND, runtime-dispatched, portable fallback;
#![deny(unsafe_op_in_unsafe_fn)] honored; no new deps; not wired into any search path.

Correctness gate: scalar == SIMD == diagonal == dense Contingency, EXACT integer
equality across dims 384/768/1024 (full + masked tail) × bits {2,4}.

bench_contingency.rs (SYNTHETIC, 3 regimes): build-once/project-many 5.7-6.5×,
no-rescan 4.4-5.5×, SIMD 2.3-3.4×.

Verified: fmt/clippy(-D warnings)/test(experimental + default) green. Behind
`experimental`. Refs #219. Targets 0.6 (may land in 0.5.0 — maintainer decides).

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

@qodo-code-review

qodo-code-review Bot commented Jun 14, 2026


Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (1)

Context used



Remediation recommended

1. Implicit nb<=16 assumption ✓ Resolved 🐞 Bug ⚙ Maintainability
Description
project_all_batched and project_all_batched_scalar slice a fixed [u32; 256] stack buffer as
[..nb*nb] without a local bound check; this will panic (slice OOB) if n_buckets is ever allowed
to exceed 16. This is currently prevented by MultiBucketBitmap::new restricting bits to 1|2|4,
but the dependency is implicit and fragile to future changes/refactors.
Code

src/multi_bucket.rs[R362-367]

+                // One accumulation pass over this doc's bitmaps. nb <= 16 ⇒
+                // nb*nb <= 256, so a stack table avoids a per-doc heap
+                // allocation inside the parallel map (allocator contention).
+                let mut table = [0u32; 256];
+                let table = &mut table[..nb * nb];
+                contingency_accumulate(q_bitmaps, doc, nb, qpb, table);
Evidence
The new optimization slices a fixed-size stack array by nb*nb (panic if nb*nb > 256). While
MultiBucketBitmap::new currently constrains bits to keep nb<=16, that safety condition is not
asserted where the slicing happens, making the code fragile if bucket-count support is expanded
later.

src/multi_bucket.rs[49-73]
src/multi_bucket.rs[336-370]
src/multi_bucket.rs[416-453]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`project_all_batched` (and its scalar twin) uses a fixed `[u32; 256]` stack buffer and then slices it with `[..nb * nb]`. This is only safe as long as `nb * nb <= 256` (i.e., `nb <= 16`). That invariant currently comes from `MultiBucketBitmap::new(bits)` restricting `bits` to `{1,2,4}`, but it is not asserted locally, so a future extension of supported `bits` could turn into a runtime panic inside the rayon loop.

### Issue Context
The current design intends to avoid per-doc heap allocation by using a stack buffer.

### Fix Focus Areas
- src/multi_bucket.rs[362-368]
- src/multi_bucket.rs[446-451]

### Suggested fix
Add an explicit guard before slicing, e.g.:
- `debug_assert!(nb * nb <= 256, "project_all_batched requires nb<=16 (nb={nb}) for stack table");`
(or an `assert!` if you want this to be a hard contract).

Optionally, centralize the constant (e.g., `const MAX_NB: usize = 16; const MAX_CELLS: usize = MAX_NB*MAX_NB;`) so any future change to supported `bits` forces touching this code.
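
One way the suggested guard and centralized constants could look, as a sketch. `with_stack_table` is a hypothetical helper that is not in the PR; it only illustrates asserting the bound next to the slice.

```rust
/// Largest supported bucket count (bits = 4 gives nb = 16).
const MAX_NB: usize = 16;
/// Stack table capacity derived from the same constant.
const MAX_CELLS: usize = MAX_NB * MAX_NB;

/// Runs `f` on a zeroed nb * nb stack table, failing loudly if nb is too large.
fn with_stack_table<R>(nb: usize, f: impl FnOnce(&mut [u32]) -> R) -> R {
    // Release-active guard: the slice below is only in bounds when nb <= MAX_NB.
    assert!(
        nb <= MAX_NB,
        "stack table requires nb <= {} (nb = {})",
        MAX_NB,
        nb
    );
    let mut table = [0u32; MAX_CELLS];
    f(&mut table[..nb * nb])
}
```

Because the capacity is derived from `MAX_NB`, widening the supported `bits` forces a change to the constant and not a silent out-of-bounds slice.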



2. Missing MultiBucketBitmap non-default docs 📎 Requirement gap ⚙ Maintainability
Description
The PR adds/expands MultiBucketBitmap’s indexed contingency + projection surface and documents
candidate-gen/recall-latency usage, but does not explicitly state (in project documentation) that
MultiBucketBitmap is niche and not the default single-score retrieval path (and which alternatives
dominate). This risks users adopting the experimental indexed API as a primary retrieval path,
contrary to the required positioning.
Code

src/multi_bucket.rs[R149-156]

+    /// Outer-product weights `(a − c)(b − c)` restricted to the band
+    /// `|a − b| <= half_width` (off-band entries zeroed). `half_width = 0`
+    /// keeps only the magnitude-weighted diagonal; `half_width >= nb − 1`
+    /// recovers the full [`Self::outer_product_weights`] matrix. Sweeping
+    /// `half_width` interpolates candidate-gen cost (non-zero band entries ⇒
+    /// popcount-AND passes per doc) between the diagonal and the exact
+    /// bilinear probe, tracing the recall/latency frontier.
+    pub fn banded_weights(&self, half_width: usize) -> Vec<f32> {
Evidence
Rule 6 requires explicit project documentation that MultiBucketBitmap is not the default
single-score retrieval path. The new docs/comments added around banded_weights and the indexed API
surface discuss candidate generation and performance frontiers, but do not provide the required
explicit positioning statement; existing docs mentioning MultiBucketBitmap also do not explicitly
cover the “not default single-score retrieval” requirement.
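
For readers unfamiliar with the banded construction quoted in the doc comment, here is a hedged sketch. The doc comment does not define `c`; this example assumes it is the bucket midpoint `(nb - 1) / 2`, and `banded_weights_sketch` is a free function written for illustration, not the method on `MultiBucketBitmap`.

```rust
/// Outer-product weights (a - c)(b - c), zeroed outside |a - b| <= half_width.
/// ASSUMPTION: c is the bucket midpoint (nb - 1) / 2; the source does not say.
fn banded_weights_sketch(nb: usize, half_width: usize) -> Vec<f32> {
    let c = (nb as f32 - 1.0) / 2.0;
    let mut w = vec![0.0f32; nb * nb];
    for a in 0..nb {
        for b in 0..nb {
            if a.abs_diff(b) <= half_width {
                w[a * nb + b] = (a as f32 - c) * (b as f32 - c);
            }
        }
    }
    w
}
```

With `half_width = 0` only the diagonal survives, and with `half_width >= nb - 1` no entry is zeroed by the band, matching the two limits the doc comment describes.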

Explicitly document MultiBucketBitmap is not the default single-score retrieval path
src/multi_bucket.rs[149-156]
src/multi_bucket.rs[233-250]
README.md[60-70]
docs/RANK_MODES.md[434-439]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The compliance checklist requires explicit documentation that `MultiBucketBitmap` is *not* the default single-score retrieval path and is intended only for niche workloads (e.g., one-query-vs-many-docs with many projections/full evidence). The PR adds new indexed contingency/projection APIs and comments about recall/latency tradeoffs, but the repo docs do not clearly add the required positioning and alternatives.

## Issue Context
`MultiBucketBitmap` is experimental and easy to misinterpret as a recommended retrieval primitive because new APIs and comments describe candidate-gen/frontier tradeoffs without an explicit “not the default path” statement.

## Fix Focus Areas
- README.md[60-70]
- docs/RANK_MODES.md[434-439]
- src/multi_bucket.rs[149-156]



3. Speedup output inconsistent ✓ Resolved 🐞 Bug ≡ Correctness
Description
In examples/bench_contingency.rs, the benchmark output inconsistently inverts computed speedup
ratios between the human-readable table and the emitted DATA line, so humans and downstream
tooling can interpret the same measurement in opposite directions. This occurs in both regime_a
(printing 1.0 / speedup in the table but speedup in DATA) and regime_b where
fastpath_speedup = dense_full / diag_dispatched is printed as 1.0 / fastpath_speedup in the “vs
dispatched” table column despite the header/convention and DATA using the non-inverted value.
Code

examples/bench_contingency.rs[R324-346]

+        let speedup = rebuild_each / build_once;
+        println!("# (a) ONE PAIR / {n_proj} PROJECTIONS — dense Contingency (API 1)");
+        println!("{:<34} {:>14} {:>14}", "approach", "us/pair", "speedup");
+        println!(
+            "{:<34} {:>14.3} {:>14}",
+            "build-once, project-many",
+            build_once * 1e6,
+            "1.00x (ref)"
+        );
+        println!(
+            "{:<34} {:>14.3} {:>13.2}x",
+            "rebuild-per-projection",
+            rebuild_each * 1e6,
+            1.0 / speedup
+        );
+        println!(
+            "DATA\ta\tbits={}\tbuild_once_us={:.3}\trebuild_each_us={:.3}\tn_proj={}\tspeedup={:.2}",
+            (nb as f32).log2() as u32,
+            build_once * 1e6,
+            rebuild_each * 1e6,
+            n_proj,
+            speedup
+        );
Evidence
The cited code paths define a ratio once (e.g., speedup = rebuild_each / build_once in regime_a,
and fastpath_speedup = dense_full / diag_dispatched in regime_b) and then print different
directions of that same ratio depending on output channel: the human table uses println!
formatting that emits the inverse (1.0 / speedup or 1.0 / fastpath_speedup), while the DATA
line prints the original ratio (speedup / fastpath_speedup). In regime_b, this inversion also
conflicts with the table header (“vs dispatched”) and the scalar row convention (`diag_scalar /
diag_dispatched`), demonstrating that the same metric is reported in opposite directions.

examples/bench_contingency.rs[324-346]
examples/bench_contingency.rs[387-422]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`examples/bench_contingency.rs` computes speedup ratios but reports them inconsistently: the human-readable table inverts the ratio (prints `1.0 / ratio`) while the `DATA` line prints the non-inverted ratio, causing humans and downstream tooling to read opposite interpretations of the same measurement. This affects both `regime_a` (`speedup = rebuild_each / build_once`) and `regime_b` where `fastpath_speedup = dense_full / diag_dispatched` is inverted in the “vs dispatched” table cell despite the header/convention and the `DATA` line using the non-inverted value.

## Issue Context
Downstream parsing likely relies on the `DATA` line while humans read the table; both should represent the same ratio direction. In `regime_b`, the header says “vs dispatched” and the scalar row prints `diag_scalar / diag_dispatched`, so the dense-full row should follow the same convention and print `dense_full / diag_dispatched` unless you intentionally choose the inverse and rename accordingly.
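
A sketch of one way to keep the two output channels in agreement: compute the ratio once and format both strings from that single value. The `speedup` and `report` helpers are hypothetical, and the field layout is simplified from the benchmark's actual lines.

```rust
/// Speedup of the fast path over the slow path (> 1 means the fast path wins).
fn speedup(slow_secs: f64, fast_secs: f64) -> f64 {
    slow_secs / fast_secs
}

/// Returns (human table row, machine DATA line) derived from one ratio.
fn report(build_once: f64, rebuild_each: f64) -> (String, String) {
    let s = speedup(rebuild_each, build_once);
    let row = format!(
        "{:<34} {:>14.3} {:>13.2}x",
        "rebuild-per-projection",
        rebuild_each * 1e6,
        s
    );
    let data = format!(
        "DATA\ta\tbuild_once_us={:.3}\trebuild_each_us={:.3}\tspeedup={:.2}",
        build_once * 1e6,
        rebuild_each * 1e6,
        s
    );
    (row, data)
}
```

Since neither format call recomputes or inverts the ratio, a reader of the table and a parser of the `DATA` line see the same direction.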

## Fix Focus Areas
- examples/bench_contingency.rs[324-346]
- examples/bench_contingency.rs[387-422]




@qodo-code-review


PR Summary by Qodo

Add indexed contingency kernels and batched projections to MultiBucketBitmap (experimental)
✨ Enhancement 🧪 Tests 🕐 40+ Minutes


Walkthroughs

Description
• Add per-doc indexed contingency table + diagonal fast path on MultiBucketBitmap.
• Add batched projections to reuse per-doc tables across many weight matrices.
• Add AVX-512 VPOPCNTDQ runtime-dispatched kernels with parity tests and benchmarks.
Diagram
graph TD
  A["bench_contingency example"] --> B["MultiBucketBitmap (API 2)"] --> C["contingency/diagonal accumulate"] --> D["runtime dispatch"]
  D --> E["AVX-512 VPOPCNTDQ kernel"]
  D --> F["scalar kernel"]
  B --> G["project_all_batched"]
  H["tests: parity + batching"] --> B
High-Level Assessment

The following are alternative approaches to this PR:

1. Reuse existing Bitmap overlap kernel via a shared primitive
  • ➕ Reduces duplication of masked-tail SIMD load/popcount logic
  • ➕ Centralizes unsafe SIMD code in one place, simplifying auditing
  • ➖ May require refactoring Bitmap internals into reusable building blocks
  • ➖ Harder to keep the nb×nb tiling fast if the primitive is too generic
2. Use portable SIMD (std::simd) instead of AVX-512 intrinsics
  • ➕ Less target-specific unsafe code; easier to maintain across platforms
  • ➕ Potentially clearer implementation with fewer cfg/target_feature gates
  • ➖ Current stable Rust SIMD support may not expose VPOPCNTDQ-equivalent performance
  • ➖ May regress peak throughput on AVX-512-capable hosts
3. Cache per-doc contingency tables inside the index
  • ➕ Could amortize table build cost across multiple queries/projections
  • ➕ May help workloads with repeated query patterns
  • ➖ Large memory overhead (O(n_docs * nb^2)) and cache invalidation complexity
  • ➖ Shifts index from compact bitmaps toward heavier per-doc materialization

Recommendation: The PR’s approach (build per-doc tables on demand, then batch-apply weight matrices) is the right tradeoff for issue #219: it captures the dominant algorithmic wins (build-once/project-many and no-rescan) without permanently increasing index memory. The main thing to consider is extracting shared masked-tail SIMD utilities to avoid long-term duplication with other bitmap kernels; portable SIMD is attractive but may not match AVX-512 VPOPCNTDQ performance today.


File Changes

Enhancement (2)
bench_contingency.rs Add synthetic benchmark for dense vs indexed contingency/projections +497/-0


• Introduces a reproducible, synthetic benchmark harness covering three regimes: build-once/project-many, diagonal fast-path over many docs, and batched multi-projection scoring without rescanning. Reports host SIMD capabilities and compares dispatched vs forced-scalar variants to quantify algorithmic and SIMD speedups.

examples/bench_contingency.rs


multi_bucket.rs Implement indexed contingency table kernels, diagonal fast path, and batched projections +747/-0


• Adds experimental indexed API methods to build per-doc nb×nb contingency tables ('contingency_row'), compute diagonal-only overlaps ('diagonal_overlap_row'), and score many projections per doc from one cached table ('project_all_batched', rayon-parallel). Implements scalar and AVX-512 VPOPCNTDQ masked-tail accumulation kernels with runtime dispatch, adds weight constructors ('diagonal_weights', 'banded_weights'), and introduces parity/batching correctness tests (scalar==SIMD==diagonal==dense Contingency).

src/multi_bucket.rs



@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces an indexed contingency and projection surface for MultiBucketBitmap (addressing issue #219), including AVX-512 VPOPCNTDQ optimized kernels with portable scalar fallbacks, batched projection methods, and a comprehensive profiling harness in examples/bench_contingency.rs. The review feedback recommends optimizing the parallel batched projection methods (project_all_batched and its scalar twin) by replacing heap-allocated temporary vectors with stack-allocated arrays to avoid allocator lock contention and overhead.


Comment thread src/multi_bucket.rs Outdated
Comment thread src/multi_bucket.rs Outdated
@codecov

codecov Bot commented Jun 14, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.


- AVX-512 histogram kernel: precompute the nb bucket-value broadcast vectors
  once before the 64-byte block loop instead of recomputing set1_epi8 per block
  per bucket (gemini high).
- band_agreement: iterate only the in-band columns instead of scanning all
  columns with an abs_diff filter.
- rankquant_symmetric_score: factor the query weight out of the inner loop
  (nb multiplies instead of nb²).

All exact: simd==scalar parity, rankquant-direct-sum parity, and projection
parity tests unchanged and green. fmt/clippy clean.
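
The factoring in the last bullet can be illustrated under a loudly stated assumption: that the symmetric score has the form of a double sum of `w[a] * w[b] * C[a][b]` with one weight per bucket. The source does not give the formula, so both functions below are hypothetical; integer weights are used so the parity check is exact.

```rust
/// Naive form: the query weight is multiplied in for every cell.
fn score_naive(table: &[u32], w: &[i64], nb: usize) -> i64 {
    let mut total = 0i64;
    for a in 0..nb {
        for b in 0..nb {
            total += w[a] * w[b] * i64::from(table[a * nb + b]);
        }
    }
    total
}

/// Factored form: the query weight is applied once per row.
fn score_factored(table: &[u32], w: &[i64], nb: usize) -> i64 {
    let mut total = 0i64;
    for a in 0..nb {
        let row: i64 = (0..nb)
            .map(|b| w[b] * i64::from(table[a * nb + b]))
            .sum();
        total += w[a] * row; // nb query-weight multiplies overall
    }
    total
}
```

Both forms compute the same integer, so an exact parity test between them guards the refactor.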

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
gemini: project_all_batched and its scalar twin allocated a Vec<u32> per
document inside the rayon parallel map. nb is 2/4/16 (bits 1/2/4), so nb*nb <= 256
— use a stack [0u32; 256] sliced to nb*nb, removing the per-doc heap allocation
and allocator contention from the batched hot loop. Exact parity unchanged
(scalar==SIMD==diagonal==dense green). fmt/clippy clean.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@Fieldnote-Echo

Owner Author

/agentic_review

@qodo-code-review

qodo-code-review Bot commented Jun 14, 2026


Code review by qodo was updated up to the latest commit f9df6e9

Fieldnote-Echo and others added 10 commits June 14, 2026 15:59
…flow) (#224)

radius is an uncapped public parameter (Projection::BandAgreement{radius}); a
near-usize::MAX value overflowed qb + radius (panic in debug, silent wrap to a
wrong/zero total in release). Use qb.saturating_add(radius). Regression test
asserts band_agreement(usize::MAX) == total_count.
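
A minimal sketch of the overflow-safe band bounds, assuming a row-major `nb × nb` table of `u32` counts. The free function below is illustrative and not the crate's method; it also iterates only the in-band columns instead of filtering all columns.

```rust
/// Sum of cells with |qb - db| <= radius, visiting only in-band columns.
fn band_agreement(table: &[u32], nb: usize, radius: usize) -> u64 {
    assert_eq!(table.len(), nb * nb);
    let mut total = 0u64;
    for qb in 0..nb {
        // saturating_* keeps the bounds valid even for radius near usize::MAX.
        let lo = qb.saturating_sub(radius);
        let hi = qb.saturating_add(radius).min(nb - 1);
        for db in lo..=hi {
            total += u64::from(table[qb * nb + db]);
        }
    }
    total
}
```

With `radius = usize::MAX` every column is in band, so the result equals the total count, which mirrors the regression test described in the commit message.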

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
qodo correctly flagged the infallible nb*nb allocation: nb is a caller-supplied
usize (codes are u8, but nb is independent), so a large nb (e.g. 1<<20) would
allocate a terabyte-scale table and abort the host. Cap nb at the u8 code domain
(<=256) before the vec! — >256 buckets is also meaningless for u8 codes.
(I had wrongly dismissed this as a false-positive; it is real.)
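
The cap described here might be sketched as follows. The error type and constructor name are hypothetical; the point is only that the bound is checked before the `vec!` allocation.

```rust
/// Largest bucket count addressable by u8 codes.
const MAX_BUCKETS: usize = 256;

#[derive(Debug, PartialEq)]
enum TableError {
    /// Caller asked for more buckets than u8 codes can address.
    TooManyBuckets(usize),
}

/// Validates nb first, so a huge caller-supplied nb never reaches the allocator.
fn new_table(nb: usize) -> Result<Vec<u32>, TableError> {
    if nb > MAX_BUCKETS {
        return Err(TableError::TooManyBuckets(nb));
    }
    // At most 256 * 256 u32 cells, i.e. 256 KiB.
    Ok(vec![0u32; nb * nb])
}
```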

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…(Codex)

The nb<=256 bound (36d0a0b) rejected the large_nb_uses_scalar_and_skips_range_scan
test's nb=300. nb=256 is the new max and is still > 255, so it exercises the same
behavior: find_out_of_range is skipped (every u8 code is < 256) and the scalar
path is taken (nb > 16). Full suite green (I had filtered the prior run to the
new test and missed this — fixed).

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
Per qodo finding #2 (PR #224): the comments in lib.rs for MultiBucketBitmap
and Contingency/Projection did not make explicit that:

- MultiBucketBitmap is NOT the default single-score retrieval path
- MultiBucketBitmap is gated behind the non-default `experimental` feature,
  is unstable, and is excluded from semver guarantees
- Contingency/Projection are the stable side of the `experimental` gate
  and ARE covered by semver guarantees

Expanded the lib.rs comments to include an explicit warning on
MultiBucketBitmap's niche (bilinear decomposition research, not production
retrieval), its storage overhead (2-4x vs RankQuant), and its instability.
Clarified that Contingency/Projection are the stable surface. Added matching
feature-gate and stability notes to the contingency.rs module-level docs.

Doc-only change; no logic touched.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…doc, speedup consistency)

- Add release-active assert!(nb * nb <= 256, ...) before the stack-table
  slice in both project_all_batched and project_all_batched_scalar in
  src/multi_bucket.rs, making the nb<=16 invariant fail-loud locally
  instead of relying solely on MultiBucketBitmap::new's bits restriction.
- Add a doc note to the indexed contingency/projection section header
  explaining that this surface is gated behind the non-default
  `experimental` feature, excluded from semver, and is not the default
  single-score retrieval path (RankQuant/Bitmap are preferred).
- Fix inverted speedup output in examples/bench_contingency.rs: regime_a
  printed 1.0/speedup in the human table but speedup in DATA; regime_b
  printed 1.0/fastpath_speedup in the table column but fastpath_speedup
  in DATA. Both now print the non-inverted ratio consistently.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
… to module rustdoc

qodo's 'Missing MultiBucketBitmap non-default docs' finding wants the
experimental/non-default/semver-excluded note on the user-visible rustdoc, not
an internal comment. Add a prominent stability banner to the module-level //!
docs (what docs.rs renders): MultiBucketBitmap and the indexed contingency /
projection kernels are gated behind the non-default `experimental` feature,
are not stable public API, and are excluded from semver guarantees; the stable
surface is the stateless dense Contingency / Projection API.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…e references)

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
…t/contingency-indexed

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@Fieldnote-Echo

Owner Author

/agentic_review

@qodo-code-review

qodo-code-review Bot commented Jun 15, 2026


Code review by qodo was updated up to the latest commit d62b411

…oning)

qodo's 'Missing MultiBucketBitmap non-default docs' finding asks the docs to
explicitly position MultiBucketBitmap as niche — not the default single-score
retrieval path — and name the alternatives that dominate. Extend the module
banner with a 'Positioning' note: MultiBucketBitmap is a research/analysis
substrate (never kernel-optimized for retrieval); for primary retrieval use
RankQuant (symmetric/asymmetric), Bitmap (top-bucket popcount), and the
two-stage candidate-gen -> rerank flow. The indexed contingency/projection
surface is for analyzing bucket-overlap structure, not primary retrieval.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@Fieldnote-Echo

Owner Author

Resolved in 766e3fa. The module-level rustdoc for MultiBucketBitmap now carries an explicit Positioning note (in addition to the existing Stability note):

Positioning — not a primary retrieval path. MultiBucketBitmap is a niche research/analysis substrate for the bilinear bucket-overlap decomposition; it is not the default single-score retrieval path and was never kernel-optimized for that role. For primary nearest-neighbour retrieval, use the headline paths instead — RankQuant (symmetric and asymmetric float-query scoring), Bitmap (top-bucket popcount(Q AND D) candidate scoring), and the two-stage candidate-generation → rerank flow. Reach for this indexed contingency / projection surface only to analyze bucket-overlap structure, never as a primary retrieval index.

This states (a) the surface is niche, (b) it is not the default single-score retrieval path, and (c) the alternatives that dominate for retrieval — directly addressing the requirement-gap. Doc build is clean under RUSTDOCFLAGS=-D warnings (the RankQuant/Bitmap intra-doc links resolve).

@Fieldnote-Echo

Owner Author

/agentic_review

@qodo-code-review

qodo-code-review Bot commented Jun 15, 2026


Code review by qodo was updated up to the latest commit 766e3fa

…ace)

The retrieval-positioning note belongs on the crate-root docs.rs front page that
every user sees, not only in the experimental-gated multi_bucket module (which a
caller who never enables `experimental` never reads — yet they are exactly the
audience choosing a retrieval path). Add the 'not a primary retrieval path; use
RankQuant / Bitmap / two-stage' note to the lib.rs crate docs, right after the
four headline substrate families. Uses a plain `MultiBucketBitmap` reference so
the default (no-experimental) doc build stays link-clean; verified under
RUSTDOCFLAGS=-D warnings both with and without the feature.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>
@Fieldnote-Echo

Owner Author

Correction / stronger fix in 0f7d470. The positioning note is now on the crate-root lib.rs docs — the docs.rs front page every user sees regardless of the experimental feature (the prior note lived only in the experimental-gated multi_bucket module, which a caller who never enables experimental never reads — yet they are exactly the audience choosing a retrieval path). Right after the four headline substrate families, the crate docs now state:

These four families are the retrieval surface. The experimental MultiBucketBitmap indexed contingency / projection API is a niche research/analysis substrate for the bilinear bucket-overlap decomposition — it is not a default single-score retrieval path and was never kernel-optimized for that role. For primary nearest-neighbour retrieval use RankQuant, Bitmap, or the two-stage candidate-generation → rerank flow instead.

Verified link-clean under RUSTDOCFLAGS=-D warnings for cargo doc both with and without --features experimental. The module-level note remains as defense-in-depth.

@Fieldnote-Echo

Owner Author

/agentic_review

@qodo-code-review

qodo-code-review Bot commented Jun 15, 2026


Code review by qodo was updated up to the latest commit 0f7d470

Base automatically changed from feat/contingency-dense to main June 15, 2026 00:32
@project-navi-bot project-navi-bot dismissed their stale review June 15, 2026 00:32

The base branch was changed.

Signed-off-by: Nelson Spence <nelson@projectnavi.ai>

# Conflicts:
#	src/lib.rs
@project-navi-bot project-navi-bot self-requested a review June 15, 2026 00:34
@project-navi-bot project-navi-bot merged commit 6271f7d into main Jun 15, 2026
38 checks passed
@project-navi-bot project-navi-bot deleted the feat/contingency-indexed branch June 15, 2026 00:41

2 participants