api: expose two-stage candidate-count policy helper #189

@Fieldnote-Echo

Context

Downstream users need a default way to choose the SignBitmap / Bitmap candidate budget before exact RankQuant rerank. If every integration hardcodes max(min_candidates, k * multiplier) differently, OrdVec’s two-stage behavior becomes fragmented and hard to benchmark.

This is related to #130, but distinct: #130 estimates allocation/scan cost. This issue is about a blessed retrieval-policy helper for candidate counts, especially for the SignBitmap → RankQuant default path used by OrdinalDB.

Related: #172 evaluates sign scaling and soft candidate-gate strategies; the helper can start conservative and evolve with evidence.

Evidence

  • SignBitmap::top_m_candidates takes an explicit m and clamps only to len: src/sign_bitmap.rs:133-159.
  • Bitmap::top_m_candidates takes an explicit m and clamps only to len: src/bitmap.rs:192-237.
  • RankQuant::search_asymmetric_subset clamps final k to the candidate list length: src/quant.rs:569-572.
  • Persisted-format docs say candidate-count selection is tracked outside the index bytes: docs/PERSISTED_FORMAT.md:89.

Proposed Shape

Sketch:

pub struct TwoStageCandidatePolicy {
    pub min_candidates: usize,
    pub k_multiplier: usize,
    pub max_candidates: Option<usize>,
}

impl Default for TwoStageCandidatePolicy { /* conservative default */ }

impl TwoStageCandidatePolicy {
    pub fn candidate_count(&self, k: usize, n_vectors: usize) -> Result<usize, CandidatePolicyError>;
}

Equivalent naming is fine. A small free function is also acceptable if a struct is overkill.
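
To make the sketch above concrete, here is one possible implementation, offered only as a starting point for discussion. Several things in it are assumptions not stated in this issue: the `CandidatePolicyError` variants (`ZeroK`, `Overflow`) are hypothetical, the default values (64, 8, no cap) are unbenchmarked placeholders, and the order of operations (scale by `k_multiplier`, raise to `min_candidates`, apply `max_candidates`, then clamp to `n_vectors`) is just one reasonable reading of `max(min_candidates, k * multiplier)`.

```rust
/// Hypothetical error type: the issue names `CandidatePolicyError`
/// but does not define its variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CandidatePolicyError {
    /// `k == 0` was requested (assumed to be an error here).
    ZeroK,
    /// `k * k_multiplier` does not fit in `usize`.
    Overflow,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TwoStageCandidatePolicy {
    pub min_candidates: usize,
    pub k_multiplier: usize,
    pub max_candidates: Option<usize>,
}

impl Default for TwoStageCandidatePolicy {
    fn default() -> Self {
        // Placeholder values for illustration only; not benchmarked.
        Self {
            min_candidates: 64,
            k_multiplier: 8,
            max_candidates: None,
        }
    }
}

impl TwoStageCandidatePolicy {
    /// Returns the first-stage candidate budget `m` for a final top-`k`
    /// query over an index holding `n_vectors` vectors.
    pub fn candidate_count(
        &self,
        k: usize,
        n_vectors: usize,
    ) -> Result<usize, CandidatePolicyError> {
        if k == 0 {
            return Err(CandidatePolicyError::ZeroK);
        }
        // Scale with k, surfacing overflow instead of wrapping.
        let scaled = k
            .checked_mul(self.k_multiplier)
            .ok_or(CandidatePolicyError::Overflow)?;
        // max(min_candidates, k * multiplier)
        let mut m = scaled.max(self.min_candidates);
        // Optional hard cap on the candidate budget.
        if let Some(cap) = self.max_candidates {
            m = m.min(cap);
        }
        // Never ask for more candidates than the index holds.
        Ok(m.min(n_vectors))
    }
}
```

Under these assumptions, a policy with `min_candidates: 100`, `k_multiplier: 10`, and `max_candidates: Some(500)` would return 200 for `k = 20` over 10,000 vectors, and 50 for the same `k` over a 50-vector index. Whether the cap should be allowed to fall below `k`, and whether `k == 0` should be an error rather than `Ok(0)`, are open design questions this sketch does not settle.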

Acceptance Criteria

Non-goals

  • No automatic quality guarantee or benchmark claim.
  • No query planner.
  • No hidden dynamic tuning unless benchmarked and documented.

Metadata

Labels: enhancement (New feature or request)
