Remove benchmarks/rabitq_poc research scratch #97

Merged: RyanCodrai merged 1 commit into main from remove-rabitq-poc on Jun 9, 2026

@RyanCodrai (Owner)

What

Removes benchmarks/rabitq_poc/ — the numpy proof-of-concept scratch space used to validate two features before they were implemented in the Rust core.

Why

The features this directory validated have shipped and are covered by real tests:

  • RaBitQ-style per-vector length-renormalization (README "How it works" step 5) → turbovec/tests/distortion.rs, tqplus_calibration.rs
  • Filtered-search block-skip early-exit → turbovec/tests/filtering.rs

The POC scripts are a numpy reimplementation of encode.rs, written as a one-time check, and they have since drifted from the source. The result snapshots (*_baseline/*_proto JSON, comparison PNGs) have no ongoing consumer:

  • Nothing outside the directory imports or references it
  • benchmarks/create_diagrams.py reads benchmarks/results/, not these snapshots
  • CI does not touch it
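
The first point can be double-checked with a quick scan of the working tree. The sketch below is illustrative only: `find_references` is a hypothetical helper, not something in this repository, and it does a plain substring search over text files rather than resolving real imports.

```python
from pathlib import Path


def find_references(root: Path, needle: str, excluded: Path) -> list[tuple[Path, int]]:
    """Return (file, line_number) pairs where `needle` appears outside `excluded`.

    Hypothetical helper for illustration: a substring scan, not import resolution.
    """
    hits = []
    excluded = excluded.resolve()
    for path in sorted(root.rglob("*")):
        # Only regular files are scanned; git metadata is ignored.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        resolved = path.resolve()
        # Skip anything inside the directory that is being removed.
        if resolved == excluded or excluded in resolved.parents:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file (e.g. a PNG)
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path, number))
    return hits
```

Calling `find_references(Path("."), "rabitq_poc", Path("benchmarks/rabitq_poc"))` from the repository root would list any remaining mention, including ones in comments or docs; an empty list is consistent with the claim above.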

58 files / 1,917 lines removed. No SHAs rewritten — recoverable from history (introduced in #35) if ever needed.

🤖 Generated with Claude Code

The RaBitQ length-renormalization and filtered-search block-skip features
this directory validated have shipped into the Rust core and are covered by
tests (distortion.rs, tqplus_calibration.rs, filtering.rs). The numpy POC
scripts and prototype-vs-baseline result snapshots have no ongoing consumer;
nothing outside the directory references them. Recoverable from history
(introduced in #35) if ever needed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
RyanCodrai merged commit 3ce56b7 into main on Jun 9, 2026
6 checks passed
RyanCodrai deleted the remove-rabitq-poc branch on June 9, 2026 12:49
RyanCodrai added a commit that referenced this pull request on Jun 9, 2026:
* docs: correct API-reference drift across README and docs/

A doc audit against the current source surfaced several drifts. All
docs-only; no code behavior change.

README.md:
- Python persistence example used the non-existent .tq extension → .tv.
- Rust snippets called TurboQuantIndex::new / IdMapIndex::new (both
  return Result) and used the value directly — wouldn't compile. Added
  .unwrap() to mirror the crate's own doctest examples.

docs/api.md:
- bit_width documented as {2, 4}; the core accepts {2, 3, 4}.
- File-format diagrams were stale (described the pre-magic v1 layout):
  .tv now carries a TVPI magic + version 3 + per-vector scales + a TQ+
  trailer; .tvim is version 3 (was documented as 1, which the loader now
  rejects). Updated both diagrams and the surrounding notes.

docs/integrations/llama_index.md:
- Filter docs were inverted: NOT, ANY, ALL, and TEXT_MATCH_INSENSITIVE
  are all implemented (doc claimed they raise NotImplementedError), and
  TEXT_MATCH is case-sensitive (doc said case-insensitive). Corrected the
  operator/condition lists and the semantics note.
- Documented that an intra-batch duplicate node_id raises (vs the
  langchain/haystack keep-last behavior).
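
To make the corrected semantics above concrete, here is a minimal Python sketch. None of these helpers come from the integration itself: the function names, the reading of TEXT_MATCH as substring containment, and the use of ValueError for the duplicate case are all assumptions made for illustration.

```python
def text_match(value: str, needle: str) -> bool:
    """Case-sensitive containment, matching the corrected TEXT_MATCH note."""
    return needle in value


def text_match_insensitive(value: str, needle: str) -> bool:
    """Case-insensitive containment, as TEXT_MATCH_INSENSITIVE is described."""
    return needle.casefold() in value.casefold()


def check_unique(node_ids: list[str]) -> list[str]:
    """Reject a batch containing a repeated id (the llama_index behavior).

    The exception type is an assumption; the source only says it raises.
    """
    seen = set()
    for node_id in node_ids:
        if node_id in seen:
            raise ValueError(f"duplicate node_id in batch: {node_id!r}")
        seen.add(node_id)
    return list(node_ids)


def keep_last(items: list[tuple[str, str]]) -> dict[str, str]:
    """Keep the last payload per id (the langchain/haystack behavior)."""
    return dict(items)  # later pairs overwrite earlier ones with the same key
```

Under these assumptions, `text_match("TurboVec", "turbo")` is false while the insensitive variant is true, and a batch with a repeated id either raises or silently collapses to its final entry depending on the policy.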

docs/integrations/agno.md:
- 'Every public method has an async counterpart' was false; listed the
  methods that actually have async variants and noted the sync-only ones.

Reported by @gabrielemidulla (#101); remaining items found in a full
doc sweep.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: fix stale code-comments (SIMD, side-car ext, dead example ref)

Follow-up to the API-reference drift pass — same docs-only nature, now in
source comments:

- search.rs module doc claimed only 'AVX2 on x86'; it actually has an
  AVX-512BW kernel selected at runtime with an AVX2 fallback (plus a
  scalar fallback for other targets). Updated the module doc.
- search.rs AVX-512BW comment described a shared 'avx2_block_epilogue'
  helper that is never called; the real path flushes per batch via
  avx2_batch_flush_to_fa and finishes with avx2_post_flush_heap_update.
- llama_index.py from_persist_dir docstring said the side-car uses '.pkl';
  it is actually '.nodes.json'.
- dump_state.rs example referenced benchmarks/rabitq_poc/poc_apples_to_apples.py,
  which was removed in #97; reworded to describe the example generically.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(changelog): record #90/#89 integration fixes; fix Unreleased compare link

Add an [Unreleased] entry documenting the two merged integration fixes
(intra-batch duplicate-id orphaning #90, upsert delete-before-validate
#89), and repoint the stale [Unreleased] compare link from py-v0.4.2 to
the current v0.8.0 tag.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>