Remove benchmarks/rabitq_poc research scratch #97
Merged
The RaBitQ length-renormalization and filtered-search block-skip features this directory validated have shipped into the Rust core and are covered by tests (distortion.rs, tqplus_calibration.rs, filtering.rs). The numpy POC scripts and prototype-vs-baseline result snapshots have no ongoing consumer; nothing outside the directory references them. Recoverable from history (introduced in #35) if ever needed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
RyanCodrai added a commit that referenced this pull request Jun 9, 2026
* docs: correct API-reference drift across README and docs/
A doc audit against the current source surfaced several drifts. All
docs-only; no code behavior change.
README.md:
- Python persistence example used the non-existent .tq extension → .tv.
- Rust snippets called TurboQuantIndex::new / IdMapIndex::new (both
return Result) and used the value directly — wouldn't compile. Added
.unwrap() to mirror the crate's own doctest examples.
docs/api.md:
- bit_width documented as {2, 4}; the core accepts {2, 3, 4}.
- File-format diagrams were stale (described the pre-magic v1 layout):
.tv now carries a TVPI magic + version 3 + per-vector scales + a TQ+
trailer; .tvim is version 3 (was documented as 1, which the loader now
rejects). Updated both diagrams and the surrounding notes.
docs/integrations/llama_index.md:
- Filter docs were inverted: NOT, ANY, ALL, and TEXT_MATCH_INSENSITIVE
are all implemented (doc claimed they raise NotImplementedError), and
TEXT_MATCH is case-sensitive (doc said case-insensitive). Corrected the
operator/condition lists and the semantics note.
- Documented that an intra-batch duplicate node_id raises (vs the
langchain/haystack keep-last behavior).
docs/integrations/agno.md:
- 'Every public method has an async counterpart' was false; listed the
methods that actually have async variants and noted the sync-only ones.
Reported by @gabrielemidulla (#101); remaining items found in a full
doc sweep.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
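The llama_index notes in the commit message above distinguish case-sensitive TEXT_MATCH from TEXT_MATCH_INSENSITIVE, and a raise-on-duplicate policy from the keep-last policy used by the langchain/haystack integrations. The following is a minimal sketch of those semantics only, not the integration's code: the function names are hypothetical, substring matching is an assumption, and ValueError stands in for whatever exception the integration actually raises.

```python
from typing import Iterable, List


def text_match(value: str, needle: str, *, insensitive: bool = False) -> bool:
    """Substring match (assumed semantics); case-sensitive unless insensitive=True."""
    if insensitive:
        return needle.casefold() in value.casefold()
    return needle in value


def dedupe_keep_last(ids: Iterable[str]) -> List[str]:
    """Keep-last policy: a repeated id survives only at its final position."""
    items = list(ids)
    last_index = {node_id: i for i, node_id in enumerate(items)}
    return [node_id for i, node_id in enumerate(items) if last_index[node_id] == i]


def reject_duplicates(ids: Iterable[str]) -> List[str]:
    """Raise policy: any repeated id within one batch is an error."""
    items = list(ids)
    seen = set()
    for node_id in items:
        if node_id in seen:
            # ValueError is a placeholder exception type for this sketch.
            raise ValueError(f"duplicate id in batch: {node_id!r}")
        seen.add(node_id)
    return items
```

Under the raise policy the whole batch is rejected before anything is written, whereas keep-last silently drops the earlier entries; which one applies is the behavioral difference the doc fix records.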
* docs: fix stale code-comments (SIMD, side-car ext, dead example ref)
Follow-up to the API-reference drift pass — same docs-only nature, now in
source comments:
- search.rs module doc claimed only 'AVX2 on x86'; it actually has an
AVX-512BW kernel selected at runtime with an AVX2 fallback (plus a
scalar fallback for other targets). Updated the module doc.
- search.rs AVX-512BW comment described a shared 'avx2_block_epilogue'
helper that is never called; the real path flushes per batch via
avx2_batch_flush_to_fa and finishes with avx2_post_flush_heap_update.
- llama_index.py from_persist_dir docstring said the side-car uses '.pkl';
it is actually '.nodes.json'.
- dump_state.rs example referenced benchmarks/rabitq_poc/poc_apples_to_apples.py,
which was removed in #97; reworded to describe the example generically.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(changelog): record #90/#89 integration fixes; fix Unreleased compare link
Add an [Unreleased] entry documenting the two merged integration fixes
(intra-batch duplicate-id orphaning #90, upsert delete-before-validate
#89), and repoint the stale [Unreleased] compare link from py-v0.4.2 to
the current v0.8.0 tag.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What
Removes benchmarks/rabitq_poc/, the numpy proof-of-concept scratch space used to validate two features before they were implemented in the Rust core.
Why
The features this directory validated have shipped and are covered by real tests:
- turbovec/tests/distortion.rs, tqplus_calibration.rs
- turbovec/tests/filtering.rs
The POC scripts are a numpy reimplementation of encode.rs that was a one-time check and has since drifted from the source. The result snapshots (*_baseline / *_proto JSON, comparison PNGs) have no ongoing consumer: benchmarks/create_diagrams.py reads benchmarks/results/, not these.
58 files / 1,917 lines removed. No SHAs rewritten; recoverable from history (introduced in #35) if ever needed.
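A claim that nothing outside a directory references it can be checked mechanically before deleting it. The follow-up commit on this PR later found one such reference, in a dump_state.rs example comment. The sketch below is one way to run that check, not the procedure used here; find_references and its parameters are hypothetical names, and it only catches literal path strings in UTF-8 text files.

```python
from pathlib import Path
from typing import List, Tuple


def find_references(repo_root, needle: str, exclude_dir: str) -> List[Tuple[str, int]]:
    """Return (relative_path, line_number) for each line containing `needle`,
    skipping files under `exclude_dir` and the .git directory."""
    root = Path(repo_root)
    excluded_parts = Path(exclude_dir).parts
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Skip the directory being removed and git internals.
        if rel.parts[: len(excluded_parts)] == excluded_parts or ".git" in rel.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((rel.as_posix(), lineno))
    return hits
```

Calling find_references(".", "benchmarks/rabitq_poc", "benchmarks/rabitq_poc") from the repository root would list any remaining mentions; an empty list supports the claim, though it cannot see paths that are assembled at runtime.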
🤖 Generated with Claude Code