Docs: correct API-reference drift, stale comments, and changelog #102
Merged
A doc audit against the current source surfaced several drifts. All
docs-only; no code behavior change.
README.md:
- Python persistence example used the non-existent .tq extension → .tv.
- Rust snippets called TurboQuantIndex::new / IdMapIndex::new (both
return Result) and used the value directly — wouldn't compile. Added
.unwrap() to mirror the crate's own doctest examples.
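A minimal sketch of why the old Rust snippets failed to compile, using a hypothetical stand-in type. `DemoIndex`, `DemoError`, and the constructor's parameters and validation are assumptions made for illustration and are not the crate's real `TurboQuantIndex` API; the only property taken from the note above is that the constructor returns a `Result`.

```rust
// Hypothetical stand-in, NOT the crate's real TurboQuantIndex.
#[derive(Debug)]
struct DemoIndex {
    dim: usize,
    bit_width: u8,
}

#[derive(Debug, PartialEq)]
enum DemoError {
    UnsupportedBitWidth(u8),
}

impl DemoIndex {
    // Returns a Result, like the constructors mentioned above.
    // The bit-width check is an assumed validation rule.
    fn new(dim: usize, bit_width: u8) -> Result<Self, DemoError> {
        match bit_width {
            2 | 3 | 4 => Ok(DemoIndex { dim, bit_width }),
            other => Err(DemoError::UnsupportedBitWidth(other)),
        }
    }
}

fn main() {
    // Treating the Result as the index itself is a type error:
    // let index: DemoIndex = DemoIndex::new(128, 4);

    // Unwrapping is acceptable in short examples and doctests.
    let index = DemoIndex::new(128, 4).unwrap();
    println!("dim={} bit_width={}", index.dim, index.bit_width);

    // Application code would usually handle the error instead.
    match DemoIndex::new(128, 5) {
        Ok(_) => println!("unexpected success"),
        Err(e) => println!("rejected: {:?}", e),
    }
}
```

In application code, propagating the error with `?` is usually preferable to `.unwrap()`, which panics on failure.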
docs/api.md:
- bit_width documented as {2, 4}; the core accepts {2, 3, 4}.
- File-format diagrams were stale (described the pre-magic v1 layout):
.tv now carries a TVPI magic + version 3 + per-vector scales + a TQ+
trailer; .tvim is version 3 (was documented as 1, which the loader now
rejects). Updated both diagrams and the surrounding notes.
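To illustrate what a magic-plus-version check of this kind looks like, here is a rough sketch. Only the `TVPI` magic and version 3 come from the note above; the 4-byte little-endian version field, the 8-byte header length, and the names `check_header` and `HeaderError` are assumptions for illustration and may not match the real loader or the layout in docs/api.md.

```rust
// Taken from the note above: magic "TVPI", version 3.
const MAGIC: &[u8; 4] = b"TVPI";
const SUPPORTED_VERSION: u32 = 3;

#[derive(Debug, PartialEq)]
enum HeaderError {
    TooShort,
    BadMagic,
    UnsupportedVersion(u32),
}

// ASSUMPTION: the version is a little-endian u32 directly after the magic.
fn check_header(bytes: &[u8]) -> Result<u32, HeaderError> {
    if bytes.len() < 8 {
        return Err(HeaderError::TooShort);
    }
    if bytes[..4] != MAGIC[..] {
        return Err(HeaderError::BadMagic);
    }
    let version = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    if version != SUPPORTED_VERSION {
        // Older layouts, such as a file written as version 1, are rejected.
        return Err(HeaderError::UnsupportedVersion(version));
    }
    Ok(version)
}

fn main() {
    let mut current = MAGIC.to_vec();
    current.extend_from_slice(&3u32.to_le_bytes());
    println!("{:?}", check_header(&current));

    let mut old = MAGIC.to_vec();
    old.extend_from_slice(&1u32.to_le_bytes());
    println!("{:?}", check_header(&old));
}
```

Checking the magic before the version lets a loader tell "not this file type" apart from "this file type, unsupported version".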
docs/integrations/llama_index.md:
- Filter docs were inverted: NOT, ANY, ALL, and TEXT_MATCH_INSENSITIVE
are all implemented (doc claimed they raise NotImplementedError), and
TEXT_MATCH is case-sensitive (doc said case-insensitive). Corrected the
operator/condition lists and the semantics note.
- Documented that an intra-batch duplicate node_id raises (vs the
langchain/haystack keep-last behavior).
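The two duplicate-id policies contrasted above can be sketched as follows. This is a language-neutral illustration written in Rust, not the integrations' actual Python code; `validate_unique`, `keep_last`, and `DuplicateId` are hypothetical names.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq)]
struct DuplicateId(String);

// Strict policy: reject the whole batch as soon as an id repeats.
fn validate_unique(ids: &[&str]) -> Result<(), DuplicateId> {
    let mut seen = HashSet::new();
    for id in ids {
        // insert returns false when the id was already present
        if !seen.insert(*id) {
            return Err(DuplicateId((*id).to_string()));
        }
    }
    Ok(())
}

// Keep-last policy: a later entry silently replaces an earlier one
// that has the same id.
fn keep_last<'a>(batch: &[(&'a str, &'a str)]) -> HashMap<&'a str, &'a str> {
    let mut out = HashMap::new();
    for (id, text) in batch {
        out.insert(*id, *text);
    }
    out
}

fn main() {
    println!("{:?}", validate_unique(&["a", "b", "a"]));
    let kept = keep_last(&[("a", "first"), ("b", "only"), ("a", "second")]);
    println!("{} entries, a -> {}", kept.len(), kept["a"]);
}
```

Either policy avoids storing two vectors under one id; the difference is whether the caller is told about the duplicate.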
docs/integrations/agno.md:
- 'Every public method has an async counterpart' was false; listed the
methods that actually have async variants and noted the sync-only ones.
Reported by @gabrielemidulla (#101); remaining items found in a full
doc sweep.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Follow-up to the API-reference drift pass, with the same docs-only nature, now in source comments:
- search.rs module doc claimed only 'AVX2 on x86'; it actually has an AVX-512BW kernel selected at runtime with an AVX2 fallback (plus a scalar fallback for other targets). Updated the module doc.
- search.rs AVX-512BW comment described a shared 'avx2_block_epilogue' helper that is never called; the real path flushes per batch via avx2_batch_flush_to_fa and finishes with avx2_post_flush_heap_update.
- llama_index.py from_persist_dir docstring said the side-car uses '.pkl'; it is actually '.nodes.json'.
- dump_state.rs example referenced benchmarks/rabitq_poc/poc_apples_to_apples.py, which was removed in #97; reworded to describe the example generically.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
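The runtime selection order described for search.rs (AVX-512BW, then AVX2, then scalar) could be sketched as below. `Kernel` and `select_kernel` are hypothetical names and not items from search.rs; the only real API used is the standard library's `is_x86_feature_detected!` macro.

```rust
// Hypothetical illustration of the three tiers; not a type from search.rs.
#[allow(dead_code)] // on non-x86 targets only Scalar is ever constructed
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kernel {
    Avx512Bw,
    Avx2,
    Scalar,
}

// Pick the widest kernel the running CPU supports.
fn select_kernel() -> Kernel {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        // Runtime CPU feature checks, widest instruction set first.
        if std::is_x86_feature_detected!("avx512bw") {
            return Kernel::Avx512Bw;
        }
        if std::is_x86_feature_detected!("avx2") {
            return Kernel::Avx2;
        }
    }
    // Other targets, or x86 CPUs without AVX2.
    Kernel::Scalar
}

fn main() {
    println!("selected kernel: {:?}", select_kernel());
}
```

Because the check happens at runtime, one binary can run on CPUs with and without AVX-512BW.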
…pare link
Add an [Unreleased] entry documenting the two merged integration fixes (intra-batch duplicate-id orphaning #90, upsert delete-before-validate #89), and repoint the stale [Unreleased] compare link from py-v0.4.2 to the current v0.8.0 tag.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Addresses #101 (and more found in a full doc sweep). Docs-only / comments-only — no code behavior change.
User-facing docs
README.md
- Python persistence example used the non-existent .tq extension → .tv.
- Rust snippets called TurboQuantIndex::new / IdMapIndex::new (both return Result) and used the value directly, so they wouldn't compile. Added .unwrap() to match the crate's own doctests.
docs/api.md
- bit_width documented as {2, 4}; the core accepts {2, 3, 4}.
- File-format diagrams were stale: .tv now carries a TVPI magic + version 3 + per-vector scales + a TQ+ trailer; .tvim is version 3 (was documented as 1, which the loader now rejects). Updated both diagrams and the surrounding notes.
docs/integrations/llama_index.md
- Filter docs were inverted: NOT, ANY, ALL, and TEXT_MATCH_INSENSITIVE are all implemented (doc claimed they raise NotImplementedError), and TEXT_MATCH is case-sensitive (doc said case-insensitive). Corrected the operator/condition lists and the semantics note.
- Documented that an intra-batch duplicate node_id raises (vs the langchain/haystack keep-last behavior).
docs/integrations/agno.md
- 'Every public method has an async counterpart' was false; listed the methods that actually have async variants and noted the sync-only ones.
Code comments
- search.rs module doc claimed only "AVX2 on x86"; it actually has an AVX-512BW kernel selected at runtime with an AVX2 fallback (+ scalar fallback). Also corrected an AVX-512 comment that named a never-called avx2_block_epilogue helper.
- llama_index.py from_persist_dir docstring said the side-car uses .pkl; it is .nodes.json.
- dump_state.rs example referenced benchmarks/rabitq_poc/poc_apples_to_apples.py, removed in #97 (Remove benchmarks/rabitq_poc research scratch); reworded generically.
CHANGELOG
- Added an [Unreleased] entry for the merged integration fixes: #90 (bug: intra-batch duplicate IDs create orphaned vectors (langchain, agno)) and #89 (bug: upsert deletes old data before validating new data, causing data loss on validation failure).
- Repointed the stale [Unreleased] compare link (py-v0.4.2 → v0.8.0).
Validation
- cargo test -p turbovec --doc passes.
- cargo build -p turbovec --examples compiles clean.
Reported by @gabrielemidulla (#101); remaining items surfaced in a two-pass doc/comment audit.
🤖 Generated with Claude Code