-
Notifications
You must be signed in to change notification settings - Fork 21
[SVS] Add multi-vector support to the Tiered SVS Index #699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some comments with explanation of changes.
dc7f3e9
to
1ff0032
Compare
276ac2a
to
6f2a5b5
Compare
4676dc5
to
25ec83b
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #699 +/- ##
=======================================
Coverage 96.84% 96.85%
=======================================
Files 122 122
Lines 7393 7439 +46
=======================================
+ Hits 7160 7205 +45
- Misses 233 234 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
25ec83b
to
e9a71ad
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I reviewed the code, still left to review the tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reviewed unit tests as well
439b0fa
to
fe6e952
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds multi‐vector support to the Tiered SVS Index by introducing a new template parameter (IsMulti) and modifying index parameter initializations and result‐merging logic accordingly. Key changes include:
- Extending test cases and index parameter structs to support multi‐vector mode.
- Updating factory and index construction functions to propagate the new “multi” flag.
- Adjusting diverse index operations (e.g. vector addition, deletion, merging) across SVS, HNSW, and Brute Force modules.
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
tests/unit/test_svs_tiered.cpp | Updated test templates and functions to use the new multi flag and added new tests for multi-value scenarios. |
src/VecSim/index_factories/tiered_factory.cpp | Modified BFParams creation to set multi based on passed parameters. |
src/VecSim/algorithms/svs/svs_tiered.h | Updated merging logic and added multi-specific behavior in batch computation and index update flows. |
src/VecSim/algorithms/hnsw/hnsw_tiered.h | Updated vector label retrieval method to use getVectorLabel. |
src/VecSim/algorithms/brute_force/*.h | Added overrides for getElementIds in both single and multi implementations. |
Comments suppressed due to low confidence (2)
src/VecSim/algorithms/svs/svs_tiered.h:272
- [nitpick] Consider adding a brief documentation comment for the 'isMultiValue' parameter to explain how it affects the merging of query results.
VecSimQueryReply *compute_current_batch(size_t n_res, bool isMultiValue) {
src/VecSim/algorithms/hnsw/hnsw_tiered.h:574
- [nitpick] Verify that the change to use 'getVectorLabel' (instead of 'getLabelByInternalId') is consistent with similar naming conventions across the codebase to avoid potential confusion.
this->frontendIndex->getVectorLabel(this->frontendIndex->indexSize() - 1);
fe6e952
to
b2f7260
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very good!
A few suggestions for better readability and tracking of this sophisticated mechanism that was added here
// No swap, just delete is marked by oldId == newId == deleted id | ||
this->swaps_journal.emplace_back(label, id, id); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we set label
to be SKIP_LABEL
directly from here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, missed this point.
It makes sense to write the SKIP_LABEL
value here.
Done.
tests/unit/test_svs_tiered.cpp
Outdated
|
||
// For single-value index, we expect to override the vector. | ||
// For multi-value index, we expect to add a new vector. | ||
const int update_count = is_multi ? 1 : 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice and thorough test!
For better readability, and since edge case logic is quite different here between single and multi, consider splitting it into 2 tests - one for single and one for multi.
Also, for readability, consider validating the journal state after each operation here by maintaining an expected journal, applying the expected changes to it after each operation, and validating it against the actual journal. This may require declaring this test as a friend of type TieredSVSIndex
(see how we do it in HNSW)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you.
Now there are 2 dedicated tests: testSwapJournalSingle
and testSwapJournalMulti
if (ft_ret == 0) { // Vector was overriden - add 'skiping' swap to the journal. | ||
for (auto id : this->frontendIndex->getElementIds(label)) { | ||
this->swaps_journal.emplace_back(SKIP_LABEL, id, id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is possible only for an index of type single, right? Consider documenting/asserting that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an assert.
4b65a5d
to
8c67746
Compare
Describe the changes in the pull request
This PR adds multi-vector support to the Tiered SVS Index.
NOTE: The pool request is published as 'Draft' because it is based on #690 and cannot be simply merged yet
Which issues this PR fixes
Main objects this PR modified
Mark if applicable