rfsaliev
Collaborator

Describe the changes in the pull request

This PR adds multi-vector support to the Tiered SVS Index.

NOTE: The pull request is published as 'Draft' because it is based on #690 and cannot be merged yet.

Which issues this PR fixes

  1. #...
  2. MOD...

Main objects this PR modified

  1. ...
  2. ...

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

Collaborator Author

@rfsaliev rfsaliev left a comment

Added some comments explaining the changes.

@rfsaliev rfsaliev marked this pull request as ready for review June 24, 2025 11:09
@rfsaliev rfsaliev force-pushed the rfsaliev/svs-multi-tiered branch from dc7f3e9 to 1ff0032 Compare June 24, 2025 11:54
@rfsaliev rfsaliev force-pushed the rfsaliev/svs-multi branch from 276ac2a to 6f2a5b5 Compare June 24, 2025 12:38
@rfsaliev rfsaliev force-pushed the rfsaliev/svs-multi-tiered branch 2 times, most recently from 4676dc5 to 25ec83b Compare June 24, 2025 15:36

codecov bot commented Jun 24, 2025

Codecov Report

Attention: Patch coverage is 96.39640% with 4 lines in your changes missing coverage. Please review.

Project coverage is 96.85%. Comparing base (63ba350) to head (8c67746).
Report is 4 commits behind head on main.

Files with missing lines                                | Patch % | Lines
.../VecSim/algorithms/brute_force/brute_force_multi.h   | 66.66%  | 2 Missing ⚠️
...VecSim/algorithms/brute_force/brute_force_single.h   | 75.00%  | 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #699   +/-   ##
=======================================
  Coverage   96.84%   96.85%           
=======================================
  Files         122      122           
  Lines        7393     7439   +46     
=======================================
+ Hits         7160     7205   +45     
- Misses        233      234    +1     

Base automatically changed from rfsaliev/svs-multi to main June 25, 2025 10:29
@rfsaliev rfsaliev force-pushed the rfsaliev/svs-multi-tiered branch from 25ec83b to e9a71ad Compare June 25, 2025 16:14
Collaborator

@alonre24 alonre24 left a comment

Nice! I reviewed the code; the tests are still left to review.

Collaborator

@alonre24 alonre24 left a comment

Reviewed the unit tests as well.

@rfsaliev rfsaliev force-pushed the rfsaliev/svs-multi-tiered branch 2 times, most recently from 439b0fa to fe6e952 Compare June 26, 2025 17:08
@rfsaliev rfsaliev requested review from Copilot and alonre24 June 26, 2025 17:14
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds multi‐vector support to the Tiered SVS Index by introducing a new template parameter (IsMulti) and modifying index parameter initializations and result‐merging logic accordingly. Key changes include:

  • Extending test cases and index parameter structs to support multi‐vector mode.
  • Updating factory and index construction functions to propagate the new “multi” flag.
  • Adjusting diverse index operations (e.g. vector addition, deletion, merging) across SVS, HNSW, and Brute Force modules.
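A compile-time flag such as IsMulti typically selects between "one vector per label" and "many vectors per label" behaviour. The following is only a minimal sketch of that idea, not the VecSim implementation: ToyLabelIndex, addVector, and countFor are hypothetical names, and the return convention (1 for a newly added vector, 0 for an override) is an assumption modelled on the update_count logic discussed in this review.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical toy index (not the VecSim API). IsMulti picks the label policy
// at compile time.
template <bool IsMulti>
class ToyLabelIndex {
public:
    // Assumed convention: returns 1 when a new vector is stored,
    // 0 when an existing vector is overridden.
    int addVector(std::size_t label, float value) {
        auto &slot = vectors_[label];
        if constexpr (IsMulti) {
            slot.push_back(value); // multi: a label may own many vectors
            return 1;
        } else {
            const bool existed = !slot.empty();
            slot.assign(1, value); // single: the label owns exactly one vector
            assert(slot.size() == 1);
            return existed ? 0 : 1;
        }
    }

    // Number of vectors currently stored under the label.
    std::size_t countFor(std::size_t label) const {
        auto it = vectors_.find(label);
        return it == vectors_.end() ? 0 : it->second.size();
    }

private:
    std::unordered_map<std::size_t, std::vector<float>> vectors_;
};
```

With this sketch, adding the same label twice leaves one vector in ToyLabelIndex<false> and two in ToyLabelIndex<true>.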

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

File | Description
tests/unit/test_svs_tiered.cpp | Updated test templates and functions to use the new multi flag and added new tests for multi-value scenarios.
src/VecSim/index_factories/tiered_factory.cpp | Modified BFParams creation to set multi based on passed parameters.
src/VecSim/algorithms/svs/svs_tiered.h | Updated merging logic and added multi-specific behavior in batch computation and index update flows.
src/VecSim/algorithms/hnsw/hnsw_tiered.h | Updated vector label retrieval method to use getVectorLabel.
src/VecSim/algorithms/brute_force/*.h | Added overrides for getElementIds in both single and multi implementations.
Comments suppressed due to low confidence (2)

src/VecSim/algorithms/svs/svs_tiered.h:272

  • [nitpick] Consider adding a brief documentation comment for the 'isMultiValue' parameter to explain how it affects the merging of query results.
        VecSimQueryReply *compute_current_batch(size_t n_res, bool isMultiValue) {

src/VecSim/algorithms/hnsw/hnsw_tiered.h:574

  • [nitpick] Verify that the change to use 'getVectorLabel' (instead of 'getLabelByInternalId') is consistent with similar naming conventions across the codebase to avoid potential confusion.
            this->frontendIndex->getVectorLabel(this->frontendIndex->indexSize() - 1);
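One plausible reading of how a flag like isMultiValue could affect result merging: in a multi-value index the same label may be returned by both tiers, so the merge has to drop repeated labels. The sketch below illustrates that idea only. Result, mergeResults, and the "lower score is closer" convention are assumptions for illustration and are not taken from svs_tiered.h.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

struct Result {
    std::size_t label;
    double score; // assumption: lower score means closer
};

// Hypothetical helper: merges two lists that are already sorted by ascending
// score and keeps at most n_res results. When isMultiValue is true, only the
// best-scoring result per label is kept.
std::vector<Result> mergeResults(const std::vector<Result> &frontend,
                                 const std::vector<Result> &backend,
                                 std::size_t n_res, bool isMultiValue) {
    const auto byScore = [](const Result &x, const Result &y) { return x.score < y.score; };
    assert(std::is_sorted(frontend.begin(), frontend.end(), byScore));
    assert(std::is_sorted(backend.begin(), backend.end(), byScore));

    std::vector<Result> merged;
    std::unordered_set<std::size_t> seen;
    std::size_t i = 0, j = 0;
    while (merged.size() < n_res && (i < frontend.size() || j < backend.size())) {
        // Take from the frontend when the backend is exhausted or its head is worse.
        const bool takeFront =
            j == backend.size() ||
            (i < frontend.size() && frontend[i].score <= backend[j].score);
        const Result &next = takeFront ? frontend[i++] : backend[j++];
        if (isMultiValue && !seen.insert(next.label).second) {
            continue; // label already reported with a better score
        }
        merged.push_back(next);
    }
    return merged;
}
```

Keeping the flag as a runtime argument mirrors the signature quoted above; a template parameter would work equally well in this sketch.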

Collaborator

@alonre24 alonre24 left a comment

Looks very good!
A few suggestions for better readability and easier tracking of the sophisticated mechanism that was added here.

Comment on lines 786 to 795
// No swap, just delete is marked by oldId == newId == deleted id
this->swaps_journal.emplace_back(label, id, id);
}
Collaborator

Can we set label to be SKIP_LABEL directly from here?

Collaborator Author

Thanks, missed this point.
It makes sense to write the SKIP_LABEL value here.
Done.
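For readers following the thread, the journal convention being discussed can be sketched roughly as follows. This is an illustration under assumptions: SwapRecord, recordSwap, recordDeleteWithoutSwap, and isSkip are hypothetical names, and the sentinel value chosen for SKIP_LABEL here is a guess, not the value used in the PR. Only the (label, oldId, newId) shape and the "oldId == newId == deleted id" marker come from the quoted snippet.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Assumption: an illustrative sentinel; the real SKIP_LABEL is defined in the PR.
constexpr std::size_t SKIP_LABEL = std::numeric_limits<std::size_t>::max();

// One journal entry, in the same (label, oldId, newId) order as the quoted code.
struct SwapRecord {
    std::size_t label;
    std::size_t oldId;
    std::size_t newId;
};

// A delete that needs no swap: oldId == newId == deleted id, and the label is
// written as SKIP_LABEL directly, as suggested in the review.
void recordDeleteWithoutSwap(std::vector<SwapRecord> &journal, std::size_t deletedId) {
    journal.push_back({SKIP_LABEL, deletedId, deletedId});
}

// Hypothetical counterpart for an actual swap between two distinct ids.
void recordSwap(std::vector<SwapRecord> &journal, std::size_t label,
                std::size_t oldId, std::size_t newId) {
    assert(oldId != newId);
    journal.push_back({label, oldId, newId});
}

// True for entries that a consumer of the journal should skip.
bool isSkip(const SwapRecord &r) {
    return r.label == SKIP_LABEL && r.oldId == r.newId;
}
```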


// For single-value index, we expect to override the vector.
// For multi-value index, we expect to add a new vector.
const int update_count = is_multi ? 1 : 0;
Collaborator

Very nice and thorough test!
For better readability, and since edge case logic is quite different here between single and multi, consider splitting it into 2 tests - one for single and one for multi.
Also, for readability, consider validating the journal state after each operation here by maintaining an expected journal, applying the expected changes to it after each operation, and validating it against the actual journal. This may require declaring this test as a friend of type TieredSVSIndex (see how we do it in HNSW)

Collaborator Author

Thank you.
Now there are 2 dedicated tests: testSwapJournalSingle and testSwapJournalMulti.
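The "expected journal" suggestion from the review could look roughly like the sketch below: each operation is applied to the index and mirrored in an expected journal, and the two are compared right away. FakeTieredIndex, deleteLast, and deleteAndCheck are hypothetical stand-ins for illustration; the real tests would use TieredSVSIndex and may need friend access to read its journal.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// (label, oldId, newId), matching the order used in the quoted snippets.
using JournalEntry = std::tuple<std::size_t, std::size_t, std::size_t>;
using Journal = std::vector<JournalEntry>;

// Hypothetical stand-in for the index under test.
struct FakeTieredIndex {
    Journal swaps_journal;

    // Deleting the last element needs no swap, so oldId == newId == id.
    void deleteLast(std::size_t label, std::size_t id) {
        swaps_journal.emplace_back(label, id, id);
    }
};

// Applies one operation, mirrors it in the expected journal, and reports
// whether the actual journal still matches.
bool deleteAndCheck(FakeTieredIndex &index, Journal &expected,
                    std::size_t label, std::size_t id) {
    // Both journals are assumed to be in lockstep before the operation.
    assert(index.swaps_journal.size() == expected.size());
    index.deleteLast(label, id);
    expected.emplace_back(label, id, id);
    return index.swaps_journal == expected;
}
```

Checking after every operation localizes a mismatch to the step that caused it, which is the readability benefit the reviewer was after.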

Comment on lines 743 to 747
if (ft_ret == 0) { // Vector was overridden - add 'skipping' swap to the journal.
    for (auto id : this->frontendIndex->getElementIds(label)) {
        this->swaps_journal.emplace_back(SKIP_LABEL, id, id);
Collaborator

This is possible only for an index of type single, right? Consider documenting/asserting that.

Collaborator Author

Added an assert.
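The exact assert added in the PR is not shown in this thread. A guard of this kind might look like the following sketch, which assumes that a return value of 0 means "overridden" and that an override can only occur in a single-value index. recordOverride, the Journal alias, and the SKIP_LABEL value are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <tuple>
#include <vector>

// Assumption: an illustrative sentinel; the real SKIP_LABEL is defined in the PR.
constexpr std::size_t SKIP_LABEL = std::numeric_limits<std::size_t>::max();
using Journal = std::vector<std::tuple<std::size_t, std::size_t, std::size_t>>;

// Hypothetical helper mirroring the reviewed snippet: when the add call
// reports an override (ft_ret == 0), every id of the label gets a skip record.
template <bool IsMulti>
void recordOverride(Journal &journal, int ft_ret, const std::vector<std::size_t> &ids) {
    if (ft_ret != 0) {
        return; // a new vector was added, nothing was overridden
    }
    // A multi-value index appends instead of overriding, so reaching this
    // point with IsMulti == true would indicate a logic error.
    assert(!IsMulti && "override is only expected for a single-value index");
    for (auto id : ids) {
        journal.emplace_back(SKIP_LABEL, id, id);
    }
}
```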

@rfsaliev rfsaliev force-pushed the rfsaliev/svs-multi-tiered branch from 4b65a5d to 8c67746 Compare July 11, 2025 13:01
@rfsaliev rfsaliev requested a review from alonre24 July 22, 2025 14:07
@rfsaliev rfsaliev added this pull request to the merge queue Jul 22, 2025
Merged via the queue into main with commit 02ef0b1 Jul 22, 2025
19 checks passed
@rfsaliev rfsaliev deleted the rfsaliev/svs-multi-tiered branch July 22, 2025 17:48