Skip to content

Expose SVS Vamana defaults + debug info [MOD-9759]#702

Merged
alonre24 merged 12 commits intomainfrom
define_svs_defaults
Jun 26, 2025
Merged

Expose SVS Vamana defaults + debug info [MOD-9759]#702
alonre24 merged 12 commits intomainfrom
define_svs_defaults

Conversation

@alonre24
Copy link
Collaborator

@alonre24 alonre24 commented Jun 19, 2025

Describe the changes in the pull request

Expose SVS Vamana default build parameters to be used by redisearch in ft.info, and implement the debugInfo API for svs index and tiered svs.
This includes adding unit tests for the svs factory to validate that both default and non-default parameters are passed properly, and refining the debug info call for the tiered index to compute the effective label set count internally, thereby avoiding the need to reacquire locks in indexLabelCount(). The last change was possible due to moving the getLabelSet virtual call to the abstract index level (all derived indexes were already implementing this call)

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

@alonre24 alonre24 requested a review from meiravgri June 19, 2025 13:05
Copy link
Collaborator

@meiravgri meiravgri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have some runtime getters for default values, since some may depend on other runtime parameters.

#define SVS_VAMANA_DEFAULT_ALPHA_IP 0.95f
#define SVS_VAMANA_DEFAULT_GRAPH_MAX_DEGREE 32
#define SVS_VAMANA_DEFAULT_CONSTRUCTION_WINDOW_SIZE 200
#define SVS_VAMANA_DEFAULT_MAX_CANDIDATE_POOL_SIZE (3 * SVS_VAMANA_DEFAULT_CONSTRUCTION_WINDOW_SIZE)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if SVS_VAMANA_DEFAULT_CONSTRUCTION_WINDOW_SIZE is defined by the user differently than the default, but SVS_VAMANA_DEFAULT_MAX_CANDIDATE_POOL_SIZE is not?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, in this case, the info report will not be correct. I believe we should not expose SVS_VAMANA_DEFAULT_MAX_CANDIDATE_POOL_SIZE at all for users and have it only as an internal param, but let's wait for svs devlopers decision

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean it shouldn’t be configurable at all, or just not exposed via the info command?
In any case, if we do expose it in the info output, I think it should reflect the correct runtime-derived value. So maybe the options are:

  1. Don’t expose it now and possibly introduce a runtime fetch API later if needed, or
  2. Remove it altogether if it's not meant to be user-facing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will not expose it now, and if we decide to leave these params as user-facing facing we will expose them to info via API call (with all others)

…VAMANA_DEFAULT_USE_SEARCH_HISTORY

Co-authored-by: meiravgri <[email protected]>
@codecov
Copy link

codecov bot commented Jun 19, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.86%. Comparing base (e53097d) to head (270e68b).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #702   +/-   ##
=======================================
  Coverage   96.85%   96.86%           
=======================================
  Files         122      122           
  Lines        7340     7359   +19     
=======================================
+ Hits         7109     7128   +19     
  Misses        231      231           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@alonre24 alonre24 requested a review from meiravgri June 19, 2025 13:59
meiravgri
meiravgri previously approved these changes Jun 19, 2025
@alonre24 alonre24 enabled auto-merge June 19, 2025 14:04
@alonre24 alonre24 added this pull request to the merge queue Jun 19, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 19, 2025
@alonre24 alonre24 changed the title Expose svs vamana defaults to be used by redisearch for info Expose SVS Vamana defaults + debug info Jun 22, 2025
@alonre24 alonre24 changed the title Expose SVS Vamana defaults + debug info Expose SVS Vamana defaults + debug info [MOD-9759] Jun 23, 2025
Comment on lines +307 to +317
{
// Compute the union of the labels from both indices.
std::vector<size_t> labels_union;
auto flat_labels = this->frontendIndex->getLabelsSet();
auto main_labels = this->backendIndex->getLabelsSet();
labels_union.reserve(flat_labels.size() + main_labels.size());
std::set_union(flat_labels.begin(), flat_labels.end(), main_labels.begin(),
main_labels.end(), std::back_inserter(labels_union));
info.commonInfo.indexLabelCount = labels_union.size();
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

following the comment on indexLabelCount api, can this code block replace the existing implmneation of indexLabelCount of TieredHNSW and TieredSVS?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not relevant since not adding the new API. In tiered indexes, this should be guarded anyway

Copy link
Collaborator

@rfsaliev rfsaliev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments to getLabelsSet()

@rfsaliev
Copy link
Collaborator

rfsaliev commented Jun 24, 2025

Codecov Report

Attention: Patch coverage is 90.00000% with 5 lines in your changes missing coverage. Please review.

@alonre24, TEST(SVSTest, quant_modes) in test_svs.cpp can be used to improve code coverage for different SVS compression methods.

@alonre24 alonre24 requested a review from meiravgri June 25, 2025 15:25
// labels in a tiered index.
template <typename DataType, typename DistType>
vecsim_stl::set<labelType> HNSWIndex_Single<DataType, DistType>::getLabelsSet() const {
std::shared_lock<std::shared_mutex> index_data_lock(this->indexDataGuard);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add a lock to getLabelsSet in svs as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, the main index guard that is assumed to be held from the tiered index should be enough (we never access internal svs locks from our code), right @rfsaliev ?

// labels in a tiered index, and caller should hold the index data guard.
// labels in a tiered index.
inline vecsim_stl::set<labelType> getLabelsSet() const override {
std::shared_lock<std::shared_mutex> index_data_lock(this->indexDataGuard);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider using the api lockSharedIndexDataGuard

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though we are using that api in many places, I believe it is a better practice to acquire locks like this where possible

@alonre24 alonre24 enabled auto-merge June 26, 2025 07:59
@meiravgri meiravgri self-requested a review June 26, 2025 08:01
@alonre24 alonre24 added this pull request to the merge queue Jun 26, 2025
Merged via the queue into main with commit 82d6e9d Jun 26, 2025
19 checks passed
@alonre24 alonre24 deleted the define_svs_defaults branch June 26, 2025 10:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants