Similarity Search Limited to First 10 Chunks Only #1091
                  
                    
zelhaddioui started this conversation in General
            Replies: 1 comment
-
I think the cause of the issue is that the similarity search method limits the number of candidates considered (numCandidates) in the Elasticsearch kNN query, which prevents it from evaluating all documents in the index.
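To make the two knobs concrete, here is a minimal sketch of the `knn` section of an Elasticsearch search request, where `k` is the number of hits returned and `num_candidates` is the size of the candidate pool gathered per shard. The class `KnnBodySketch`, the method `buildKnnBody`, and the field name `embedding` are made up for illustration. This is not how the vector store builds its query internally, only a way to see which values end up in the request.

```java
import java.util.Locale;
import java.util.StringJoiner;

// Hypothetical helper, not part of any library: builds the JSON body of an
// Elasticsearch kNN search with an explicit candidate pool size.
final class KnnBodySketch {

    // Assumes the field name needs no JSON escaping and the vector holds
    // only finite values (NaN and Infinity are not valid JSON numbers).
    static String buildKnnBody(String field, float[] queryVector, int k, int numCandidates) {
        // Elasticsearch rejects num_candidates below k or above 10000.
        if (k < 1 || numCandidates < k || numCandidates > 10_000) {
            throw new IllegalArgumentException("require 1 <= k <= numCandidates <= 10000");
        }
        StringJoiner vector = new StringJoiner(",", "[", "]");
        for (float v : queryVector) {
            vector.add(Float.toString(v));
        }
        return String.format(Locale.ROOT,
            "{\"knn\":{\"field\":\"%s\",\"query_vector\":%s,\"k\":%d,\"num_candidates\":%d}}",
            field, vector, k, numCandidates);
    }

    public static void main(String[] args) {
        // topK = 2, but 100 candidates are gathered per shard before ranking.
        System.out.println(buildKnnBody("embedding", new float[] {0.12f, -0.5f}, 2, 100));
    }
}
```

Sending a body like this directly to the index's `_search` endpoint with a larger `num_candidates` is one way to check whether the candidate pool is really what limits the results.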
-
Description:
I am encountering an issue with the ElasticsearchVectorStore class when performing a similarity search. When I run a search with topK set to 2, the search appears to cover only the first 10 chunks stored in Elasticsearch instead of all of them.
Details:
Library Version: 1.0.0
Elasticsearch Version: 8.13.3
Code Example:
List<Document> similarDocuments = vectorStore.similaritySearch(
    SearchRequest.query(message).withTopK(2)
);
Issue Observed:
When executing the above code, I expect to retrieve the 2 most similar documents across all chunks in Elasticsearch. Instead, the results appear to come only from the first 10 chunks that were stored.
Additional Information:
I suspect that the issue might be related to how Elasticsearch pagination is handled or a limitation in the current implementation of the similarity search method. I would appreciate any guidance or fixes to ensure that the search applies to all chunks stored in Elasticsearch.
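One way to confirm whether chunks beyond the first 10 are being ignored is to compare the store's results against an exhaustive search over the same embeddings. Below is a minimal sketch of such a check using exact cosine similarity. The class `ExactTopKSketch` and its methods are hypothetical and independent of the library. It assumes all vectors have the same length and none is all zeros.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical diagnostic, independent of any vector-store library:
// exact cosine-similarity top-K over every stored embedding.
final class ExactTopKSketch {

    // Assumes a and b have equal length and non-zero norm.
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the indices of the topK most similar chunks, best match first.
    static List<Integer> topK(float[] query, List<float[]> chunks, int topK) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < chunks.size(); i++) {
            indices.add(i);
        }
        // Sort by descending similarity (negated score, ascending order).
        indices.sort(Comparator.comparingDouble((Integer i) -> -cosine(query, chunks.get(i))));
        return new ArrayList<>(indices.subList(0, Math.min(topK, indices.size())));
    }

    public static void main(String[] args) {
        List<float[]> chunks = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            chunks.add(new float[] {0f, 1f}); // first 10 chunks: unrelated to the query
        }
        chunks.add(new float[] {1f, 1f});     // chunk 10: partial match
        chunks.add(new float[] {1f, 0f});     // chunk 11: exact match
        System.out.println(topK(new float[] {1f, 0f}, chunks, 2)); // prints [11, 10]
    }
}
```

If the exhaustive ranking returns chunks outside the first 10 while the vector store never does for the same query, that would support the suspicion above. If both agree, the first 10 chunks may simply be the closest matches.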