[FEA] General batch knn multi-gpu algorithm implementation #727
Labels: feature request
nn-descent was recently given a batching feature that uses clustering + spilling to scale the algorithm to datasets much larger than GPU memory. This implementation has proven very useful for scaling data analysis algorithms such as UMAP, TSNE, and HDBSCAN.
While nn-descent approximates the knn graph well, it is sensitive to the number of CPU threads, so scaling it directly to multi-GPU introduces additional challenges. One way to handle this is to break the "batches" apart into isolated tasks, each with its own thread count. However, since the batching algorithm is really agnostic to the ANN algorithm, it would be preferable to host a top-level API for this so that the ANN algorithm can more easily be plugged in.
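To illustrate the idea, here is a minimal sketch of what a batching-agnostic top-level API could look like: the dataset is split into batches, an arbitrary per-batch knn function (exact brute force here, but any ANN algorithm could be substituted) is invoked on each batch, and the per-batch candidates are merged into a running global top-k. All function and parameter names (`batched_knn`, `knn_fn`, `batch_size`) are hypothetical and not part of any existing API; this is a single-process NumPy sketch of the merge logic, not a multi-GPU implementation.

```python
import numpy as np

def brute_force_knn(queries, batch, k):
    # Exact knn within one batch; a stand-in for any pluggable ANN algorithm
    # (e.g. nn-descent) that returns (distances, batch-local indices).
    d = np.linalg.norm(queries[:, None, :] - batch[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]
    return np.take_along_axis(d, idx, axis=1), idx

def batched_knn(queries, dataset, k, batch_size, knn_fn=brute_force_knn):
    # Hypothetical top-level API: the batching loop is agnostic to knn_fn.
    n_queries = queries.shape[0]
    best_d = np.full((n_queries, k), np.inf)
    best_i = np.full((n_queries, k), -1, dtype=np.int64)
    for start in range(0, dataset.shape[0], batch_size):
        batch = dataset[start:start + batch_size]
        d, i = knn_fn(queries, batch, min(k, batch.shape[0]))
        i = i + start  # map batch-local indices to global indices
        # Merge this batch's candidates with the running global top-k.
        cand_d = np.concatenate([best_d, d], axis=1)
        cand_i = np.concatenate([best_i, i], axis=1)
        order = np.argsort(cand_d, axis=1)[:, :k]
        best_d = np.take_along_axis(cand_d, order, axis=1)
        best_i = np.take_along_axis(cand_i, order, axis=1)
    return best_d, best_i
```

Because each batch is an isolated task, the per-batch calls could be dispatched to separate GPUs (or, for nn-descent, given their own thread pools) without changing the merge logic.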