[CUDA 11.2] Experiment on RTX 3090 with faiss-gpu 1.6.5 #36

Open
GaoKangYu opened this issue Mar 30, 2021 · 0 comments
After installing all dependencies according to the README, I ran into several errors.

Most of them are solved now.

If you hit the same errors, I hope these notes are a useful reference.

My experiment environment:

GPU : RTX 3090
CUDA : 11.2
Python : 3.8
PyTorch : 1.8.0+cu111
scikit-learn : 0.24.1

Error 1

Abnormal memory usage

When I used faiss-gpu 1.6.3 under CUDA 10.2, the process was sometimes killed while computing the 'jaccard distance'.

Solution

  • Upgrade scikit-learn to 0.20.2 or later.

  • Change n_jobs from -1 to 2 or 4 in the DBSCAN calls.

#184

cluster = DBSCAN(eps=eps, min_samples=4, metric='precomputed', n_jobs=4)
cluster_tight = DBSCAN(eps=eps_tight, min_samples=4, metric='precomputed', n_jobs=4)
cluster_loose = DBSCAN(eps=eps_loose, min_samples=4, metric='precomputed', n_jobs=4)
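
If you would rather not hard-code the worker count, the following is a minimal sketch of a helper that resolves an n_jobs value and caps it. The name capped_n_jobs and the default cap of 4 are my own assumptions, not part of the repository; the handling of negative values follows the joblib convention (-1 means all CPUs). Presumably the cap helps because fewer parallel workers means lower peak memory, but I have not measured that.

```python
import os


def capped_n_jobs(requested, cap=4):
    """Resolve a joblib-style n_jobs value and limit it to `cap` workers.

    Hypothetical helper for illustration; not part of the original code base.
    """
    cpus = os.cpu_count() or 1
    if requested is None:
        resolved = 1
    elif requested == 0:
        # joblib itself rejects n_jobs=0, so mirror that here.
        raise ValueError("n_jobs == 0 has no meaning")
    elif requested < 0:
        # joblib convention: -1 means all CPUs, -2 all but one, and so on.
        resolved = max(cpus + 1 + requested, 1)
    else:
        resolved = requested
    # Never hand more than `cap` workers to the caller.
    return min(resolved, cap)
```

It could then be passed straight through, for example DBSCAN(eps=eps, min_samples=4, metric='precomputed', n_jobs=capped_n_jobs(-1)).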

Error 2

Abnormal GPU usage

When I used faiss-gpu 1.6.3 under CUDA 11.2, model training started but soon hit a CUDA error.

This is because faiss-gpu 1.6.3 is not compatible with CUDA 11.2.
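
Before changing packages, it can help to confirm which versions are actually being imported. Below is a small sketch that collects them; the function name report_versions is my own, and it only reports the CUDA version that PyTorch was built against (torch.version.cuda), not the one the faiss build uses.

```python
import importlib


def report_versions():
    """Collect versions of the libraries listed in the environment above.

    Hypothetical helper for illustration; missing packages are reported as None.
    """
    info = {"torch_cuda": None}
    for name in ("torch", "faiss", "sklearn"):
        try:
            module = importlib.import_module(name)
        except ImportError:
            info[name] = None  # not installed in this environment
            continue
        info[name] = getattr(module, "__version__", "unknown")
        if name == "torch":
            # CUDA version PyTorch was built against (None for CPU-only builds).
            info["torch_cuda"] = getattr(
                getattr(module, "version", None), "cuda", None)
    return info


if __name__ == "__main__":
    for key, value in report_versions().items():
        print(key, value)
```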

Solution

  • Upgrade faiss-gpu to 1.6.5 by using:
conda install -c conda-forge faiss=1.6.5=py38h60a57df_0_cuda
  • After that I got the traceback "module 'faiss' has no attribute 'cast_integer_to_long_ptr'", which I solved by:

#L15

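# Replace "cast_integer_to_long_ptr" with "cast_integer_to_idx_t_ptr".
import faiss
import torch


def swig_ptr_from_LongTensor(x):
    assert x.is_contiguous()
    assert x.dtype == torch.int64, 'dtype=%s' % x.dtype
    # This call used to be faiss.cast_integer_to_long_ptr, which raised the
    # AttributeError above with faiss 1.6.5.
    # storage_offset() counts elements; each int64 element is 8 bytes.
    return faiss.cast_integer_to_idx_t_ptr(
        x.storage().data_ptr() + x.storage_offset() * 8)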
  • In short, many thanks to yxgeee for the great work. 👍
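
For the swig_ptr_from_LongTensor change above, a version-tolerant variant is also possible: look up whichever cast function the installed faiss exposes instead of hard-coding one name. This is only a sketch; first_available and resolve_faiss_cast are hypothetical names of mine, and the two attribute names are the ones mentioned in this post.

```python
import importlib


def first_available(module, names):
    """Return the first attribute of `module` whose name appears in `names`."""
    for name in names:
        fn = getattr(module, name, None)
        if fn is not None:
            return fn
    raise AttributeError(
        "%s has none of: %s"
        % (getattr(module, "__name__", repr(module)), ", ".join(names)))


# Newer name first, older name as the fallback (both taken from the post).
CAST_NAMES = ("cast_integer_to_idx_t_ptr", "cast_integer_to_long_ptr")


def resolve_faiss_cast():
    """Import faiss lazily and return whichever cast function it provides."""
    faiss = importlib.import_module("faiss")
    return first_available(faiss, CAST_NAMES)
```

Inside swig_ptr_from_LongTensor, the return statement could then call resolve_faiss_cast()(...) with the same pointer argument as before, so the same file works with either faiss version.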
