Importing DistributedSamplerWrapper invalidates the CUDA_VISIBLE_DEVICES setting. #1451

Open
5 of 10 tasks
zezhishao opened this issue Mar 13, 2024 · 1 comment
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

zezhishao commented Mar 13, 2024

🐛 Bug Report

After running from catalyst.data.sampler import DistributedSamplerWrapper, setting CUDA_VISIBLE_DEVICES has no effect.
This seems counterintuitive to me. Is this the expected behavior? I would like to know the reason and how to fix it.
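
One plausible reason, offered here as an assumption and not confirmed anywhere in this thread, is that the number of visible devices is determined the first time something queries it, and the answer is then reused. If an imported library performs that query at import time, a later change to CUDA_VISIBLE_DEVICES would go unnoticed. The sketch below imitates this with a cached function. `device_count` and `simulate` are hypothetical stand-ins written for illustration; they are not the torch or catalyst APIs, and no GPU is needed to run it.

```python
import os
from functools import lru_cache


# Hypothetical stand-in for a cached device query; NOT the real torch API.
@lru_cache(maxsize=None)
def device_count(total_gpus: int = 2) -> int:
    """Count devices visible via CUDA_VISIBLE_DEVICES, caching the first answer."""
    visible = os.environ.get("CUDA_VISIBLE_DEVICES")
    if visible is None:
        return total_gpus
    ids = [v for v in visible.split(",") if v.strip()]
    return min(len(ids), total_gpus)


def simulate(query_before_setting: bool) -> int:
    """Return the device count seen after setting CUDA_VISIBLE_DEVICES="0"."""
    saved = os.environ.pop("CUDA_VISIBLE_DEVICES", None)
    device_count.cache_clear()
    try:
        if query_before_setting:
            device_count()  # imitates a library querying devices at import time
        os.environ["CUDA_VISIBLE_DEVICES"] = "0"
        return device_count()
    finally:
        # Restore the caller's environment and reset the cache.
        os.environ.pop("CUDA_VISIBLE_DEVICES", None)
        if saved is not None:
            os.environ["CUDA_VISIBLE_DEVICES"] = saved
        device_count.cache_clear()


print(simulate(query_before_setting=True))   # early query: the variable is ignored
print(simulate(query_before_setting=False))  # variable set first: it is respected
```

In this simulation the only difference between the two runs is whether the count was queried before the variable was set, which is the ordering question the report raises.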

How To Reproduce

With the catalyst import (CUDA_VISIBLE_DEVICES is ignored):

import os
import torch
from catalyst.data.sampler import DistributedSamplerWrapper

# Set after the imports above have already run
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
device_num = torch.cuda.device_count()
print(device_num)  # Output: 2

Without the catalyst import (CUDA_VISIBLE_DEVICES is respected):

import os
import torch
# from catalyst.data.sampler import DistributedSamplerWrapper

os.environ["CUDA_VISIBLE_DEVICES"] = "0"
device_num = torch.cuda.device_count()
print(device_num)  # Output: 1
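
A possible workaround, sketched under the assumption that the problem is one of ordering, is to make sure CUDA_VISIBLE_DEVICES is already set before the interpreter imports anything. The most robust way to guarantee that is to set it in the environment of the process itself. The helper `run_with_visible_devices` below is a hypothetical name introduced for illustration; it uses only the standard library and has not been verified against Catalyst 22.04.

```python
import os
import subprocess
import sys


def run_with_visible_devices(code: str, devices: str) -> str:
    """Run `code` in a fresh interpreter with CUDA_VISIBLE_DEVICES preset."""
    # Copy the current environment and add the variable for the child only.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=devices)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# The child sees the variable before it imports anything else.
child_code = 'import os; print(os.environ["CUDA_VISIBLE_DEVICES"])'
print(run_with_visible_devices(child_code, "0"))
```

In a real run the child code would be the training script that imports torch and catalyst. Launching the script as CUDA_VISIBLE_DEVICES=0 python script.py from a shell, or moving the os.environ assignment above every other import, follows the same idea.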

Environment

Catalyst version: 22.04
PyTorch version: 2.2.1+cu118
Is debug build: No
CUDA used to build PyTorch: 11.8
TensorFlow version: N/A
TensorBoard version: 2.16.2

OS: Ubuntu 22.04.2 LTS
GCC version: (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
CMake version: Could not collect

Python version: 3.9
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: 
GPU 0: Tesla V100-PCIE-32GB
GPU 1: Tesla V100-PCIE-32GB

Nvidia driver version: 525.125.06
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] catalyst==22.4
[pip3] easy-torch==1.3.2
[pip3] numpy==1.22.4
[pip3] pytorch-triton==3.0.0+a9bc1a3647
[pip3] tensorboard==2.16.2
[pip3] tensorboard-data-server==0.7.2
[pip3] tensorboardX==2.6.2.2
[pip3] torch==2.2.1+cu118
[pip3] torchaudio==2.2.1+cu118
[pip3] torchvision==0.17.1+cu118
[conda] catalyst                  22.4                     pypi_0    pypi
[conda] easy-torch                1.3.2                    pypi_0    pypi
[conda] numpy                     1.22.4                   pypi_0    pypi
[conda] pytorch-triton            3.0.0+a9bc1a3647          pypi_0    pypi
[conda] tensorboard               2.16.2                   pypi_0    pypi
[conda] tensorboard-data-server   0.7.2                    pypi_0    pypi
[conda] tensorboardx              2.6.2.2                  pypi_0    pypi
[conda] torch                     2.2.1+cu118              pypi_0    pypi
[conda] torchaudio                2.2.1+cu118              pypi_0    pypi
[conda] torchvision               0.17.1+cu118             pypi_0    pypi

Checklist

  • bug description
  • steps to reproduce
  • expected behavior
  • environment
  • code sample / screenshots

FAQ

Please review the FAQ before submitting an issue:

zezhishao added the bug (Something isn't working) and help wanted (Extra attention is needed) labels on Mar 13, 2024

Hi! Thank you for your contribution! Please re-check all issue template checklists - unfilled issues would be closed automatically. And do not forget to join our slack for collaboration.
