While working on stable install testing (xref rapidsai/build-planning#227), I noticed that cucim imports were throwing an error on all CUDA and Python versions we were testing:
>>> import cucim
dlopen error libcuda.so.1: cannot open shared object file: No such file or directory
missing cuda symbols while dynamic loading
cuFile initialization failed
After meeting with the team, we identified that libcuda.so.1 isn't available when running in a non-GPU-enabled docker container (in this case rapidsai/citestwheel running without --gpus).
After adding --gpus all to the container instantiation, everything imports and runs without issue.
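A quick way to tell which situation a container is in, without importing cucim at all, is to try loading the driver library directly. The following is a minimal sketch and not part of cucim or the RAPIDS CI tooling; the function name `driver_library_loadable` and its `loader` parameter are hypothetical and exist only for this illustration.

```python
import ctypes


def driver_library_loadable(name="libcuda.so.1", loader=ctypes.CDLL):
    """Return True if the shared library `name` can be loaded.

    `loader` defaults to ctypes.CDLL (which calls dlopen on Linux); it is a
    parameter only so the check can be exercised without a real driver.
    """
    try:
        loader(name)
    except OSError:
        # ctypes raises OSError when the library cannot be found or opened,
        # which is the case in a container started without --gpus.
        return False
    return True
```

Calling `driver_library_loadable()` in the container started without `--gpus` should return `False`, matching the dlopen error above, and `True` once the driver is exposed with `--gpus all`.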
So, everything is working as expected, which is great! The open question is whether cucim needs to link against the driver at all. If it doesn't, we could run simple symbol-loading tests on a CPU-only runner. That's not a huge priority, but it would be a nice-to-have if the current driver linking isn't necessary.
In the short term, I'm going to ignore that particular error from cucim imports in rapidsai/integration#837.
Medium-term, we plan to improve the depth of import and general usage tests, which will require GPU runners anyway.
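For illustration, ignoring only that particular error could look something like the sketch below. This is an assumption about one possible approach, not a description of what rapidsai/integration#837 actually does; the names `KNOWN_CPU_ONLY_LINES` and `unexpected_lines` are hypothetical. It takes output captured from an import test and drops the three messages quoted above, so anything else still surfaces.

```python
# Messages cucim prints on import when libcuda.so.1 is missing,
# copied from the output quoted in this issue.
KNOWN_CPU_ONLY_LINES = (
    "dlopen error libcuda.so.1: cannot open shared object file: "
    "No such file or directory",
    "missing cuda symbols while dynamic loading",
    "cuFile initialization failed",
)


def unexpected_lines(output, known=KNOWN_CPU_ONLY_LINES):
    """Return the non-empty lines of `output` that are not known messages."""
    remaining = []
    for line in output.splitlines():
        stripped = line.strip()
        if stripped and stripped not in known:
            remaining.append(stripped)
    return remaining
```

An import test on a CPU-only runner could then fail only when `unexpected_lines(captured_output)` is non-empty. Matching whole lines rather than substrings keeps the filter narrow, so a different dlopen failure would not be silently ignored.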