-
Notifications
You must be signed in to change notification settings - Fork 175
Move pathfinder
to cuda-python top level
#723
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…HOUT trying to make anything work.
… trying to make anything work.
…make anything work.
…ns.abc.Sequence (Python 3.9+)
…init__.py (to make the dynamic version work).
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
path_finder
to cuda-python top levelpathfinder
to cuda-python top level
def load_nvidia_dynamic_lib(libname: str) -> LoadedDL: | ||
"""Load a NVIDIA dynamic library by name. | ||
Args: | ||
libname: The name of the library to load (e.g. "cudart", "nvvm", etc.) | ||
Returns: | ||
A LoadedDL object containing the library handle and path | ||
Raises: | ||
RuntimeError: If the library cannot be found or loaded | ||
""" | ||
return _load_nvidia_dynamic_lib.load_lib(libname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could define this wrapper function in the _dynamic_libs
module and just import it here. This way your __init__.py
here wouldn't have a ton of function definitions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is meant to be the public API in one view.
This way your
__init__.py
here wouldn't have a ton of function definitions.
I understood exactly that was your goal: a flat list of available APIs.
Note that I moved the docstring here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is meant to be the public API in one view.
Indeed, everything we define (or import) here that is not prefixed with _
constitutes the public API of the module cuda.path_finder
. My above suggestion is just that we import load_nvidia_dynamic_lib
rather than define it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would make the public API far less obvious. E.g. to see the docstring, they'd need to open a private file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that's true. They could still do:
from cuda.path_finder import load_nvidia_dynamic_lib
help(load_nvidia_dynamic_lib)
Same as they would do now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That'll only work interactively.
What I have now can be inspected in obvious ways directly in the sources, e.g. when looking at the sources on github. I can send URLs pointing to specific APIs in this init.py file.
Each function here will just be:
def function(...) -> ...:
"""docstring""
return call_into_private_code(...)
That's exactly the public API, with a one-line call that's easy to ignore.
What's the point of hiding that away, especially hiding away the docstring and the type hints?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That'll only work interactively.
The docstrings and type hints would also get picked up:
- Sphinx (which we use to generate API docs
-pydoc
- the developers' IDEs/lsp
Those are primarily the ways consumers of a package interact with docstrings or type hints, rather than looking directly at the source.
What's the point of hiding that away, especially hiding away the docstring and the type hints?
It tightens the scope of __init__.py
, whose job is:
- to include any initialization code for the module
- to import stuff from submodules that the module wants to expose
- to define
__all__
for the module if needed
As examples, we can look at the __init__.py
from some other popular libraries:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It tightens the scope of init.py, whose job is
The most important job: Show the public API
You didn't answer why you want to hide that away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm advocating for keeping function and type definitions outside of __init__.py
. I don't think that "hides" anything from the user as we still import them here.
The main reason I'm advocating for that is because __init__.py
typically only defines functions and types needed for module initialization, and imports anything else. I think the examples I linked to above are a good demonstration of that.
Defining functions and types beyond that serves to clutter __init__.py
.
I would argue it makes the code base less navigable than more for people looking at the source. Do I expect the function load_nvidia_dynamic_libs
to be defined in a file called _dynamic_libs.py
, or a file called __init__.py
?
If this all seems nitpicky, I apologize. While I do have a strong opinion here, I don't mind at all if another reviewer (or you, as the author of this PR) made the final call about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cuda.core's __init__.py
is a good example for what Ashwin suggested. As a developer who occasionally peeks into someone else's __init__.py
, I certainly get confused why we are defining things in there directly, as opposed to properly organizing them in respective public/private modules, and just import them. What's imported are considered public APIs (including types). Does it make sense?
I'm a bit lost, to be honest, and I’d appreciate some clarification so I can move this forward in the direction you want. The goals that seemed most important to me originally (from a Slack post last Wednesday) were:
After Ashwin’s comment on Slack, I interpreted the request as: prioritize a flat public API, even if that means giving up on some of that isolation — especially when type hints are involved. Then Leo wrote:
That seems more aligned with what I had before the last commit (e.g., a file like I’m happy to implement what you both want here — I just need a bit more clarity. If we go back to what I proposed last week: from cuda.pathfinder import nvidia_dynamic_libs, nvidia_static_libs, nvidia_headers
nvidia_dynamic_libs.load("nvrtc")
nvidia_static_libs.find("cudart")
nvidia_headers.find_file("cuda.h") The file structure was:
I thought that structure balanced modularity and clarity, and users could still import just what they needed. But it sounds like we might instead prefer: from cuda.pathfinder import load_nvidia_dynamic_lib, find_nvidia_static_lib, find_nvidia_header_file That keeps the API flat, but does give up on the isolation goal unless we resort to function-local imports or other workarounds. Could you please help me understand: What would be the preferred file/module structure to go along with the flat API? Should I keep separate modules like |
At this point we need to follow up offline. I think there is a gap that we unfortunately did not really cover during the weekly meeting. I suggest we don't cancel the dev meeting tomorrow and use the time to finalize it. This PR also needs at least @kkraus14's approval for the change of license. |
/ok to test |
Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
@@ -286,6 +292,11 @@ jobs: | |||
- name: Set up compute-sanitizer | |||
run: setup-sanitizer | |||
|
|||
- name: Run cuda.pathfinder tests with see_what_works |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker for getting this in, but in theory since we're just testing the viability of finding and loading libraries, we shouldn't need to run these tests with a GPU where we may need to split these tests into a separate workflow.
I think the only tricky part would be handling the driver, libcuda
, where I'm not sure how easy it is to install and load without a GPU.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good.
I think it'll be useful to keep the see_what_works
testing here, for very inexpensive extra test coverage, but probably testing can and should indeed be run on machines without a GPU.
Currently libcuda
isn't covered by the pathfinder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good.
I think it'll be useful to keep the see_what_works
testing here, for very inexpensive extra test coverage, but probably testing can and should indeed be run on machines without a GPU.
Currently libcuda
isn't covered by the pathfinder.
cuda_bindings/pyproject.toml
Outdated
@@ -27,6 +27,7 @@ dynamic = [ | |||
"readme", | |||
] | |||
dependencies = [ | |||
"cuda-pathfinder", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a problem where when we release a 2.0
with breaking changes, then older releases of cuda_bindings
would break from finding it.
* `cuda.pathfinder.load_nvidia_dynamic_lib(libname: str) -> LoadedDL` | ||
|
||
* `cuda.pathfinder.LoadedDL`: | ||
* `handle` (platform-specific type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any concern with this being platform-specific type and API stability? Is this effectively the handle returned from dlopen
on Linux?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, for Linux it's a plain int
, for Windows a pywintypes.HANDLE
. This seems to be mainstream for each platform, and very unlikely to require changes.
I guess we could introduce a level of indirection (some wrapper type), but I figure that'll be far more trouble (confused users) than we could hope to avoid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on the wrapper type. Last question since this is public API is is this handle
needed to be a public attribute today? Do we expect usage outside of it how we use it for cuda.bindings
?
…pre-commit autoupdate --freeze`, but keeping only the change for mypy).
/ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved, but would like to finish the conversation on whether handle
is needed to be a public attribute
Thanks! I'm thinking of that as essential, with the primary use case in mind, like here:
It's unfortunate that there is no universal handle type, but I believe what we're using (in the checked-in code already) are de-facto standards. Introducing our own handle type: presumed over-engineering, more likely to cause confusion than being helpful. Hiding the handle object: presumed to cause usability hardships. While writing this, I realized that I lost comments that I had earlier, about how to use the handle objects in cython. I'll work on adding them back here (in
|
In my previous comment I wrote:
I quizzed the LLMs a bit, and I'm surprised to learn:
@kkraus14 @leofang does that match your experience? If yes, I believe I should rework (I'm also much better set up now to get this done quickly, compared to my situation a couple months ago: I have easy interactive access to a Windows development environment now, and fairly comprehensive CI testing is in place already.) |
Yes, ctypes is probably the most common way that a shared library is loaded in Python in this way. That being said, I don't think we particularly want to encourage users to start making function calls into the loaded cuda library, so returning a My 2c: I think we should remove the If someone wanted a ctypes object based on |
Cool! Then we can simplify and standardize to:
With that, This is based on the observation that ultimately For Linux that's extremely obvious:
For Windows we have to drill down:
With that, we just have to be careful to consistently cast from/to unsigned integers. Linux: For Windows, claude.ai suggests:
|
…finder/tests pass, cuda_bindings/tests are broken). Interactive testing: CUDA_PATHFINDER_TEST_LOAD_NVIDIA_DYNAMIC_LIB_STRICTNESS=all_must_work pytest -ra -s -v tests/ python ../toolshed/run_cuda_pathfinder.py
This does not break any cuda_pathfinder tests. (cuda_bindings tests are "more broken".)
/ok to test |
@kkraus14 Could you please take another look? The changes in
The changes in |
I'm quite happy with the latest set of changes in terms of the state of the public API at this point. As far as the type of the now private handle attribute, I would defer entirely to your best judgement on how it is typed and represented based on our own consumption of it. Will do a deeper dive implementation review when I get some spare cycles. |
I'm very happy with the latest state of the pathfinder code, as under PR #751. Without your feedback, I'd probably have standardized on
To the last point: unsigned ints are the de-facto standard under Linux, signed ints seem to be more common under Windows. In theory, we could go with that, but then people would always have to think twice. A simple "it's always an unsigned int" gives them a firm reference point on our side, then they only need to figure out what the "other side" (e.g. |
Testing under PR #751 passed: https://github.com/NVIDIA/cuda-python/actions/runs/16180272303?pr=751 I'll hold off moving the changes over to this PR until after your review. (I assume the split makes the review a little easier.) |
For last-minute polishing, I worked under #751 on hardening the
(The other changes in e90855e and 15c5791 are just nice to haves.) |
Description
Closes #719, #708
Note: The name is changed from
path_finder
topathfinder
. The word "pathfinder" is very common, and it avoids underscore/dash confusion between directory names and package names.Move
cuda.bindings.path_finder
→cuda.pathfinder
, with backward compatibility:cuda.bindings.path_finder
forwards tocuda.pathfinder
.Make
cuda-bindings
dependent oncuda-pathfinder
.Remove 32-bit DLL names from
SUPPORTED_WINDOWS_DLLS
and add runtime guard incuda.pathfinder.load_nvidia_dynamic_lib()
: "requires 64-bit Python".Add specific
DynamicLibNotFound
exception type to public API (this wasRuntimeError
before; now inherits fromRuntimeError
).Run
mypy-pathfinder
frompre-commit
(after systematicmypy
cleanup of library code).Adjust GitHub Actions jobs to the move.
Add
nvidia_wheels_cu12
to[project.optional-dependencies]
incuda_pathfinder/pyproject.toml
and ensure all libs are loaded successfully. Treat all NVIDIA libs as "supported".Mark
cuda.bindings.path_finder
as version 1.0.0