LaunchConfig parameters #828
The launch config takes a … Does anybody have a mental model that makes this a bit more intuitive? I'm not criticizing the design, I'm just assuming that there's some way of thinking about this that makes the interface feel more intuitive.
Replies: 1 comment
Sorry for the late reply. Starting with the introduction of the Hopper GPU (compute capability 9.0), the CUDA programming model gained a new level in the thread hierarchy called "thread block clusters." The new hierarchy goes like this: a grid can have one or more clusters, a cluster can have one or more blocks, and a block can have one or more threads.

This presents a new challenge for the traditional CUDA C++ triple-chevron syntax, which cannot specify all of the hierarchical information at once; that is, `<<<grid, cluster, block>>>`, where `grid`, `cluster`, and `block` are all dim3 objects with integer overloads (so `N` means `(N, 1, 1)`), is not supported due to the ambiguity in overload resolution. …

In the new projects cuda.core (Python) and cccl-rt (C++), we have the unique opportunity to express the thread hierarchy without these issues, using the launch-config-based approach. It is also future-proof should a new hierarchical level be introduced again.
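As a mental model, it may help to picture the launch config as a set of named levels, each counting units of the level below it. Here is a minimal pure-Python sketch of that idea. Note that `HierarchySketch` and `as_dim3` are hypothetical names made up for illustration, not the cuda.core API, and the sketch assumes that `grid` counts clusters when a cluster is given (and blocks otherwise), which should be checked against the actual documentation.

```python
from dataclasses import dataclass
from math import prod
from typing import Optional, Tuple, Union

Dim = Union[int, Tuple[int, ...]]


def as_dim3(value: Dim) -> Tuple[int, int, int]:
    """Pad an int or a 1- to 3-element tuple to (x, y, z), so N becomes (N, 1, 1)."""
    dims = (value,) if isinstance(value, int) else tuple(value)
    if not 1 <= len(dims) <= 3 or any(d < 1 for d in dims):
        raise ValueError(f"expected 1 to 3 positive extents, got {value!r}")
    return dims + (1,) * (3 - len(dims))


@dataclass(frozen=True)
class HierarchySketch:
    """Hypothetical stand-in for a launch config; NOT the cuda.core class."""

    grid: Dim                      # assumed: clusters per grid (blocks per grid if no cluster)
    block: Dim                     # threads per block
    cluster: Optional[Dim] = None  # blocks per cluster, only when clusters are used

    def levels(self):
        """Return (name, dim3) pairs from the outermost level inward."""
        out = [("grid", as_dim3(self.grid))]
        if self.cluster is not None:
            out.append(("cluster", as_dim3(self.cluster)))
        out.append(("block", as_dim3(self.block)))
        return out

    def total_blocks(self) -> int:
        # Multiply the extents of every level above "block".
        return prod(prod(d) for name, d in self.levels() if name != "block")

    def total_threads(self) -> int:
        # Multiply the extents of every level, including threads per block.
        return prod(prod(d) for _, d in self.levels())


cfg = HierarchySketch(grid=4, cluster=2, block=(16, 16))
print(cfg.levels())
print(cfg.total_blocks(), cfg.total_threads())  # 8 2048
```

Because every level is passed by keyword, adding or omitting `cluster` never changes what the other arguments mean. With the positional `<<<grid, block, shared_mem_bytes, stream>>>` form, the third slot is already taken by an integer, which is presumably part of the ambiguity mentioned above.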