BlockScan::ExclusiveScan
Currently, cuda.cooperative supports ExclusiveScan, but only provides the overloads for multiple data elements per thread.
We should also provide overloads for block-wide exclusive scan where each thread contributes a single element.
In particular, it would be good to start with the following:
BlockScan::ExclusiveScan: void (T, T &, ScanOp)
BlockScan::ExclusiveScan: void (T, T &, ScanOp, BlockPrefixCallbackOp &)
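To make the intended behavior of these two overloads concrete, here is a host-side reference model in plain Python. It is only a sketch of the semantics, not the cuda.cooperative API: `exclusive_scan_reference` and `RunningPrefix` are hypothetical names, the list index stands in for the thread index, and it assumes that thread 0's output is unspecified when no prefix callback seeds the scan (modeled here as `None`).

```python
import functools
import operator


def exclusive_scan_reference(inputs, scan_op, block_prefix_callback_op=None):
    """Host-side model of a block-wide exclusive scan, one item per thread.

    inputs[i] stands for the single value contributed by thread i.
    Hypothetical helper for illustration; not part of cuda.cooperative.
    """
    if not inputs:
        return []
    outputs = [None] * len(inputs)
    if block_prefix_callback_op is None:
        # No seed: thread 0 has nothing to its left, so its output stays
        # None (assumption: the value is unspecified in this case).
        running = inputs[0]
        start = 1
    else:
        # The callback receives the block-wide aggregate and returns the
        # prefix that seeds the scan; thread 0 outputs that prefix.
        block_aggregate = functools.reduce(scan_op, inputs)
        running = block_prefix_callback_op(block_aggregate)
        start = 0
    for i in range(start, len(inputs)):
        outputs[i] = running
        running = scan_op(running, inputs[i])
    return outputs


class RunningPrefix:
    """Hypothetical stateful callback that carries a total across tiles."""

    def __init__(self, running_total):
        self.running_total = running_total

    def __call__(self, block_aggregate):
        old_prefix = self.running_total
        self.running_total += block_aggregate
        return old_prefix


# Overload (T, T &, ScanOp): no seed value.
no_seed = exclusive_scan_reference([1, 2, 3, 4], operator.add)

# Overload (T, T &, ScanOp, BlockPrefixCallbackOp &): two consecutive tiles.
prefix_op = RunningPrefix(10)
tile_0 = exclusive_scan_reference([1, 2, 3, 4], operator.add, prefix_op)
tile_1 = exclusive_scan_reference([5, 5, 5, 5], operator.add, prefix_op)
```

The callback variant is what lets a block scan a sequence longer than the block size tile by tile, since the callback object keeps the running total between calls.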
Later on, we could also support:
BlockScan::ExclusiveScan: void (T, T &, ScanOp, T &)
BlockScan::ExclusiveScan: void (T, T &, T, ScanOp)
BlockScan::ExclusiveScan: void (T, T &, T, ScanOp, T &)
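The three later overloads add an initial value, a block aggregate output, or both. The following plain-Python reference model sketches what they would be expected to compute. It is not the cuda.cooperative API: `exclusive_scan_with_aggregate` is a hypothetical name, the aggregate is returned rather than written through a reference, and it assumes that the aggregate covers only the input items and excludes the initial value.

```python
import operator

# Sentinel that distinguishes "no initial value" from a real value like 0.
_NO_INITIAL = object()


def exclusive_scan_with_aggregate(inputs, scan_op, initial_value=_NO_INITIAL):
    """Host-side model returning (outputs, block_aggregate).

    inputs[i] stands for the single value contributed by thread i.
    Hypothetical helper for illustration; not part of cuda.cooperative.
    """
    outputs = [None] * len(inputs)
    running = initial_value
    aggregate = _NO_INITIAL
    for i, item in enumerate(inputs):
        if running is not _NO_INITIAL:
            # Exclusive: thread i sees everything strictly to its left.
            outputs[i] = running
            running = scan_op(running, item)
        else:
            # No initial value: thread 0's output stays None (assumed
            # unspecified), and the scan starts from its input.
            running = item
        # Assumption: the aggregate reduces the inputs only.
        aggregate = item if aggregate is _NO_INITIAL else scan_op(aggregate, item)
    return outputs, (None if aggregate is _NO_INITIAL else aggregate)


# Overload (T, T &, ScanOp, T &): aggregate, no initial value.
out_a, agg_a = exclusive_scan_with_aggregate([1, 2, 3, 4], operator.add)

# Overloads (T, T &, T, ScanOp) and (T, T &, T, ScanOp, T &): initial value,
# with the aggregate either ignored or consumed by the caller.
out_b, agg_b = exclusive_scan_with_aggregate([1, 2, 3, 4], operator.add, 100)
out_c, agg_c = exclusive_scan_with_aggregate([3, 1, 4, 1], max, 0)
```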
It might be useful to refer to #2691 as an example of how to add overloads to an existing algorithm.
tpn