-
Notifications
You must be signed in to change notification settings - Fork 73
Add sycl_khr_free_function_commands extension #921
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
slawekptak
wants to merge
56
commits into
KhronosGroup:main
from
slawekptak:khr_free_function_commands
Closed
Add sycl_khr_free_function_commands extension #921
slawekptak
wants to merge
56
commits into
KhronosGroup:main
from
slawekptak:khr_free_function_commands
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This extension provides an alternative mechanism for submitting commands to a device via free-functions that require developers to opt-in to the creation of event objects. It also proposes alternative names for several commands (e.g., launch) and simplifies some concepts (e.g., by removing the need for the nd_range class).
Previous "0 or more" wording only made sense when reductions could be optionally provided to functions like parallel_for; now that there are dedicated *_reduce functions, at least one reduction is required.
"is" is more consistent with ISO C++ wording.
Co-authored-by: Greg Lueck <[email protected]>
There is no need to constrain T here because T must be device-copyable in order to construct the accessor passed as an argument.
Renaming sycl::nd_item is not a necessary part of the API redesign for submitting work, so it should be moved to its own extension. This will also give us more time to consider the design and naming of any proposed replacement(s), including how they should interact with new functionality proposed in other KHRs.
There are currently no backends that define interop for reductions, so we can remove these functions for now. If we decide later that these functions are necessary, we can release a revision of the KHR.
Co-authored-by: Andrey Alekseenko <[email protected]>
Co-authored-by: Nikita Kornev <[email protected]>
Changes to names and cv-qualifiers resulted in inconsistent spacing.
Co-authored-by: Pablo Reble <[email protected]>
This restriction potentially improves performance by giving implementations the freedom to submit work immediately where possible.
khr_free_function_commands renames several of the old enqueue APIs. The comments added in this commit are intended to help reviewers, and will not be visible in the specification.
This reverts commit caf3b0a. After discussion, the SYCL WG decided that this was too error-prone.
A more detailed investigation of performance overheads in SYCL implementations has uncovered that the cost associated with using a handler is similar to the cost associated with returning a sycl::event. This commit removes all the handler overloads from the KHR, as a first step towards introducing an alternative design that does not depend on handler.
An instance of the requirements class represents all of the scheduling requirements that must be satisfied when submitting a command. It acts as a replacement for handler that delivers two main improvements: 1) All requirements are captured at once (by the requirements constructor), allowing the presence or absence of specific requirements to be detected at compile-time. 2) All requirements are passed as an argument to the command function, allowing the command function to enqueue work immediately. There are some existing APIs that were dependent on handler that do not yet have a requirements-based equivalent, including: - local memory - specialization constants - kernel bundles
Adding a kernel_bundle<executable> as a requirement should have the same effect as calling handler::use_kernel_bundle. handler::use_kernel_bundle cannot be used in conjunction with commands accepting a kernel object, and the command is defined as ignoring the kernel_bundle in that case. Since the new API receives the command and all requirements simultaneously, and the requirements are known at compile-time, we can instead use a Constraint to ensure that such code doesn't compile.
This is consistent with info::event_command_status. Co-authored-by: Gordon Brown <[email protected]>
Although we could limit errors to the {tracking(true), tracking(false)} case,
this would have to be deferred until runtime. Ensuring that each requirements
object contains only one tracking requirement is simpler and less error-prone.
Commands like copy, memcpy, fill, etc are not kernels and so passing a kernel_bundle as a requirement is not meaningful.
Commands like copy, memcpy, fill, etc take their arguments explicitly rather than being captured by a function, and so there is no need to inform the runtime about which accessors are used. If a command uses an accessor, it must have been passed as an argument.
Any accessor passed to a command that will run on the device must have target::device.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is a new, follow-up PR to #644, originally created by John Pennycook. All the future work related to that PR will be continued here. The reason for creating a new PR is that the PR ownership transfer is required.
This extension provides an alternative mechanism for submitting commands to a device via free-functions that require developers to opt-in to the creation of event objects.
It also proposes alternative names for several commands (e.g., launch) and simplifies some concepts (e.g., by removing the need for the nd_range class).