Skip to content

Conversation

@lbushi25
Copy link
Contributor

Implement the new enqueue functions for free function kernels that were added in #19995

@lbushi25 lbushi25 requested a review from a team as a code owner November 20, 2025 05:55
Copy link
Contributor

@cperkinsintel cperkinsintel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize we were changing this. Nice!e

// Free function kernel enqueue functions
template <auto *Func, typename... ArgsT>
void single_task(queue Q, kernel_function_s<Func> KernelFunc, ArgsT &&...Args) {
(void)KernelFunc;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(void)KernelFunc;

same below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch, the functions that take queue actually use the parameter, it's the handler ones that don't.

Comment on lines +295 to +297
submit(Q, [&](handler &CGH) {
single_task(CGH, KernelFunc, std::forward<ArgsT>(Args)...);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a submit_direct* version of this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a submit_direct* version of this? Please sync with @slawekptak to implement it properly from the start rather than create more future work for him.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, there is no submit_direct* version of this in the spec.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can have it in detail:: still. Also, queue::* itself can act as submit_direct.

Comment on lines +304 to +306
queue Q = CGH.getQueue();
sycl::kernel_bundle Bndl =
get_kernel_bundle<Func, sycl::bundle_state::executable>(Q.get_context());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This creates and destroys two std::shared_ptrs for almost no reason. IMO, we should fix this getQueue() hack while we're in an ABI breaking window. Maybe by changing handler_impl to store a reference to the sycl::queue it was created with? handler_impl::MQueueOrGraph isn't used directly outside a few getters, so the change should be very simple.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate a bit please?
What shared pointers are you referring to?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::shared_ptr<detail::queue_impl> impl;
queue(std::shared_ptr<detail::queue_impl> impl) : impl(impl) {}

and similar for sycl::context.

Comment on lines +150 to +158
template <auto *> struct kernel_function_s;
template <auto *Func, typename... Args>
void single_task(handler &, kernel_function_s<Func>, Args &&...);
template <auto *Func, int Dimensions, typename... Args>
void nd_launch(handler &, nd_range<Dimensions>, kernel_function_s<Func>,
Args &&...);
template <auto *Func, int Dimensions, typename Properties, typename... Args>
void nd_launch(handler &, launch_config<nd_range<Dimensions>, Properties>,
kernel_function_s<Func>, Args &&...);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is all of that just for handler::getQueue()? Can you extend

namespace detail {
class HandlerAccess {
public:
static void internalProfilingTagImpl(handler &Handler) {
Handler.internalProfilingTagImpl();
}
template <typename RangeT, typename PropertiesT>
static void parallelForImpl(handler &Handler, RangeT Range, PropertiesT Props,
kernel Kernel) {
Handler.parallel_for_impl(Range, Props, Kernel);
}
static void swap(handler &LHS, handler &RHS) {
std::swap(LHS.implOwner, RHS.implOwner);
std::swap(LHS.impl, RHS.impl);
std::swap(LHS.MLocalAccStorage, RHS.MLocalAccStorage);
std::swap(LHS.MStreamStorage, RHS.MStreamStorage);
std::swap(LHS.MKernelName, RHS.MKernelName);
std::swap(LHS.MKernel, RHS.MKernel);
std::swap(LHS.MSrcPtr, RHS.MSrcPtr);
std::swap(LHS.MDstPtr, RHS.MDstPtr);
std::swap(LHS.MLength, RHS.MLength);
std::swap(LHS.MPattern, RHS.MPattern);
std::swap(LHS.MHostKernel, RHS.MHostKernel);
std::swap(LHS.MCodeLoc, RHS.MCodeLoc);
}
// pre/postProcess are used only for reductions right now, but the
// abstractions they provide aren't reduction-specific. The main problem they
// solve is
//
// # User code
// q.submit([&](handler &cgh) {
// set_dependencies(cgh);
// enqueue_whatever(cgh);
// }); // single submission
//
// that needs to be implemented as multiple enqueues involving
// pre-/post-processing internally. SYCL prohibits recursive submits from
// inside control group function object (lambda above) so we need some
// internal interface to implement that.
__SYCL_EXPORT static void preProcess(handler &CGH, type_erased_cgfo_ty F);
__SYCL_EXPORT static void postProcess(handler &CGH, type_erased_cgfo_ty F);
template <class FunctorTy>
static void preProcess(handler &CGH, FunctorTy &Func) {
preProcess(CGH, type_erased_cgfo_ty{Func});
}
template <class FunctorTy>
static void postProcess(handler &CGH, FunctorTy &Func) {
postProcess(CGH, type_erased_cgfo_ty{Func});
}
};
} // namespace detail
instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that has been added to access getQueue. Now that you've brought HandlerAccess to my attention, it seems like a better solution so I'll try to migrate it over there instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants