[SYCL] Add platform enumeration and info query using liboffload #166927
This is part of the SYCL support upstreaming effort. The relevant RFCs can be found here:
https://discourse.llvm.org/t/rfc-add-full-support-for-the-sycl-programming-model/74080
https://discourse.llvm.org/t/rfc-sycl-runtime-upstreaming/74479
The SYCL runtime is device-agnostic and uses liboffload for offloading to GPU. This commit adds a dependency on liboffload, an implementation of the platform::get_platforms, platform::get_backend and platform::get_info methods, and an initial implementation of the sycl-ls tool for manual testing of the added functionality.
Plan for next PR:
device/context impl, rest of platform
test infrastructure (depends on L0 liboffload plugin CI, which is a joint effort)
ABI tests
@tahonermann, @dvrogozh, @asudarsa, @aelovikov-intel, @sergey-semenov, @bader, @againull, @YuriPlyakhin, @vinser52 FYI, published for review.
namespace detail {

template <class Impl, class SyclObject> class ObjBase {
I think it would make more sense to do this:
// Helper to implement classes that must obey SYCL's "common reference semantics"
template <typename Impl, typename SyclObject> class ObjBase;
template <typename Impl, typename SyclObject> class ObjBase<Impl &, SyclObject> {
  /* your impl almost unmodified */
};
That way, when we start implementing other classes that would require lifetime management, we'd add another specialization like
template <typename Impl, typename SyclObject> class ObjBase<std::shared_ptr<Impl>, SyclObject> {
  /* ... */
};

  template <class Obj>
  friend Obj createSyclObjFromImpl(
      std::add_lvalue_reference_t<typename Obj::ImplType> ImplObj);
};
std::hash support should probably go here as well. We will require some boilerplate code to provide an actual specialization because C++20's
template <class T>
  requires(is_sycl_common_reference_semantics_class_v<T>)
struct std::hash<T> {
  // sycl_obj_hash impl inlined here.
};
isn't available to us, but the implementation itself can be done generically here.
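To make the two suggestions above concrete, here is a minimal sketch, not the PR's actual code. It assumes a reference-storing `ObjBase` specialization that exposes a generic `hashValue()` member, plus a small macro that supplies the per-class `std::hash` boilerplate. The names `sketch`, `hashValue`, `SKETCH_SPECIALIZE_STD_HASH` and the empty `platform_impl` are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <type_traits>

namespace sketch {

// Primary template is only declared; each storage kind gets a specialization.
template <typename Impl, typename SyclObject> class ObjBase;

// Non-owning flavour for impls that outlive every handle pointing at them.
template <typename Impl, typename SyclObject>
class ObjBase<Impl &, SyclObject> {
public:
  using ImplType = Impl;

  // Generic hash: two handles to the same impl hash identically.
  std::size_t hashValue() const noexcept {
    return std::hash<const Impl *>{}(MImpl);
  }

protected:
  explicit ObjBase(Impl &ImplObj) noexcept : MImpl(&ImplObj) {}
  Impl *MImpl;
};

class platform_impl {}; // hypothetical stand-in for the real impl class

class platform : public ObjBase<platform_impl &, platform> {
public:
  explicit platform(platform_impl &ImplObj) : ObjBase(ImplObj) {}
};

} // namespace sketch

// Per-class boilerplate: without C++20 constraints, std::hash has to be
// specialized once per SYCL class, but the body stays generic.
#define SKETCH_SPECIALIZE_STD_HASH(Type)                                       \
  namespace std {                                                              \
  template <> struct hash<Type> {                                              \
    size_t operator()(const Type &Obj) const noexcept {                        \
      return Obj.hashValue();                                                  \
    }                                                                          \
  };                                                                           \
  }

SKETCH_SPECIALIZE_STD_HASH(sketch::platform)

static_assert(std::is_default_constructible_v<std::hash<sketch::platform>>,
              "std::hash is enabled for the handle type");
```

A `std::shared_ptr<Impl>` specialization would follow the same shape, with the hash taken over the result of `get()`.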
// Exceptions must be noexcept copy constructible, so cannot use std::string
// directly.
std::shared_ptr<std::string> MMessage;
This is a great opportunity to start a conversation with the libcxx developers (@ldionne to start that, I think) on how/if we can share some of their implementation details. __libcpp_refstring would be perfect here.
Another potential opportunity (in a not so distant future PR) would be related to the implementation of std::shared_ptr. In a nutshell, some of the SYCL objects would probably be implemented (simplified) like this:
class event_impl; // defined inside libsycl.so, not exposed to the public headers
class event {
public:
  /* ... */
private:
  std::shared_ptr<event_impl> pImpl;
};
Ideally, we'd want to inherit event_impl from std::enable_shared_from_this. The problem we saw is that the inheritance by itself slowed down construction of event_impl: even if we know that we always create them via make_shared<event_impl>, the implementation still had to initialize the std::enable_shared_from_this subobject with atomic operations, resulting in a measurable overhead. If we could somehow use most of libc++'s implementation of these, but with a way to communicate extra guarantees about how those objects are created so that we don't pay the price of these initialization atomics, that would be great, but I'm not sure if that's possible/worth the maintenance effort.
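For context on why the message is held through a std::shared_ptr in the first place, the following sketch contrasts the two layouts using only the standard library. The class names are hypothetical and this is not the PR's exception class. The static_asserts show the property the code comment refers to.

```cpp
#include <exception>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {

// Message held by shared_ptr: copying is a reference-count bump and
// cannot throw.
class exception : public std::exception {
public:
  explicit exception(const std::string &Msg)
      : MMessage(std::make_shared<std::string>(Msg)) {}
  const char *what() const noexcept override { return MMessage->c_str(); }

private:
  std::shared_ptr<std::string> MMessage;
};

// Counter-example: a std::string member may allocate (and throw) on copy.
class naive_exception : public std::exception {
public:
  explicit naive_exception(std::string Msg) : MMessage(std::move(Msg)) {}
  const char *what() const noexcept override { return MMessage.c_str(); }

private:
  std::string MMessage;
};

static_assert(std::is_nothrow_copy_constructible_v<exception>,
              "shared_ptr member keeps the copy constructor noexcept");
static_assert(!std::is_nothrow_copy_constructible_v<naive_exception>,
              "std::string member makes the copy constructor potentially throwing");

} // namespace sketch
```

A reference-counted string such as the one libc++ uses internally would give the same guarantee with one allocation fewer, which is the sharing opportunity raised above.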
template <typename T> struct is_platform_info_desc : std::false_type {};

#define __SYCL_PARAM_TRAITS_SPEC(DescType, Desc, ReturnT, OffloadCode)         \
  template <>                                                                  \
  struct is_##DescType##_info_desc<info::DescType::Desc> : std::true_type {    \
    using return_type = info::DescType::Desc::return_type;                     \
  };
I grepped our downstream implementation; it seems we use the ::value part in just a few places inside libsycl.so for static_asserts. I think we can change to
template <typename Desc> struct enable_platform_info_return {};
#define __SYCL_PARAM_TRAITS_SPEC(DescType, Desc, ReturnT, OffloadCode) \
  template <> \
  struct enable_##DescType##_info_return<info::DescType::Desc> { \
    using return_type = info::DescType::Desc::return_type; \
  };
followed by
template <typename Desc> using enable_platform_info_return_t =
    typename enable_platform_info_return<Desc>::return_type;
and then introduce some helper inside the library to use for the static_asserts. Something like https://godbolt.org/z/vocr575q4:
template <typename Desc, template <typename> typename Enabler, typename = void>
struct is_enabled : std::false_type {};
template <typename Desc, template <typename> typename Enabler>
struct is_enabled<Desc, Enabler, std::void_t<typename Enabler<Desc>::return_type>> : std::true_type {};
Also, we include platform.def specifically, do we really need macro-magic around ##DescType##? Can't we use "platform" directly?
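Put together, the proposal could look roughly like the sketch below. It is self-contained and uses two made-up descriptors (`info::platform::name` and `info::platform::version` in a hypothetical `sketch` namespace) and spells "platform" directly in the macro, as the last question suggests. The detection keys on the nested `return_type`, since the empty primary template is itself a perfectly valid type.

```cpp
#include <string>
#include <type_traits>

namespace sketch {

// Hypothetical descriptors standing in for the entries of platform.def.
namespace info::platform {
struct name { using return_type = std::string; };
struct version { using return_type = std::string; };
} // namespace info::platform

// Primary template has no return_type, so unknown descriptors SFINAE out.
template <typename Desc> struct enable_platform_info_return {};

#define SKETCH_PLATFORM_INFO_SPEC(Desc)                                        \
  template <> struct enable_platform_info_return<info::platform::Desc> {      \
    using return_type = info::platform::Desc::return_type;                     \
  };
SKETCH_PLATFORM_INFO_SPEC(name)
SKETCH_PLATFORM_INFO_SPEC(version)
#undef SKETCH_PLATFORM_INFO_SPEC

template <typename Desc>
using enable_platform_info_return_t =
    typename enable_platform_info_return<Desc>::return_type;

// Library-internal helper for static_asserts: true only when the enabler
// exposes a nested return_type for this descriptor.
template <typename Desc, template <typename> typename Enabler, typename = void>
struct is_enabled : std::false_type {};
template <typename Desc, template <typename> typename Enabler>
struct is_enabled<Desc, Enabler,
                  std::void_t<typename Enabler<Desc>::return_type>>
    : std::true_type {};

struct not_a_descriptor {};

static_assert(is_enabled<info::platform::name,
                         enable_platform_info_return>::value,
              "registered descriptors are enabled");
static_assert(!is_enabled<not_a_descriptor,
                          enable_platform_info_return>::value,
              "anything else is rejected");

} // namespace sketch
```

With this shape, a `get_info` declared as returning `enable_platform_info_return_t<Param>` simply drops out of overload resolution for an unsupported `Param`.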
typename backend_traits<Backend>::template return_type<SYCLObjectT>;

namespace detail {
inline std::string_view get_backend_name(const backend &Backend) {
I think we need to use something like _LIBCPP_HIDE_FROM_ABI here, if I understand the idea behind it correctly.
template <typename Param>
typename detail::is_platform_info_desc<Param>::return_type get_info() const {
  return get_info_impl<Param>();
}
Probably needs _LIBCPP_HIDE_FROM_ABI as well.
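As a rough illustration of the idea, the sketch below marks a header-only helper with a hypothetical `SKETCH_HIDE_FROM_ABI` macro. It only approximates what `_LIBCPP_HIDE_FROM_ABI` does by applying hidden visibility on GCC-compatible, non-Windows toolchains. The real libc++ macro carries additional attributes, and the `backend` enumerators here are invented for the example. The function is made `constexpr` purely so the behaviour can be checked at compile time.

```cpp
#include <string_view>

// Hypothetical approximation: keep inline helpers out of the shared
// library's exported symbol set so they never become part of its ABI.
#if defined(__GNUC__) && !defined(_WIN32)
#define SKETCH_HIDE_FROM_ABI __attribute__((__visibility__("hidden")))
#else
#define SKETCH_HIDE_FROM_ABI
#endif

namespace sketch {

enum class backend { opencl, level_zero, cuda, hip }; // invented enumerators

namespace detail {
SKETCH_HIDE_FROM_ABI constexpr std::string_view
get_backend_name(backend Backend) {
  switch (Backend) {
  case backend::opencl:
    return "opencl";
  case backend::level_zero:
    return "level_zero";
  case backend::cuda:
    return "cuda";
  case backend::hip:
    return "hip";
  }
  return "unknown";
}
} // namespace detail

} // namespace sketch
```

The same annotation would apply to the inline `get_info` template, since it is also defined entirely in the public header.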
OPTIONAL)
endif()

target_compile_options(${LIB_OBJ_NAME} PUBLIC /EHsc)
I think we need /Zc:dllexportInlines- here (see llvm-project/clang/docs/UsersManual.rst, line 5047 in 9cca883):
/Zc:dllexportInlines- Don't dllexport/dllimport inline member functions of dllexport/import classes
We couldn't do that downstream because there we support building with MSVC (and aren't using LLVM_ENABLE_RUNTIMES), which doesn't support that option, but in this upstream version we know that we're building with a freshly built clang and the option is available to us.
void shutdown() {
  // No error reporting in shutdown
  std::ignore = olShutDown();
}

#ifdef _WIN32
extern "C" _LIBSYCL_EXPORT BOOL WINAPI DllMain(HINSTANCE hinstDLL,
                                               DWORD fdwReason,
                                               LPVOID lpReserved) {
  // Perform actions based on the reason for calling.
  switch (fdwReason) {
  case DLL_PROCESS_DETACH:
    try {
      shutdown();
    } catch (std::exception &e) {
      // report
    }

    break;
  case DLL_PROCESS_ATTACH:
    break;
  case DLL_THREAD_ATTACH:
    break;
  case DLL_THREAD_DETACH:
    break;
  }
  return TRUE; // Successful DLL_PROCESS_ATTACH.
}
#else
// Setting low priority on destructor ensures it runs after all other global
// destructors. Priorities 0-100 are reserved by the compiler. The priority
// value 110 allows SYCL users to run their destructors after runtime library
// deinitialization.
__attribute__((destructor(110))) static void syclUnload() { shutdown(); }
#endif
Do we need that in this PR? I thought it's related to GlobalHandler that we decided to introduce later.
platform_impl *
platform_impl::getOrMakePlatformImpl(ol_platform_handle_t Platform,
                                     size_t PlatformIndex) {
  const std::lock_guard<std::mutex> Guard(getPlatformMapMutex());
I think just a simple singleton without any mutexes or getOrMakePlatformImpl would suffice for the platforms and that this particular interface is just some legacy from our downstream implementation. We should be able to create full platform/device hierarchy that wouldn't change later, or is it (immutability) not the case with liboffload?
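If the platform list really is immutable after discovery, the mutex-free variant could be as small as the sketch below. It relies only on the C++11 guarantee that function-local statics are initialized once, thread-safely. `discoverPlatforms` is a hypothetical stand-in for the liboffload enumeration and returns stub data here, and the `platform_impl` layout is invented.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

namespace sketch {

struct platform_impl {
  std::size_t Index;
  std::string Name;
};

// Hypothetical stand-in: real code would enumerate platforms via liboffload.
inline std::vector<platform_impl> discoverPlatforms() {
  std::vector<platform_impl> Found;
  Found.push_back(platform_impl{0, "stub platform A"});
  Found.push_back(platform_impl{1, "stub platform B"});
  return Found;
}

// Built on first use and never modified afterwards, so readers need no lock.
inline const std::vector<platform_impl> &getPlatforms() {
  static const std::vector<platform_impl> Platforms = discoverPlatforms();
  return Platforms;
}

// Plain indexed lookup replaces the locked get-or-create path.
inline const platform_impl &getPlatformImpl(std::size_t Index) {
  return getPlatforms().at(Index);
}

static_assert(
    std::is_const_v<std::remove_reference_t<decltype(getPlatforms())>>,
    "callers only ever get read access to the registry");

} // namespace sketch
```

This only holds if liboffload never adds or removes platforms after initialization, which is exactly the open question in the comment above.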
/// Returns all SYCL platforms from all backends that are available in the
/// system.
static std::vector<platform> getPlatforms();
That is very subjective, but I think all *_impl classes should only operate with *_impls and not with user-visible SYCL objects. As such, I think it should be
const std::vector<platform_impl> &
// or even range/view of platform_impls
platform_impl::getPlatforms() { ... }
std::vector<platform> platform::get_platforms() { ... }
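A sketch of that layering, with simplified hypothetical classes rather than the PR's real ones: the impl layer exposes only `platform_impl` objects, and the user-facing `platform` class is the single place where impls are wrapped into handles.

```cpp
#include <type_traits>
#include <vector>

namespace sketch {

class platform_impl {
public:
  explicit platform_impl(int Id) : MId(Id) {}
  int id() const { return MId; }

  // Impl layer: hands out impls only, never user-visible SYCL objects.
  static const std::vector<platform_impl> &getPlatforms() {
    static const std::vector<platform_impl> Platforms = [] {
      std::vector<platform_impl> Found;
      Found.emplace_back(0); // stub entries; real code would enumerate here
      Found.emplace_back(1);
      return Found;
    }();
    return Platforms;
  }

private:
  int MId;
};

class platform {
public:
  // User-facing layer: wraps each impl into a lightweight handle.
  static std::vector<platform> get_platforms() {
    std::vector<platform> Result;
    for (const platform_impl &Impl : platform_impl::getPlatforms())
      Result.push_back(platform(Impl));
    return Result;
  }

  friend bool operator==(const platform &L, const platform &R) {
    return L.MImpl == R.MImpl;
  }

private:
  // Private: only the library itself turns an impl into a handle.
  explicit platform(const platform_impl &Impl) : MImpl(&Impl) {}
  const platform_impl *MImpl;
};

static_assert(std::is_trivially_copyable_v<platform>,
              "handles stay cheap to copy");

} // namespace sketch
```

Returning a range or view of impls instead of a `const std::vector &` would fit the same split. The vector is used here only to keep the example short.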