Skip to content

Conversation

@steffenlarsen
Copy link
Contributor

This commit adds a new extension for inter-process communicable SYCL object handles. As part of the initial version of this extension, only inter-process communicable memory is exposed.

@steffenlarsen steffenlarsen force-pushed the ext_oneapi_inter_process_communication branch from cae8ee3 to 14415fe Compare September 10, 2025 08:16
@steffenlarsen steffenlarsen force-pushed the ext_oneapi_inter_process_communication branch from 14415fe to 89f495c Compare September 10, 2025 11:03
@steffenlarsen steffenlarsen force-pushed the ext_oneapi_inter_process_communication branch from 89f495c to 5f443ab Compare September 10, 2025 12:03
@steffenlarsen steffenlarsen force-pushed the ext_oneapi_inter_process_communication branch from 5f443ab to b7d2052 Compare September 10, 2025 12:53
@steffenlarsen steffenlarsen force-pushed the ext_oneapi_inter_process_communication branch from b7d2052 to 07f4355 Compare September 10, 2025 15:57
@steffenlarsen steffenlarsen force-pushed the ext_oneapi_inter_process_communication branch from 07f4355 to 7364b75 Compare September 11, 2025 05:40
This commit adds a new extension for inter-process communicable SYCL
object handles. As part of the initial version of this extension, only
inter-process communicable memory is exposed.

Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
* An exception with the `errc::feature_not_supported` error code if device
`dev` does not have `aspect::ext_oneapi_ipc_memory`.
* An exception with the `errc::invalid` error code if the handle data
`handle_data` has an unexpected number of bytes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the number of bytes is correct, but the content is garbage? Can we broaden this exception like this:

  • An exception with the errc::invalid error code if the handle data handle_data does not represent a valid IPC handle for USM memory on this host system.

That would also cover the case when the handle had the wrong number of bytes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am reluctant to promise an error code if handle data is corrupt. The byte size is something we can check at a SYCL/UR level, but the validity of the data is up to UMF to decide and the API documentation doesn't specify which error is returned if the data is off. Maybe a precondition?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should push this requirement down to the UMF and down to Level Zero if necessary. They should be able to define an error code for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will open an issue against UMF, but I think it might be tricky. What happens if the data looks correct to UMF, but using it causes UB or invalid access in the implementation? I suppose that's something they can look into though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UMF shouldn't look at the handle content at all. It should just pass it to the underlying driver API, and that API should return an error code if the handle content is invalid. This assumes that each driver is able to diagnose an error in such a case, but that seems quite reasonable to me. For example, cudaIpcOpenMemHandle is defined to return cudaErrorInvalidResourceHandle in this case. I don't see an error code listed for Level Zero, but that seems like a bug. They should be able to return an error code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may be some wrapper information in the UMF IPC handle, but I don't know the exact details. Maybe @vinser52 has some insight into this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Friendly ping, @vinser52 .

Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants