TedThemistokleous
Collaborator

…Tensors (microsoft#16787)

Add compile guards that gate the following changes on MIGRAPHX_STREAM_SYNC:

  • Remove the excess hipStreamSynchronize to the null stream on CopyTensor calls
  • Add a proper stream-synchronized CopyTensorAsync call for the DeviceToHost case

Without this change, subsequent CopyTensorAsync() calls fail on cards that don't use pinned memory, which causes hipMemcpy() calls to occur before certain kernel operations have run.
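As a rough illustration of the guarded behavior described above, here is a minimal sketch that compiles without ROCm. Everything in it is an assumption made for illustration: `FakeStream`, `FakeMemcpy`, `FakeMemcpyAsync`, `FakeStreamSynchronize`, `CopyTensorSketch`, and `CopyTensorAsyncSketch` are hypothetical stand-ins, not the code in this PR. In a real build they would correspond to `hipMemcpy`, `hipMemcpyAsync`, and `hipStreamSynchronize`. The sketch also assumes that "stream synchronized" means enqueueing the device-to-host copy on the caller's stream and then synchronizing that stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-ins for the HIP runtime (not real HIP calls).
using FakeStream = int;                // stands in for hipStream_t
constexpr FakeStream kNullStream = 0;  // stands in for the null (default) stream

enum class CopyKind { HostToDevice, DeviceToHost, DeviceToDevice };

// Records the order of simulated runtime calls so the control flow is visible.
std::vector<std::string> g_calls;

void FakeMemcpy(void* dst, const void* src, std::size_t n) {
  std::memcpy(dst, src, n);
  g_calls.push_back("memcpy");
}

void FakeMemcpyAsync(void* dst, const void* src, std::size_t n, FakeStream /*stream*/) {
  std::memcpy(dst, src, n);
  g_calls.push_back("memcpy_async");
}

void FakeStreamSynchronize(FakeStream stream) {
  g_calls.push_back(stream == kNullStream ? "sync_null" : "sync_stream");
}

// Mimic a build with the stream-sync path enabled.
#ifndef MIGRAPHX_STREAM_SYNC
#define MIGRAPHX_STREAM_SYNC 1
#endif

// Blocking copy: with the guard defined, the extra null-stream sync is skipped.
void CopyTensorSketch(void* dst, const void* src, std::size_t n) {
  assert(dst != nullptr && src != nullptr);
  FakeMemcpy(dst, src, n);
#ifndef MIGRAPHX_STREAM_SYNC
  FakeStreamSynchronize(kNullStream);  // legacy path only
#endif
}

// Async copy: with the guard defined, a device-to-host copy is enqueued on the
// caller's stream and that stream is synchronized, so the copy is ordered
// after work already queued on the stream (an assumption of this sketch).
void CopyTensorAsyncSketch(void* dst, const void* src, std::size_t n,
                           CopyKind kind, FakeStream stream) {
  assert(dst != nullptr && src != nullptr);
  if (kind == CopyKind::DeviceToHost) {
#ifdef MIGRAPHX_STREAM_SYNC
    FakeMemcpyAsync(dst, src, n, stream);
    FakeStreamSynchronize(stream);
#else
    FakeMemcpy(dst, src, n);  // legacy path: blocking copy, not tied to the stream
#endif
    return;
  }
  FakeMemcpyAsync(dst, src, n, stream);
}
```

With the guard defined, a call to `CopyTensorSketch` records only the copy, and a `DeviceToHost` call to `CopyTensorAsyncSketch` records an async copy followed by a sync on the caller's stream rather than the null stream.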

![image](https://github.com/microsoft/onnxruntime/assets/107195283/4915c18a-fb2d-40c9-a50e-a7c6613c324b)

becomes

![image](https://github.com/microsoft/onnxruntime/assets/107195283/f661acf4-e2af-4c9a-b26a-30fca339cf1d)

---------

Co-authored-by: Ted Themistokleous <[email protected]>
@TedThemistokleous added the Bug (Something isn't working) label on Jul 26, 2023
@groenenboomj force-pushed the rocm5.7_internal_testing branch from b29ddea to 6c56542 on July 27, 2023 20:17
@TedThemistokleous
Collaborator Author

@groenenboomj do we still need this, since it's 5.7?
