Fix `sub_materialize` for GPU arrays #261
Open
Currently, `sub_materialize` (through `sub_materialize_axes`) falls back to materializing on the CPU. This PR generalizes that logic by determining the output destination with `similar`, which supports non-`Array` types like GPU arrays. As a stand-in for other GPU arrays, I test this using `JLArrays.JLArray`, a reference implementation of the GPUArrays.jl interface that runs on the CPU.

An alternative design would be to define memory layouts for GPU arrays (i.e. #9), which would allow more customizability for GPU array backends. However, I think it is helpful to have fallbacks that "just work" when reasonable parts of the Base `AbstractArray` interface are implemented.
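A minimal sketch of the idea (this is an illustration of the approach, not the exact PR diff; the function name `sub_materialize_axes_sketch` is hypothetical):

```julia
# Sketch: instead of always allocating an `Array` (which forces the
# result onto the CPU), let `similar` on the parent array pick the
# destination type, so a view of a GPU array materializes on the GPU.
function sub_materialize_axes_sketch(V::SubArray, ax)
    # `similar` on the parent preserves the array type (e.g. a
    # GPUArrays-backed type), so `dest` lives wherever the parent lives.
    dest = similar(parent(V), eltype(V), ax)
    copyto!(dest, V)
    return dest
end

A = collect(reshape(1.0:12.0, 3, 4))
V = view(A, 1:2, 2:3)
dest = sub_materialize_axes_sketch(V, axes(V))
```

For a plain `Array` parent this behaves like the old CPU fallback, but for a `JLArray` (or a real GPU array) `similar(parent(V), ...)` returns an array of the same backend, so the materialized block stays on the device.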
I hit this issue because I was testing out `BlockArrays.BlockedArray` wrapping a GPU array and noticed that calling `A[Block(1, 1)]` to access a block instantiated the block on the CPU. This PR fixes that issue.