While debugging another issue on our systems, I realized that the btl/self component is not accelerator aware. Right now, in a send-to-self operation from GPU memory to GPU memory, the data is first copied to host memory and then back to GPU memory. We should probably fix that and perform a direct accelerator.memcpy() operation instead, at least for contiguous buffers.
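To make the idea concrete, here is a rough sketch of the dispatch I have in mind. It is not Open MPI code: every name in it (`fake_device_mem`, `is_device_ptr`, `copy_direct`, `copy_via_host`, `self_copy_contiguous`) is hypothetical, and "device" memory is simulated with a plain host array so the snippet compiles without a GPU. In a real patch the address check and the direct copy would presumably go through the accelerator framework.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for device memory: a host array pretending to be
 * GPU memory. None of these names are real Open MPI symbols. */
unsigned char fake_device_mem[1024];
int direct_copies = 0;   /* copies done "device to device" */
int fallback_copies = 0; /* copies staged through a host bounce buffer */

/* Assumption: a real implementation would ask the accelerator framework
 * whether the address is device memory; here we only do a range check. */
bool is_device_ptr(const void *p)
{
    uintptr_t a = (uintptr_t) p;
    uintptr_t lo = (uintptr_t) fake_device_mem;
    return a >= lo && a < lo + sizeof(fake_device_mem);
}

/* Stand-in for a direct accelerator copy (proposed behavior). */
void copy_direct(void *dst, const void *src, size_t len)
{
    memmove(dst, src, len);
    direct_copies++;
}

/* Mirrors the behavior described above: data goes to a host buffer first
 * and is then copied to the destination, chunk by chunk. */
void copy_via_host(void *dst, const void *src, size_t len)
{
    unsigned char bounce[64];
    const unsigned char *s = src;
    unsigned char *d = dst;
    while (len > 0) {
        size_t n = len < sizeof(bounce) ? len : sizeof(bounce);
        memcpy(bounce, s, n); /* source -> host bounce buffer */
        memcpy(d, bounce, n); /* host bounce buffer -> destination */
        s += n;
        d += n;
        len -= n;
    }
    fallback_copies++;
}

/* Send-to-self for a contiguous buffer: take the direct path only when
 * both ends are device memory, otherwise keep the staged path. */
void self_copy_contiguous(void *dst, const void *src, size_t len)
{
    assert(len == 0 || (dst != NULL && src != NULL));
    if (is_device_ptr(dst) && is_device_ptr(src)) {
        copy_direct(dst, src, len);
    } else {
        copy_via_host(dst, src, len);
    }
}
```

Only the contiguous case is covered here; non-contiguous datatypes would still need packing and are left out of the sketch on purpose.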