Open
Hello,
I'm trying to use the grouped GEMM code in my project, and I noticed that the workspace is allocated again on every call (torch::Tensor workspace = torch::empty(workspace_size, options);). That seems unnecessary, since a CUTLASS workspace can be reused across calls.
The repeated allocation also seems to hurt performance when the function is called frequently, for example across many MoE layers, or when the M x N x K problem size is large. Has anyone measured the effect of this?
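To make the suggestion concrete, here is a minimal sketch of the reuse pattern I have in mind. It is not the project's code: `WorkspaceCache` is a hypothetical helper, and a plain host byte array stands in for the tensor returned by torch::empty so that the example compiles without libtorch. The idea is only that the buffer is kept alive between calls and reallocated when a larger size is requested.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical helper (not part of the repository): keeps one scratch buffer
// alive between calls and reallocates only when a larger size is requested.
class WorkspaceCache {
public:
    // Returns a pointer to at least `bytes` bytes of scratch space.
    // The contents are not preserved when the buffer grows.
    std::uint8_t* get(std::size_t bytes) {
        if (bytes > capacity_) {
            // Stand-in for torch::empty(workspace_size, options): in real code
            // this would allocate a device tensor that is stored in the cache.
            buffer_ = std::make_unique<std::uint8_t[]>(bytes);
            capacity_ = bytes;
            ++allocations_;
        }
        return buffer_.get();
    }

    // Current size of the cached buffer in bytes.
    std::size_t capacity() const { return capacity_; }

    // Number of times the buffer was (re)allocated, for illustration only.
    std::size_t allocations() const { return allocations_; }

private:
    std::unique_ptr<std::uint8_t[]> buffer_;
    std::size_t capacity_ = 0;
    std::size_t allocations_ = 0;
};
```

With this pattern, a call that needs 1024 bytes followed by one that needs 512 bytes allocates once and returns the same pointer both times. A real version would presumably need one cached buffer per device, and would have to make sure two kernels running concurrently on different streams never share the same workspace.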