pralay-das
@pralay-das pralay-das commented Oct 9, 2025

feat:

  • Support fused RoPE in flash attention
  • Compute RoPE on GMEM (global memory) data, for both reads and writes

Used PR #498 as a reference for chunk prefill.
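
For readers unfamiliar with the operation, here is a minimal host-side sketch of what applying RoPE to data in a memory buffer looks like. It is not the kernel from this PR: `apply_rope_inplace` is a hypothetical helper, the plain `float*` buffer only stands in for GMEM, and the interleaved pairing `(x[2i], x[2i+1])` and the base of 10000 are assumptions (some implementations pair element `i` with `i + head_dim/2` instead).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from this PR): rotates one head vector in place.
// `data` stands in for a global-memory buffer holding `head_dim` floats.
// Assumptions: interleaved pairing (data[i], data[i + 1]) and base 10000.
void apply_rope_inplace(float* data, std::size_t head_dim, std::size_t position,
                        float base = 10000.0f) {
    for (std::size_t i = 0; i + 1 < head_dim; i += 2) {
        // Inverse frequency for this pair: base^(-i / head_dim), i = 0, 2, 4, ...
        float inv_freq = std::pow(base, -static_cast<float>(i) /
                                            static_cast<float>(head_dim));
        float angle = static_cast<float>(position) * inv_freq;
        float c = std::cos(angle);
        float s = std::sin(angle);

        // Read the pair from the buffer, rotate it by `angle`, write it back.
        float x0 = data[i];
        float x1 = data[i + 1];
        data[i] = x0 * c - x1 * s;
        data[i + 1] = x0 * s + x1 * c;
    }
}
```

Calling `apply_rope_inplace(q.data(), q.size(), pos)` on a query or key vector `q` at token position `pos` rotates each pair by a position-dependent angle; position 0 leaves the vector unchanged. A fused kernel would perform the same read, rotate, and write steps as part of the attention pipeline rather than as a separate pass.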

taozha2 and others added 12 commits October 9, 2025 10:38
This change imports `SYCLCompat` into the cutlass-sycl repo as `compat`.
Previous dependencies on `syclcompat` are changed to `compat`.
This PR also fixes some `SYCLCompat` failures in oneAPI 2025.2.

---------

Co-authored-by: Roland Schulz <[email protected]>
1. This version computes RoPE on GMEM data
@pralay-das pralay-das force-pushed the dev/pralay/chunk_prefill_rope_on_gmem branch from 2b3344a to ce0bbf2 Compare October 9, 2025 10:39
@pralay-das pralay-das changed the title from "[PYTORCHDGQ-6865] Added support for RoPE on chunk prefill" to "[PYTORCHDGQ-6865] Added support for RoPE on chunk prefill [WIP]" Oct 9, 2025
5 participants