@cfgfung cfgfung commented Oct 6, 2025

A repo for upstreaming the SDPA forward pass to PyTorch.

Features:

  • BF16/FP16
  • Now supports BSHD layout
  • Outputs LSE (log-sum-exp) for training
  • Custom softmax scale
  • Causal masking (is_causal)
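
For reference, the semantics of the features listed above can be sketched in plain Python. This is a naive, unoptimized illustration and not the kernel in this PR: the function name `sdpa_fwd_reference`, the nested-list inputs, and the `[B][H][S_q]` shape chosen for the LSE output are all assumptions made for the example. The default scale of `1/sqrt(D)` and the top-left-aligned causal mask follow the documented behaviour of `torch.nn.functional.scaled_dot_product_attention`.

```python
import math


def sdpa_fwd_reference(q, k, v, scale=None, is_causal=False):
    """Naive SDPA forward over BSHD-shaped nested lists (hypothetical helper).

    q: [B][S_q][H][D]; k and v: [B][S_kv][H][D].
    Returns (out, lse) with out: [B][S_q][H][D] and lse: [B][H][S_q].
    """
    B, S_q, H, D = len(q), len(q[0]), len(q[0][0]), len(q[0][0][0])
    S_kv = len(k[0])
    if scale is None:
        scale = 1.0 / math.sqrt(D)  # default softmax scale
    out = [[[[0.0] * D for _ in range(H)] for _ in range(S_q)] for _ in range(B)]
    lse = [[[0.0] * S_q for _ in range(H)] for _ in range(B)]
    for b in range(B):
        for h in range(H):
            for i in range(S_q):
                # Causal mask (top-left aligned): query i attends to keys 0..i.
                n_keys = min(i + 1, S_kv) if is_causal else S_kv
                scores = [
                    scale * sum(q[b][i][h][d] * k[b][j][h][d] for d in range(D))
                    for j in range(n_keys)
                ]
                m = max(scores)  # subtract the max for numerical stability
                denom = sum(math.exp(s - m) for s in scores)
                # LSE is what a backward pass needs to recompute the softmax.
                lse[b][h][i] = m + math.log(denom)
                for d in range(D):
                    weighted = sum(
                        math.exp(s - m) * v[b][j][h][d] for j, s in enumerate(scores)
                    )
                    out[b][i][h][d] = weighted / denom
    return out, lse
```

A reference like this is mainly useful for checking a kernel's output and LSE on small shapes, with reduced-precision inputs (BF16/FP16) compared under a suitable tolerance.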

WIP:

  • Fix the accuracy issue when seq_len_kv is not evenly divisible by BLK_N
  • Potential performance optimizations
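
One common way to handle a KV length that is not a multiple of the tile width is to mask the overhanging columns of the last tile to `-inf` before the softmax, so they contribute `exp(-inf) = 0`. The sketch below shows this for a single query row with an online softmax. It is an illustration under assumptions, not the fix planned for this PR: `attend_row_blocked` and its parameters are hypothetical names, `blk_n` stands in for BLK_N, and it assumes `seq_len_kv >= 1`.

```python
import math


def attend_row_blocked(q_row, keys, values, blk_n, scale):
    """Online-softmax attention for one query row, tiled over keys (hypothetical helper).

    Keys are processed in tiles of blk_n columns. Columns past seq_len_kv in
    the last tile are masked to -inf. Returns (out_row, lse).
    """
    seq_len_kv, d_v = len(keys), len(values[0])
    m = -math.inf        # running max of the scores seen so far
    denom = 0.0          # running sum of exp(score - m)
    acc = [0.0] * d_v    # running unnormalized output
    for start in range(0, seq_len_kv, blk_n):
        scores, v_tile = [], []
        # A tile always spans blk_n columns; the last one may overhang.
        for j in range(start, start + blk_n):
            if j < seq_len_kv:
                scores.append(scale * sum(a * b for a, b in zip(q_row, keys[j])))
                v_tile.append(values[j])
            else:
                scores.append(-math.inf)   # tail mask for the padded column
                v_tile.append([0.0] * d_v)  # padded value row, never weighted
        m_new = max(m, max(scores))
        # Rescale earlier partial sums to the new max (0.0 on the first tile).
        correction = math.exp(m - m_new)
        denom *= correction
        acc = [a * correction for a in acc]
        for s, v_row in zip(scores, v_tile):
            p = math.exp(s - m_new)  # exactly 0.0 for masked columns
            denom += p
            acc = [a + p * x for a, x in zip(acc, v_row)]
        m = m_new
    return [a / denom for a in acc], m + math.log(denom)
```

With this masking, the result should match an untiled softmax for any `blk_n`, including values that do not divide `seq_len_kv`, which makes it a convenient check for the divisibility case.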

@rolandschulz

This should be based on #547

@cfgfung cfgfung force-pushed the sdpa_fwd_upstream branch from f99c157 to 2c56c1e Compare October 9, 2025 18:09
@cfgfung cfgfung changed the title First version of SDPA Fwd First version of SDPA Fwd - No need to review Oct 12, 2025
cfgfung commented Oct 12, 2025

> This should be based on #547

Hi Roland,

Thanks for the review. Right now this is mainly for internal use and PyTorch integration, so please skip the review for now.
