
[INTEL HPU] add fused block atten #1706


Merged
Merged 1 commit into PaddlePaddle:develop on May 26, 2025

Conversation

yanfeich
Contributor

Optimize HPU fused block attention.

  1. Fuse the residual add into the RMSNorm + RoPE kernel.
  2. Fuse the residual add into the RMSNorm + MLP kernel.
  3. Fuse RoPE + SDPA into a single `fused_block_attention` op (see the sketch after this list).
  4. Optimize `prepare_block_metadata`.
  5. Add related unit tests.
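
For readers unfamiliar with why these fusions help, here is a minimal sketch of the pattern behind items 1 and 3. The `pre_attention_*` function names are hypothetical stand-ins, not the actual PaddleCustomDevice HPU kernel names; the point is that the unfused path launches the residual add as a separate elementwise kernel, costing an extra round trip of the full hidden tensor through memory, while the fused kernel reads the inputs once and produces both outputs:

```python
import paddle

# Illustrative sketch only: the function names below are hypothetical
# stand-ins for the fused HPU kernels this PR adds, not the real
# PaddleCustomDevice API.

def rms_norm(x, weight, eps=1e-6):
    # Plain RMSNorm: x / sqrt(mean(x^2) + eps) * weight
    variance = paddle.mean(x * x, axis=-1, keepdim=True)
    return x * paddle.rsqrt(variance + eps) * weight

def pre_attention_unfused(hidden, residual, norm_weight):
    # Unfused path: the residual add is its own elementwise kernel,
    # so the full hidden tensor is written out and read back before RMSNorm.
    hidden = hidden + residual               # kernel 1: elementwise add
    normed = rms_norm(hidden, norm_weight)   # kernel 2: RMSNorm
    return normed, hidden                    # updated residual carried forward

def pre_attention_fused(hidden, residual, norm_weight):
    # Fused path: conceptually one kernel that reads hidden and residual
    # once and emits both the normalized activations and the new residual.
    # (On HPU, RoPE would also run inside the same fused op.)
    hidden = hidden + residual
    return rms_norm(hidden, norm_weight), hidden

if __name__ == "__main__":
    h = paddle.randn([2, 8, 64])
    r = paddle.randn([2, 8, 64])
    w = paddle.ones([64])
    a, _ = pre_attention_unfused(h, r, w)
    b, _ = pre_attention_fused(h, r, w)
    # Both paths are numerically identical; the fused one saves kernel
    # launches and memory traffic on the device.
    assert paddle.allclose(a, b).item()
```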


paddle-bot bot commented May 23, 2025

Thanks for your contribution!

@yanfeich
Contributor Author

Adding @LeoZhao-Intel @JianyuLi01 @zongwave @fmiao2372 @feiwan1 to review.
Also adding @xiaoguoguo626807 to review.

@xiaoguoguo626807 xiaoguoguo626807 merged commit d8e6b25 into PaddlePaddle:develop May 26, 2025
10 of 11 checks passed