Sage Attention for DeepSeek prefilling (WIP) #9958

Draft: wants to merge 2 commits into base: develop

l1cacheDell
PR brief

Sage Attention SM90 CUDA kernel specialized for DeepSeek inference with mixed head_dim=192.

Next step: write demo inference code.

Benchmarking in progress...
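
For readers unfamiliar with the setup, here is a minimal pure-Python reference sketch of causal (prefill-style) attention where the query/key head dimension is 192 and the value head dimension is different. It is not the PR's kernel. The value dimension of 128 is an assumption (the brief does not state it), and the per-vector INT8 quantization of Q and K is a simplified stand-in for the quantized QK^T product that Sage Attention-style kernels rely on. All function names are hypothetical.

```python
import math
import random

QK_DIM = 192  # head_dim named in the PR brief
V_DIM = 128   # assumption: value head dim differs from QK_DIM; not stated in the PR


def quantize_int8(vec):
    """Symmetric per-vector INT8 quantization: returns (int values, scale)."""
    max_abs = max(abs(x) for x in vec) or 1.0
    scale = max_abs / 127.0
    return [max(-127, min(127, round(x / scale))) for x in vec], scale


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v, quantized=False):
    """Causal single-head attention; q, k: [seq][QK_DIM], v: [seq][V_DIM]."""
    sm_scale = 1.0 / math.sqrt(len(q[0]))
    if quantized:
        q_int = [quantize_int8(row) for row in q]
        k_int = [quantize_int8(row) for row in k]
    out = []
    for i in range(len(q)):
        scores = []
        for j in range(i + 1):  # causal mask: token i attends to tokens 0..i
            if quantized:
                (qi, qs), (kj, ks) = q_int[i], k_int[j]
                # Integer dot product, then dequantize with both scales.
                dot = sum(a * b for a, b in zip(qi, kj)) * qs * ks
            else:
                dot = sum(a * b for a, b in zip(q[i], k[j]))
            scores.append(dot * sm_scale)
        probs = softmax(scores)
        # Output width follows V_DIM, not QK_DIM.
        out.append([sum(p * v[j][d] for j, p in enumerate(probs))
                    for d in range(len(v[0]))])
    return out


def make_inputs(seq_len, seed=0):
    """Random Gaussian Q, K (width QK_DIM) and V (width V_DIM)."""
    rng = random.Random(seed)

    def rand(n):
        return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(seq_len)]

    return rand(QK_DIM), rand(QK_DIM), rand(V_DIM)


def max_abs_error(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

Comparing `attention(q, k, v)` against `attention(q, k, v, quantized=True)` with `max_abs_error` gives a rough feel for the accuracy cost of quantizing Q and K. A real kernel would tile the computation and use finer-grained scales.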

paddle-bot bot commented Feb 27, 2025

Thanks for your contribution!

@l1cacheDell l1cacheDell marked this pull request as draft February 27, 2025 11:13
codecov bot commented Feb 27, 2025

Codecov Report

Attention: Patch coverage is 0% with 12 lines in your changes missing coverage. Please review.

Project coverage is 51.07%. Comparing base (45b5012) to head (a6ccb70).
Report is 4 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...erimental/transformers/fused_transformer_layers.py | 0.00% | 12 Missing ⚠️ |

❌ Your patch check has failed because the patch coverage (0.00%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.
❌ Your project check has failed because the head coverage (51.07%) is below the target coverage (58.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9958      +/-   ##
===========================================
- Coverage    51.08%   51.07%   -0.01%     
===========================================
  Files          745      745              
  Lines       119261   119285      +24     
===========================================
+ Hits         60926    60927       +1     
- Misses       58335    58358      +23     
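
The percentages in the diff can be reproduced from the hit and line counts. The sketch below assumes the percentage is truncated to two decimals rather than rounded. That assumption is inferred only from the fact that it matches the reported figures; it is not a statement about how Codecov computes them. The helper name is hypothetical.

```python
def coverage_pct(hits, lines):
    """Coverage percentage truncated (not rounded) to two decimals."""
    # Integer arithmetic avoids floating-point surprises before truncation.
    return (hits * 10000 // lines) / 100


# Counts taken from the coverage diff above.
base = coverage_pct(60926, 119261)   # develop
head = coverage_pct(60927, 119285)   # this PR
delta = round(head - base, 2)

# Sanity checks: hits + misses should equal total lines on each side.
base_consistent = 60926 + 58335 == 119261
head_consistent = 60927 + 58358 == 119285
```

With these counts, `base` and `head` reproduce the 51.08% and 51.07% shown in the report, and `delta` reproduces the -0.01% change.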


2 participants