[Feature] Support float8 dtype storage and deepseek v3 with fp8 inference. #9906
base: develop
Conversation
Thanks for your contribution!
Codecov Report. Attention: patch coverage is 42.77%.
❌ Your patch check has failed because the patch coverage (42.77%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files:

@@ Coverage Diff @@
##           develop    #9906      +/-   ##
===========================================
- Coverage    51.08%   51.08%   -0.01%
===========================================
  Files          745      748        +3
  Lines       119274   119522      +248
===========================================
+ Hits         60927    61053      +126
- Misses       58347    58469      +122

☔ View full report in Codecov by Sentry.
return paddle.to_tensor(tensor)
class EextendDtypeNumpySafe(unittest.TestCase):
Typo: `Eextend` should be `Extend` (i.e. `ExtendDtypeNumpySafe`).
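As a side note on what this test class appears to cover: numpy has no native float8 dtypes, so a "numpy-safe" round trip typically stores the raw bytes under a stand-in integer dtype and remembers the original dtype name. The mapping below is a hypothetical illustration of that idea, not the PR's actual table:

```python
# Hypothetical sketch (not PaddleNLP code): map dtypes numpy cannot
# represent onto same-width integer carriers for safe serialization.
NUMPY_SAFE = {
    "float8_e4m3fn": "uint8",   # 1 byte per element
    "float8_e5m2": "uint8",
    "bfloat16": "uint16",       # 2 bytes per element
}

def to_numpy_safe(dtype_name: str) -> str:
    """Return the carrier dtype used for numpy storage, or the dtype itself."""
    return NUMPY_SAFE.get(dtype_name, dtype_name)
```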
is_bf16 = str(tensor.dtype) in ["uint16", "bfloat16"]
tensor = paddle.Tensor.__call__(tensor, zero_copy=True)
lora_A_tensor = paddle.Tensor.__call__(lora_A_tensor, zero_copy=True)
lora_B_tensor = paddle.Tensor.__call__(lora_B_tensor, zero_copy=True)
if self.is_cpu and is_bf16:
What is the reason for switching to the `__call__` function here?
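For context on the `is_bf16` check above: bfloat16 values are frequently carried as raw `uint16` bit patterns (a bfloat16 is just the upper 16 bits of a float32), which is why `"uint16"` and `"bfloat16"` are treated as the same case. A minimal pure-Python illustration of that bit-level relationship, not taken from the PR:

```python
# Illustrative only: bfloat16 <-> float32 conversion via bit truncation.
import struct

def f32_to_bf16_bits(x: float) -> int:
    """Truncate a float32 to its upper 16 bits (the bfloat16 bit pattern)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

def bf16_bits_to_f32(bits: int) -> float:
    """Reinterpret a bfloat16 bit pattern as float32 (lower 16 bits zeroed)."""
    (x,) = struct.unpack("<f", struct.pack("<I", bits << 16))
    return x
```

Values like 1.5 survive the round trip exactly because their low 16 mantissa bits are already zero; most float32 values lose precision.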
@@ -0,0 +1,226 @@
# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
Should the copyright here also include DeepSeek?
This could be renamed: kernel.py -> fp8_kernel.py
from .configuration import DeepseekV2Config
from .fp8_linear import Linear
Can this be imported directly here? It looks like it replaces `Linear` entirely.
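The concern above is about name shadowing: `from .fp8_linear import Linear` rebinds the name `Linear` for the whole module, so every layer built from it becomes the FP8 variant. A small self-contained sketch of that effect, using stand-in classes rather than the PR's real ones:

```python
# Illustrative stand-ins, not the framework's actual classes.
class Linear:
    """Plays the role of the framework's default Linear layer."""
    def __init__(self, in_features: int, out_features: int):
        self.in_features, self.out_features = in_features, out_features
        self.dtype = "bfloat16"

class FP8Linear(Linear):
    """Plays the role of fp8_linear.Linear: weights stored in FP8."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__(in_features, out_features)
        self.dtype = "float8_e4m3fn"

# `from .fp8_linear import Linear` has the same effect as this rebinding:
Linear = FP8Linear

layer = Linear(4, 8)  # every later `Linear(...)` now builds the FP8 variant
```

This is why the reviewer asks whether a wholesale replacement is intended, as opposed to selecting the implementation per layer.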
@@ -628,36 +635,43 @@ def __init__(self, config: DeepseekV2Config, hidden_size=None, intermediate_size
self.hidden_size = config.hidden_size if hidden_size is None else hidden_size
self.intermediate_size = config.intermediate_size if intermediate_size is None else intermediate_size

def linear_dtype_gaurd():
Has loading of the FP8 parameters already been adapted in the from_pretrained interface?
Yes, the parameters are initialized directly for loading in FP8.
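Judging from its name, `linear_dtype_gaurd()` presumably scopes a temporary default-dtype override so that Linear weights are created directly in FP8 and an FP8 checkpoint can be loaded without an extra cast. A hypothetical sketch of that pattern with a plain module-level variable standing in for the framework's global default dtype:

```python
# Hypothetical sketch of a dtype-guard context manager; the global below is
# a stand-in for the framework's default parameter dtype, not a real API.
import contextlib

_DEFAULT_DTYPE = "bfloat16"

@contextlib.contextmanager
def linear_dtype_guard(dtype: str = "float8_e4m3fn"):
    """Temporarily switch the default parameter dtype, restoring it on exit."""
    global _DEFAULT_DTYPE
    prev, _DEFAULT_DTYPE = _DEFAULT_DTYPE, dtype
    try:
        yield
    finally:
        _DEFAULT_DTYPE = prev

with linear_dtype_guard():
    dtype_inside = _DEFAULT_DTYPE  # parameters created here would be FP8
dtype_after = _DEFAULT_DTYPE       # default restored afterwards
```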
Before submitting
Add test cases to the tests folder. If there are codecov issues, please add test cases first.

PR types
New features

PR changes
Others
Description
Support float8 dtype storage.
The FP8 models include: deepseek-ai/DeepSeek-V3-FP8, deepseek-ai/DeepSeek-R1-FP8
For FP8,
For BFLOAT16
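To make the storage idea concrete: scaled FP8 (e4m3) storage divides values by a scale so they fit the narrow e4m3 range (largest finite value 448), and keeps the scale alongside the quantized tensor so the higher-precision value can be recovered at compute time. The sketch below shows only the scale-and-clamp step in pure Python; rounding to the actual 8-bit grid is omitted for brevity, and none of this is PaddleNLP's kernel code:

```python
# Illustrative per-tensor FP8 (e4m3) scaling sketch; rounding to the 8-bit
# value grid is deliberately omitted, so the round trip is near-exact here.
E4M3_MAX = 448.0  # largest finite float8_e4m3 magnitude

def quantize_fp8(values):
    """Scale values into the e4m3 range; return (scaled values, scale)."""
    amax = max(abs(v) for v in values) or 1.0
    scale = amax / E4M3_MAX
    q = [max(-E4M3_MAX, min(E4M3_MAX, v / scale)) for v in values]
    return q, scale

def dequantize_fp8(q, scale):
    """Recover the original-range values from the stored scale."""
    return [v * scale for v in q]

q, s = quantize_fp8([0.5, -2.0, 3.0])
restored = dequantize_fp8(q, s)
```

Real FP8 inference kernels additionally round to the e4m3 grid and often keep finer-grained (e.g. block-wise) scales rather than one scale per tensor.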