
Conversation

yuanlehome
Collaborator

@yuanlehome yuanlehome commented Sep 25, 2025

Changed:

  1. Fix the model registration mechanism: the category field previously had no effect. Add an is_reasoning_model helper function (not used yet).
  2. Remove the old path that registered multimodal models through MultimodalRegistry.
  3. Remove the thinking-length truncation code for non-NVIDIA-GPU hardware (it never took effect and was never verified; it will need to be reimplemented later as custom operators).
  4. Rename cfg to fd_config in common_engine.py for readability.
  5. Support thinking-length truncation on NVIDIA GPUs. The environment variable FD_LIMIT_THINKING_CONTENT_TRUNCATE_STR specifies the string inserted to cut thinking short: </think> for ernie4_5_vl and \n</think>\n\n for ernie_x1, implemented as two separate custom operators.
  6. Support thinking-length truncation under MTP.
  7. Miscellaneous code cleanups.


paddle-bot bot commented Sep 25, 2025

Thanks for your contribution!

@yuanlehome yuanlehome marked this pull request as draft September 26, 2025 05:47
@yuanlehome yuanlehome force-pushed the upgrade_limit_think_length branch from 3bad98a to 73384a6 Compare October 13, 2025 03:19
@yuanlehome yuanlehome marked this pull request as ready for review October 13, 2025 08:07
@yuanlehome yuanlehome force-pushed the upgrade_limit_think_length branch from 62dd5da to 6f1f082 Compare October 14, 2025 10:24
K11OntheBoat
K11OntheBoat previously approved these changes Oct 16, 2025
Collaborator

@K11OntheBoat K11OntheBoat left a comment


LGTM.
We still need to check the extreme case of max_think_len=1 under PD disaggregation.
See PR #4433 for reference: some of its changes fixed bugs in such extreme cases.

__global__ void limit_thinking_content_length_kernel_v2(
    int64_t *next_tokens,
    const int *max_think_lens,
    const int64_t *step_idx,  // step_idx no longer needs modification; now const
Collaborator


Please remove this comment.

Collaborator Author


done

Collaborator

@xiaoxiaohehe001 xiaoxiaohehe001 left a comment


LGTM

@yuanlehome yuanlehome force-pushed the upgrade_limit_think_length branch from 96de19f to 849eaa6 Compare October 20, 2025 04:02
Collaborator

@LiqinruiG LiqinruiG left a comment


LGTM

Collaborator

@gongshaotian gongshaotian left a comment


LGTM

@Jiang-Jia-Jun Jiang-Jia-Jun merged commit cef3164 into PaddlePaddle:develop Oct 20, 2025
13 of 16 checks passed

6 participants