[CI][Hackathon Phase 9 Development Example NO.12] Add unit tests for the fastdeploy/spec_decode/mtp.py module #5068
base: develop
Conversation
Thanks for your contribution!
Pull Request Overview
This PR adds unit tests for the fastdeploy/spec_decode/mtp.py module as part of Hackathon Phase 9 Task NO.12. The tests cover initialization and basic operations of the MTPProposer class used in speculative decoding.
- Comprehensive test coverage for MTPProposer initialization and configuration
- Tests for cache management methods (initialize, clear, update)
- Tests for utility methods like `exist_prefill` and `is_chunk_prefill_enabled`
```python
self.mock_target_model_inputs = {
    "block_tables": paddle.zeros([8, 100], dtype="int32"),
    "input_ids": paddle.zeros([8, 2048], dtype="int64"),
    "seq_lens_this_time": paddle.zeros([8], dtype="int32"),
    "seq_lens_encoder": paddle.zeros([8], dtype="int32"),
    "seq_lens_decoder": paddle.zeros([8], dtype="int32"),
    "step_idx": paddle.zeros([8], dtype="int32"),
    "stop_flags": paddle.zeros([8], dtype="bool"),
    "stop_nums": paddle.zeros([8], dtype="int32"),
    "pre_ids": paddle.zeros([8], dtype="int64"),
    "output_cum_offsets": paddle.zeros([8], dtype="int32"),
    "output_padding_offset": paddle.zeros([8], dtype="int32"),
    "ids_remove_padding": paddle.zeros([8], dtype="int64"),
    "batch_id_per_token": paddle.zeros([8], dtype="int32"),
    "cu_seqlens_q": paddle.zeros([9], dtype="int32"),
    "cu_seqlens_k": paddle.zeros([9], dtype="int32"),
    "decoder_batch_ids": paddle.zeros([8], dtype="int32"),
    "decoder_tile_ids_per_batch": paddle.zeros([8], dtype="int32"),
    "decoder_num_blocks_cpu": paddle.zeros([8], dtype="int32"),
    "decoder_num_blocks_device": paddle.zeros([8], dtype="int32"),
    "decoder_chunk_size_device": paddle.zeros([8], dtype="int32"),
    "max_len_tensor_cpu": paddle.zeros([8], dtype="int32"),
    "encoder_batch_ids": paddle.zeros([8], dtype="int32"),
    "encoder_tile_ids_per_batch": paddle.zeros([8], dtype="int32"),
    "encoder_num_blocks_x_cpu": paddle.zeros([8], dtype="int32"),
    "kv_batch_ids": paddle.zeros([8], dtype="int32"),
    "kv_tile_ids_per_batch": paddle.zeros([8], dtype="int32"),
    "kv_num_blocks_x_cpu": paddle.zeros([8], dtype="int32"),
    "prompt_lens": paddle.zeros([8], dtype="int32"),
    "top_p": paddle.ones([8], dtype="float32") * 0.7,
    "top_k": paddle.ones([8], dtype="int32") * 50,
    "temperature": paddle.ones([8], dtype="float32") * 1.0,
    "eos_token_id": paddle.ones([8, 1], dtype="int64") * 2,
    "penalty_score": paddle.ones([8], dtype="float32"),
    "frequency_score": paddle.zeros([8], dtype="float32"),
    "presence_score": paddle.zeros([8], dtype="float32"),
    "infer_seed": paddle.zeros([8], dtype="int64"),
    "max_dec_len": paddle.ones([8], dtype="int32") * 256,
    "min_dec_len": paddle.zeros([8], dtype="int32"),
    "bad_tokens": paddle.zeros([8, 0], dtype="int64"),
    "draft_tokens": paddle.zeros([8, 10], dtype="int64"),
    "accept_tokens": paddle.zeros([8, 10], dtype="int64"),
    "accept_num": paddle.zeros([8], dtype="int32"),
    "encoder_block_lens": paddle.zeros([8], dtype="int32"),
    "cu_batch_token_offset": paddle.zeros([9], dtype="int32"),
    "temp_scaled_logprobs": None,
    "top_p_normalized_logprobs": None,
    "draft_logits": None,
}
```
Copilot AI · Nov 15, 2025
[nitpick] The test creates large mock tensors (e.g., `paddle.zeros([8, 2048], dtype="int64")` for `input_ids`) that consume significant memory but aren't used meaningfully in most tests. This could slow down test execution unnecessarily. Consider:
- Using smaller tensor dimensions for tests that don't require full-size tensors
- Lazy-initializing tensors only when needed
- Sharing tensor instances across tests when the values don't change

For example, `input_ids` with shape `[8, 2048]` could be `[2, 10]` for most tests unless the full dimensions are specifically being tested.
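As a hedged sketch of the lazy-initialization idea (the `LazyInputs` class and the shape spec are illustrative suggestions, not part of this PR; in the real suite the tensor factory would be something like `paddle.zeros`):

```python
class LazyInputs(dict):
    """Mock-input dict that builds each tensor on first access.

    `spec` maps key -> (shape, dtype); `make_tensor` is the tensor
    factory (e.g. a paddle.zeros wrapper in the real tests). Entries a
    test never touches are never allocated, and repeated lookups reuse
    the same instance.
    """

    def __init__(self, spec, make_tensor):
        super().__init__()
        self._spec = spec
        self._make = make_tensor

    def __missing__(self, key):
        # Called only the first time a key is looked up; the built
        # tensor is then cached in the dict itself.
        shape, dtype = self._spec[key]
        self[key] = self._make(shape, dtype)
        return self[key]
```

A test that only reads, say, `seq_lens_this_time` then pays for one small tensor instead of the whole forty-entry dict.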
```python
@patch("fastdeploy.spec_decode.mtp.get_attention_backend")
@patch("fastdeploy.spec_decode.mtp.get_rope")
@patch("fastdeploy.spec_decode.mtp.MTPSampler")
def test_is_chunk_prefill_enabled(self, mock_sampler, mock_get_rope, mock_get_attn_backend, mock_get_model_loader):
```
Copilot AI · Nov 15, 2025
Missing documentation for the test method. According to the custom coding guidelines, functions should have clear documentation. Add a docstring that explains what behavior is being tested, for example:

```python
"""Test that is_chunk_prefill_enabled always returns True for MTPProposer."""
```

```python
def setUp(self):
    """Set up test environment"""
    # Mock FDConfig
    self.mock_fd_config = MagicMock()
    self.mock_fd_config.model_config = MagicMock()
    self.mock_fd_config.model_config.architectures = ["ErnieMoeForCausalLM"]
    self.mock_fd_config.model_config.num_hidden_layers = 32
    self.mock_fd_config.model_config.max_model_len = 2048
    self.mock_fd_config.model_config.hidden_size = 1024
    self.mock_fd_config.model_config.num_attention_heads = 16
    self.mock_fd_config.model_config.num_key_value_heads = 16
    self.mock_fd_config.model_config.head_dim = 64
    self.mock_fd_config.model_config.rope_theta = 10000.0
    self.mock_fd_config.model_config.enable_logprob = False
    self.mock_fd_config.speculative_config = MagicMock()
    self.mock_fd_config.speculative_config.mtp_strategy = "standard"
    self.mock_fd_config.speculative_config.num_gpu_block_expand_ratio = 1.0
    self.mock_fd_config.speculative_config.model = "test_model"
    self.mock_fd_config.speculative_config.quantization = ""
    self.mock_fd_config.speculative_config.method = "mtp"
    self.mock_fd_config.scheduler_config = MagicMock()
    self.mock_fd_config.scheduler_config.splitwise_role = "mixed"
    self.mock_fd_config.cache_config = MagicMock()
    self.mock_fd_config.cache_config.block_size = 16
    self.mock_fd_config.cache_config.enc_dec_block_num = 0
    self.mock_fd_config.cache_config.total_block_num = 100
    self.mock_fd_config.cache_config.kv_cache_ratio = 0.9
    self.mock_fd_config.cache_config.enable_prefix_caching = False
    self.mock_fd_config.cache_config.enable_chunked_prefill = False
    self.mock_fd_config.graph_opt_config = MagicMock()
    self.mock_fd_config.graph_opt_config.draft_model_use_cudagraph = False
    self.mock_fd_config.graph_opt_config.cudagraph_capture_sizes = []
    self.mock_fd_config.graph_opt_config.sot_warmup_sizes = []
    self.mock_fd_config.parallel_config = MagicMock()
    self.mock_fd_config.parallel_config.tensor_parallel_size = 1
    self.mock_fd_config.parallel_config.enable_expert_parallel = False
    self.mock_fd_config.quant_config = None
    self.mock_fd_config.load_config = MagicMock()
    self.mock_fd_config.max_num_seqs = 8
    self.mock_fd_config.max_prefill_batch = 4
    self.mock_fd_config.model_config.enable_mm = False

    # Mock main model
    self.mock_main_model = MagicMock()

    # Mock target model inputs
    self.mock_target_model_inputs = {
        "block_tables": paddle.zeros([8, 100], dtype="int32"),
        "input_ids": paddle.zeros([8, 2048], dtype="int64"),
        "seq_lens_this_time": paddle.zeros([8], dtype="int32"),
        "seq_lens_encoder": paddle.zeros([8], dtype="int32"),
        "seq_lens_decoder": paddle.zeros([8], dtype="int32"),
        "step_idx": paddle.zeros([8], dtype="int32"),
        "stop_flags": paddle.zeros([8], dtype="bool"),
        "stop_nums": paddle.zeros([8], dtype="int32"),
        "pre_ids": paddle.zeros([8], dtype="int64"),
        "output_cum_offsets": paddle.zeros([8], dtype="int32"),
        "output_padding_offset": paddle.zeros([8], dtype="int32"),
        "ids_remove_padding": paddle.zeros([8], dtype="int64"),
        "batch_id_per_token": paddle.zeros([8], dtype="int32"),
        "cu_seqlens_q": paddle.zeros([9], dtype="int32"),
        "cu_seqlens_k": paddle.zeros([9], dtype="int32"),
        "decoder_batch_ids": paddle.zeros([8], dtype="int32"),
        "decoder_tile_ids_per_batch": paddle.zeros([8], dtype="int32"),
        "decoder_num_blocks_cpu": paddle.zeros([8], dtype="int32"),
        "decoder_num_blocks_device": paddle.zeros([8], dtype="int32"),
        "decoder_chunk_size_device": paddle.zeros([8], dtype="int32"),
        "max_len_tensor_cpu": paddle.zeros([8], dtype="int32"),
        "encoder_batch_ids": paddle.zeros([8], dtype="int32"),
        "encoder_tile_ids_per_batch": paddle.zeros([8], dtype="int32"),
        "encoder_num_blocks_x_cpu": paddle.zeros([8], dtype="int32"),
        "kv_batch_ids": paddle.zeros([8], dtype="int32"),
        "kv_tile_ids_per_batch": paddle.zeros([8], dtype="int32"),
        "kv_num_blocks_x_cpu": paddle.zeros([8], dtype="int32"),
        "prompt_lens": paddle.zeros([8], dtype="int32"),
        "top_p": paddle.ones([8], dtype="float32") * 0.7,
        "top_k": paddle.ones([8], dtype="int32") * 50,
        "temperature": paddle.ones([8], dtype="float32") * 1.0,
        "eos_token_id": paddle.ones([8, 1], dtype="int64") * 2,
        "penalty_score": paddle.ones([8], dtype="float32"),
        "frequency_score": paddle.zeros([8], dtype="float32"),
        "presence_score": paddle.zeros([8], dtype="float32"),
        "infer_seed": paddle.zeros([8], dtype="int64"),
        "max_dec_len": paddle.ones([8], dtype="int32") * 256,
        "min_dec_len": paddle.zeros([8], dtype="int32"),
        "bad_tokens": paddle.zeros([8, 0], dtype="int64"),
        "draft_tokens": paddle.zeros([8, 10], dtype="int64"),
        "accept_tokens": paddle.zeros([8, 10], dtype="int64"),
        "accept_num": paddle.zeros([8], dtype="int32"),
        "encoder_block_lens": paddle.zeros([8], dtype="int32"),
        "cu_batch_token_offset": paddle.zeros([9], dtype="int32"),
        "temp_scaled_logprobs": None,
        "top_p_normalized_logprobs": None,
        "draft_logits": None,
    }
```
Copilot AI · Nov 15, 2025
[nitpick] The `setUp` method creates an extensive mock configuration with 71 lines of repetitive mock setup. Duplicating this setup pattern across all test methods makes the tests harder to maintain. Consider:
- Extracting common mock setup into a helper method or fixture
- Creating a factory function that returns a properly configured mock FDConfig
- Using a test configuration file for default values

This would make tests more readable and easier to update when the configuration schema changes.
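One way to realize the factory suggestion is a helper like the sketch below. The name `make_mock_fd_config` and the double-underscore override syntax are hypothetical, not part of this PR:

```python
from unittest.mock import MagicMock


def make_mock_fd_config(**overrides):
    """Return a MagicMock FDConfig with the defaults shared by these tests.

    Nested attributes are overridden with double-underscore paths,
    e.g. make_mock_fd_config(cache_config__block_size=32). Only a few
    representative defaults are shown here.
    """
    cfg = MagicMock()
    cfg.max_num_seqs = 8
    cfg.max_prefill_batch = 4
    cfg.model_config.max_model_len = 2048
    cfg.model_config.num_hidden_layers = 32
    cfg.cache_config.block_size = 16
    cfg.speculative_config.method = "mtp"
    cfg.parallel_config.tensor_parallel_size = 1
    for dotted, value in overrides.items():
        obj = cfg
        *parents, leaf = dotted.split("__")
        for name in parents:
            obj = getattr(obj, name)  # MagicMock auto-creates children
        setattr(obj, leaf, value)
    return cfg
```

Each test's `setUp` would then call the factory once and override only the attributes that test actually exercises.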
Motivation
NO.12: add unit tests for the fastdeploy/spec_decode/mtp.py module.
Modifications
Added unit tests in tests/spec_decode/test_mtp.py.
Usage or Command
Not needed.
Accuracy Tests
Not needed.
Checklist
- Add at least one tag in the PR title, chosen from: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
- Run `pre-commit` before commit.
- If the PR targets the `release` branch, make sure it has been submitted to the `develop` branch first, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.