[CI] Add mla ut #4280
Conversation
Code Review
This pull request adds unit tests for the Multi-Head Latent Attention (MLA) implementation. The new tests cover the metadata builder in test_mla_v1.py and the AscendMultiHeadLatentAttention layer in test_mla.py. The changes are a good step towards improving test coverage. I've found one high-severity issue in the new test test_forward within tests/ut/models/test_mla.py, where the mocking of a custom operation is incorrect, leading to a test that doesn't properly validate the intended functionality. I've provided a code suggestion to fix the test and make it more robust.
```python
mock_mla_forward.return_value = (3, self.hidden_size)

output = attn.forward(positions, hidden_states)

self.assertEqual(output.shape, (3, self.hidden_size))
self.assertTrue(
    torch.allclose(output, output.view(-1, self.hidden_size)))
```
The mock for `torch.ops.vllm.mla_forward` is not correctly configured, and the assertions are not sufficient to validate the behavior of the `forward` method.

- The `mla_forward` custom op is defined to return `None` and modify its `output` argument in-place. However, the test sets a tuple `(3, self.hidden_size)` as the `return_value`, which is incorrect and ignored during execution. This means the test doesn't verify that the op correctly modifies the output tensor.
- The assertions only check the shape of the output tensor and that it is `allclose` to itself, which will always pass. The test should verify that `mla_forward` is called correctly and that the output tensor is populated as expected.

To make this test more robust, you should use `side_effect` to simulate the in-place modification of the output tensor and add assertions to check the op's arguments and the output's content.
Suggested change:

```diff
- mock_mla_forward.return_value = (3, self.hidden_size)
- output = attn.forward(positions, hidden_states)
- self.assertEqual(output.shape, (3, self.hidden_size))
- self.assertTrue(
-     torch.allclose(output, output.view(-1, self.hidden_size)))
+ def mla_forward_side_effect(hidden_states, need_gather_q_kv, output, prefix):
+     # Simulate the op writing to the output tensor
+     output.fill_(1.0)
+ mock_mla_forward.side_effect = mla_forward_side_effect
+ output = attn.forward(positions, hidden_states)
+ mock_mla_forward.assert_called_once_with(hidden_states, False, output, self.prefix)
+ self.assertEqual(output.shape, (3, self.hidden_size))
+ self.assertTrue(torch.all(output == 1.0))
```
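For reference, here is a minimal, self-contained sketch of the same `side_effect` pattern, independent of the test class. The names `run_attention` and `fake_mla_forward` are illustrative only and not part of the vllm-ascend codebase:

```python
import torch
from unittest import mock


def run_attention(op, hidden_states):
    # Stand-in for attn.forward: allocate the output buffer and let the
    # (mocked) custom op fill it in place; the op itself returns None.
    output = torch.empty_like(hidden_states)
    op(hidden_states, False, output, "layer.0")
    return output


def fake_mla_forward(hidden_states, need_gather_q_kv, output, prefix):
    # Simulate the op writing to its output argument in place.
    output.fill_(1.0)


mocked_op = mock.MagicMock(side_effect=fake_mla_forward)

hidden = torch.zeros(3, 8)
result = run_attention(mocked_op, hidden)

# The call arguments and the in-place result can now be asserted directly.
mocked_op.assert_called_once_with(hidden, False, result, "layer.0")
assert torch.all(result == 1.0)
```

Because the mock's `side_effect` performs the in-place write, the test exercises the real contract of the op (populate `output`, return nothing) rather than relying on a `return_value` that the caller never reads.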
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

- If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Signed-off-by: GDzhu01 <[email protected]>
Please update the PR message.
…cend into eplb_ci_bugfix

* 'eplb_ci_bugfix' of https://github.com/845473182/vllm-ascend: (31 commits)
  - [Test] Add ut test for torchair (vllm-project#4287)
  - [Feat][Doc] Add a load_balance_dp_proxy in examples and external dp doc. (vllm-project#4265)
  - [CI] Defaultly compile vllm with multimodal audio feature in dockerfile (vllm-project#4324)
  - [MM][Bugfix] Add error log for VL models when enabling FLASHCOMM (vllm-project#4272)
  - [Readme] EPLB Support Scenarios (vllm-project#4314)
  - eplb redundant expert bugfix (vllm-project#4291)
  - [Feat][BugFix]Support the Qwen3-Next-80B-A3B-Instruct quantization model&Fix the NZ issue (vllm-project#4245)
  - [Test] Add ACL graph capture/replay DP test (vllm-project#4259)
  - [Test] quick fix mla ut (vllm-project#4318)
  - [Feat] Support MTP to running in full graph mode (vllm-project#3892)
  - [CI] Add mla ut (vllm-project#4280)
  - [Test] Add tests for the multi-node DeepSeek-V2-Lite network in GE Graph (vllm-project#4039)
  - avoid mrope fusion op when running qwen2.5-vl on a+x machine (vllm-project#4270)
  - [Bugfix] fix nightly multi-node EPLB tests' "DYNAMIC_EPLB=true" environment not working (vllm-project#4223)
  - [long seq feat]GQA support long-prefill-token-threshold and fixbug (vllm-project#4209)
  - [misc] clean up get_metadata_cls (vllm-project#4276)
  - [Docs] Improve the AISBench multi-modal testing docs (vllm-project#4255)
  - [doc]fix readme for kv pool user guide (vllm-project#4271)
  - remove get_metadata_cls (vllm-project#4087)
  - [Bugfix] fix hang in async scheduling (vllm-project#4233)
  - ...
### What this PR does / why we need it?

add mla_v1.py and mla.py ut

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

`pytest tests/ut/attention/test_mla_v1.py`
`pytest tests/ut/models/test_mla.py`

- vLLM version: v0.11.0
- vLLM main: vllm-project/vllm@2918c1b

Signed-off-by: GDzhu01 <[email protected]>
Signed-off-by: 白永斌 <[email protected]>
What this PR does / why we need it?
add mla_v1.py and mla.py ut
Does this PR introduce any user-facing change?
No
How was this patch tested?
`pytest tests/ut/attention/test_mla_v1.py`
`pytest tests/ut/models/test_mla.py`
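If it helps reviewers, both modules can also be run in a single invocation; a minimal sketch assuming pytest is installed and the repository root is the working directory:

```python
# Hypothetical helper script; equivalent to running the two pytest commands above.
import sys

import pytest

if __name__ == "__main__":
    sys.exit(
        pytest.main([
            "tests/ut/attention/test_mla_v1.py",
            "tests/ut/models/test_mla.py",
            "-q",
        ]))
```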