[Test]Add accuracy test for multiple models(2) #4251
Conversation
Signed-off-by: MrZ20 <[email protected]>
Code Review
This pull request adds accuracy test configurations for four new models: ERNIE-4.5-21B-A3B-PT, MiniCPM3-4B, Mistral-7B-Instruct-v0.1, and Phi-4-mini-instruct. The changes include new YAML configuration files for each model and an update to the list of accuracy tests. The new configuration files are well-formed. I have one suggestion to improve the maintainability of the test list file by sorting it alphabetically, which will make it easier to manage in the future.
InternVL3_5-8B.yaml
ERNIE-4.5-21B-A3B-PT.yaml
MiniCPM3-4B.yaml
Mistral-7B-Instruct-v0.1.yaml
Phi-4-mini-instruct.yaml
For improved maintainability and to prevent potential duplicates, it's a good practice to keep this list of configuration files sorted alphabetically. While the existing file isn't sorted, applying this convention now would be beneficial for future updates.
Could you please sort the entire file? For your convenience, here is the alphabetically sorted list:
DeepSeek-V2-Lite.yaml
ERNIE-4.5-21B-A3B-PT.yaml
InternVL3_5-8B.yaml
Meta-Llama-3.1-8B-Instruct.yaml
MiniCPM3-4B.yaml
Mistral-7B-Instruct-v0.1.yaml
Phi-4-mini-instruct.yaml
Qwen2.5-Omni-7B.yaml
Qwen2.5-VL-7B-Instruct.yaml
Qwen2-7B.yaml
Qwen2-Audio-7B-Instruct.yaml
Qwen2-VL-7B-Instruct.yaml
Qwen3-30B-A3B.yaml
Qwen3-8B.yaml
Qwen3-VL-30B-A3B-Instruct.yaml
Qwen3-VL-8B-Instruct.yaml
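The reviewer's suggestion to keep the file sorted can also be automated. A minimal sketch using coreutils `sort` (the file name `accuracy_tests.txt` is a placeholder, not the repository's actual file name):

```shell
# Build a small sample list file (hypothetical name; the real file in the repo differs).
printf 'MiniCPM3-4B.yaml\nERNIE-4.5-21B-A3B-PT.yaml\n' > accuracy_tests.txt

# Sort the list in place so new entries land in order.
sort accuracy_tests.txt -o accuracy_tests.txt

# Or, as a CI lint step, verify order without modifying the file:
sort -c accuracy_tests.txt && echo "list is sorted"
```

Running the check with `sort -c` in CI would catch unsorted additions (and, as a side effect, exact duplicates sitting next to each other) before merge.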
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
- name: "exact_match,strict-match"
  value: 0.35
- name: "exact_match,flexible-extract"
  value: 0.38
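For context, threshold entries like these are typically compared against the measured metric values after an evaluation run. A minimal sketch of that comparison (the function name and data shapes below are illustrative, not the project's actual test code):

```python
def meets_thresholds(measured: dict, thresholds: list) -> bool:
    """Return True if every measured metric is at or above its threshold."""
    return all(measured[t["name"]] >= t["value"] for t in thresholds)

# Threshold entries mirroring the YAML fragment above.
thresholds = [
    {"name": "exact_match,strict-match", "value": 0.35},
    {"name": "exact_match,flexible-extract", "value": 0.38},
]

# Hypothetical measured results from an evaluation run.
measured = {"exact_match,strict-match": 0.36, "exact_match,flexible-extract": 0.40}

print(meets_thresholds(measured, thresholds))  # True for these sample numbers
```

Under this reading, the reviewer's question is whether thresholds as low as 0.35/0.38 were chosen because the model genuinely scores that low, which would indicate an accuracy problem worth flagging.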
Is there an accuracy issue with this model?
What this PR does / why we need it?
Add accuracy tests for multiple models: ERNIE-4.5-21B-A3B-PT, MiniCPM3-4B, Mistral-7B-Instruct-v0.1, and Phi-4-mini-instruct.
Does this PR introduce any user-facing change?
How was this patch tested?