Evaluation script fails to work for Qwen-Math models. #174
I tried running the Qwen-Math models instead of the Qwen models, but found that the evaluation scripts don't work for Qwen-Math:

```
ValueError: User-specified max_model_len (32768) is greater than the derived max_model_len (max_position_embeddings=4096 or model_max_length=None in model's config.json). This may lead to incorrect model outputs or CUDA errors. To allow overriding this maximum, set the env var VLLM_ALLOW_LONG_MAX_MODEL_LEN=1
```

Hope you can help me out.

Comments
@ChenDRAG The problem you're encountering is due to a mismatch between the model's maximum supported context length and the value you've specified for max_model_length in MODEL_ARGS. The Qwen2.5 (and -Instruct) models support a maximum context length of 32768, but the Qwen2.5-Math models have a much lower limit of 4096. This discrepancy is causing the error you're seeing.

To resolve this issue, set the max_model_length parameter to no greater than 4096 when working with the Qwen2.5-Math models:

```
MODEL_ARGS="pretrained=$MODEL,dtype=float16,data_parallel_size=$NUM_GPUS,max_model_length=4096,gpu_memory=..."
```

You can also override this maximum context length by setting the environment variable VLLM_ALLOW_LONG_MAX_MODEL_LEN=1, as shown in your ValueError message. This allows the model to accept context lengths larger than the model's default maximum (4096 for the Math models). You can try adding this line to your environment configuration:

```
export VLLM_ALLOW_LONG_MAX_MODEL_LEN=1
```

However, this approach is not recommended because it could lead to incorrect model outputs or CUDA errors due to the mismatch in the expected maximum sequence length. Forcing a larger context length may result in performance issues or even instability, especially when working with Math models that are designed to handle only shorter sequences.
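To make the two options concrete, here is a minimal sketch of how they map onto vLLM's offline Python API. This is illustrative only: the model name, dtype, and sampling settings are assumptions, not taken from the thread's actual evaluation command.

```python
import os

from vllm import LLM, SamplingParams

# Option 2 (not recommended): force vLLM to accept a window longer than
# the model's config allows. Must be set before the engine is created,
# and it risks the incorrect outputs / CUDA errors the ValueError warns about.
# os.environ["VLLM_ALLOW_LONG_MAX_MODEL_LEN"] = "1"

# Option 1 (recommended): cap the context window at the 4096 tokens the
# Math models declare in their config.json.
llm = LLM(model="Qwen/Qwen2.5-Math-1.5B", dtype="float16", max_model_len=4096)

params = SamplingParams(temperature=0.0, max_tokens=2048)
outputs = llm.generate(["Compute the sum of the first 100 positive integers."], params)
print(outputs[0].outputs[0].text)
```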
@Maxwell-Jia Thank you so much for your response. I tried setting max_model_length to 4096. However, another issue arises:

```
[rank0]: ValueError: please provide at least one prompt
```

I used the following command:

Any ideas, please?
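One possibility worth ruling out (my assumption, not something confirmed in this thread) is that evaluation prompts no longer fit inside the 4096-token window once the limit is lowered, leaving vLLM with nothing to run. A quick sanity check with the model's tokenizer; the prompt here is a hypothetical stand-in for one from the failing task:

```python
from transformers import AutoTokenizer

MAX_LEN = 4096  # context limit of the Qwen2.5-Math models

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")

# Substitute a real prompt from the failing evaluation task here.
prompt = "Solve for x: 2x + 3 = 11. Show your reasoning step by step."

n_tokens = len(tokenizer(prompt)["input_ids"])
print(f"{n_tokens} tokens (limit {MAX_LEN})")
```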
I also have another question. Since the Qwen-Math series has a context length limitation of 4096, and deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B is fine-tuned from Qwen-Math-Base, why does DeepSeek-R1-Distill-Qwen-1.5B not have a context limitation of 4096, as shown in the official demonstration?
@ChenDRAG I think DeepSeek increased the context length of the Qwen2.5-Math models from 4096 to 131072 during the distillation process, although they do not explicitly mention this in their technical report. Compare the model configs of Qwen2.5-Math-1.5B and DeepSeek-R1-Distill-Qwen-1.5B.
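The difference is visible directly in the published model configs. A small sketch to verify it (the printed values are the ones quoted above, assuming the configs on the Hugging Face Hub are unchanged):

```python
from transformers import AutoConfig

# Base math model: 4096-token context window.
math_cfg = AutoConfig.from_pretrained("Qwen/Qwen2.5-Math-1.5B")
print(math_cfg.max_position_embeddings)  # 4096

# Distilled model: context window raised to 131072.
distill_cfg = AutoConfig.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")
print(distill_cfg.max_position_embeddings)  # 131072
```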