[Bug] Test results differ between specifying a custom dataset path directly and using mm_custom_gen

### Operating system and version

openEuler 22.03 LTS

### Python environment where the tool is installed

Python environment inside a Docker container

### Python version

3.11

### AISBench tool version

3.0.0

### AISBench command executed

ais_bench \
    --models vllm_api_stream_chat \
    --datasets mm_custom_gen \
    --mode perf

ais_bench \
    --models vllm_api_stream_chat \
    --custom-dataset-path /home/zzr/benchmark/ais_bench/datasets/textvqa/textvqa_json/textvqa_val_1.jsonl \
    --mode perf

### Model configuration file or custom configuration file contents

from ais_bench.benchmark.models import VLLMCustomAPIChat
from ais_bench.benchmark.utils.postprocess.model_postprocessors import extract_non_reasoning_content

models = [
    dict(
        attr="service",
        type=VLLMCustomAPIChat,
        abbr="vllm-api-stream-chat",
        path="/home/Qwen2.5-VL-72B-Instruct/",
        model="qwen2_vl",
        stream=True,
        request_rate=0,
        retry=2,
        host_ip="127.0.0.8",
        host_port=1025,
        max_out_len=1024,
        batch_size=32,
        trust_remote_code=False,
        generation_kwargs=dict(
            temperature=0.5,
            top_k=10,
            top_p=0.95,
            seed=None,
            repetition_penalty=1.03,
            ignore_eos=True,
        ),
        pred_postprocessor=dict(type=extract_non_reasoning_content),
    )
]

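### Expected behavior

Running the same images with the same parameters should give roughly the same results whether the dataset is specified directly by path (`--custom-dataset-path`) or through `mm_custom_gen`.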
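To make "roughly the same" concrete, one option is to compare the performance metrics of the two runs by relative difference. The following is only a minimal sketch: the helper names `relative_diff` and `compare_runs`, the 5% tolerance, and the metric names and numbers are all hypothetical and are not taken from AISBench or from the measurements in this issue.

```python
def relative_diff(a: float, b: float) -> float:
    """Symmetric relative difference between two measurements."""
    denom = max(abs(a), abs(b))
    return 0.0 if denom == 0 else abs(a - b) / denom


def compare_runs(run_a: dict, run_b: dict, tolerance: float = 0.05) -> dict:
    """Return the metrics present in both runs whose relative difference exceeds the tolerance."""
    mismatches = {}
    for name in sorted(run_a.keys() & run_b.keys()):
        diff = relative_diff(run_a[name], run_b[name])
        if diff > tolerance:
            mismatches[name] = diff
    return mismatches


if __name__ == "__main__":
    # Placeholder values for illustration only, not results from this issue.
    path_run = {"metric_a": 100.0, "metric_b": 20.0}
    config_run = {"metric_a": 102.0, "metric_b": 30.0}
    print(compare_runs(path_run, config_run))
```

Any metric reported by `compare_runs` differs by more than the chosen tolerance between the two invocation methods; the tolerance itself is a judgment call and would need to account for normal run-to-run variation.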

### Actual behavior

<img width="1064" height="223" alt="Image" src="https://github.com/user-attachments/assets/6c09f6cd-4aea-4202-83c1-ed74eb669fa6" />

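### Pre-submission checks

- [x] I have read the quick start guide in the main documentation and it did not resolve the issue
- [x] I have searched the FAQ and found no duplicate
- [x] I have searched existing issues and found no duplicate
- [x] I have updated to the latest version and the issue persists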
