Description
Operating system and version
Ubuntu 24.04.3 LTS
Python environment where the tool is installed
Python environment inside a Docker container
Python version
3.11
AISBench tool version
3.0.20251103
AISBench command executed
ais_bench --models vllm_api_general_chat --datasets gpqa_gen_0_shot_str --dump-eval-details
Contents of the model configuration file or custom configuration file
from ais_bench.benchmark.models import VLLMCustomAPIChat
from ais_bench.benchmark.utils.model_postprocessors import extract_non_reasoning_content
models = [
    dict(
        attr="service",
        type=VLLMCustomAPIChat,
        abbr='vllm-api-general-chat',
        path="/home/weight/Qwen3-Next-80B-A3B-Instruct-W8A8",
        model="model",
        request_rate = 0,
        retry = 2,
        host_ip = "localhost",
        host_port = 8012,
        max_out_len = 32768,
        batch_size=64,
        trust_remote_code=False,
        generation_kwargs = dict(
            temperature = 0.5,
            top_k = 10,
            top_p = 0.95,
            seed = None,
            repetition_penalty = 1.03,
        ),
        pred_postprocessor=dict(type=extract_non_reasoning_content)
    )
]
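Expected behavior
The predictions extracted in results/vllm-api-general-chat/GPQA_diamond.json should match the answer the model gives in origin_prediction.
Actual behavior
The extracted predictions do not match the answer stated in the model's origin_prediction. In the entry below, the model concludes with "The correct answer is (C)", but the extracted prediction is "A":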
"133": {
"prompt": "What is the correct answer to this question: Calculate the enthalpy of neutralization when 500 mL 0.2 M HCl, 300 mL 0.3 M H2SO4 and 200 mL 0.5 M Ba(OH)2 are mixed.\nChoices:\n(A)-16.0 kJ\n(B)-11.42 kcal\n(C)-2.72 kcal\n(D)-3.80 kcal\nFormat your response as follows: "The correct answer is (insert answer here)"",
"origin_prediction": "To calculate the enthalpy of neutralization, we need to determine how many moles of H⁺ and OH⁻ ions react, since neutralization is governed by the reaction:\n\nH⁺ + OH⁻ → H₂O, ΔH ≈ -57.3 kJ/mol (or -13.7 kcal/mol) — standard enthalpy of neutralization for strong acid and strong base.\n\n---\n\n### Step 1: Calculate moles of H⁺ from acids\n\n#### HCl (monoprotic):\nVolume = 500 mL = 0.5 L \nConcentration = 0.2 M \n→ Moles of HCl = 0.5 × 0.2 = 0.1 mol H⁺\n\n#### H₂SO₄ (diprotic):\nVolume = 300 mL = 0.3 L \nConcentration = 0.3 M \n→ Moles of H₂SO₄ = 0.3 × 0.3 = 0.09 mol \n→ Moles of H⁺ = 2 × 0.09 = 0.18 mol H⁺\n\nTotal H⁺ = 0.1 + 0.18 = 0.28 mol\n\n---\n\n### Step 2: Calculate moles of OH⁻ from base\n\nBa(OH)₂ → provides 2 OH⁻ per molecule \nVolume = 200 mL = 0.2 L \nConcentration = 0.5 M \n→ Moles of Ba(OH)₂ = 0.2 × 0.5 = 0.1 mol \n→ Moles of OH⁻ = 2 × 0.1 = 0.2 mol OH⁻\n\n---\n\n### Step 3: Determine limiting reactant\n\nWe have:\n- H⁺ = 0.28 mol\n- OH⁻ = 0.2 mol\n\nSo, OH⁻ is limiting → only 0.2 mol of H⁺ will react with 0.2 mol of OH⁻\n\nThus, moles of water formed = 0.2 mol\n\n---\n\n### Step 4: Calculate enthalpy change\n\nStandard ΔH_neutralization = -57.3 kJ/mol (for H⁺ + OH⁻ → H₂O)\n\nSo, for 0.2 mol:\n\nΔH = 0.2 × (-57.3) = -11.46 kJ\n\nNow convert to kcal (since choices are in kcal):\n\n1 kcal = 4.184 kJ \n→ -11.46 kJ ÷ 4.184 ≈ -2.74 kcal\n\nThis is very close to -2.72 kcal (choice C), likely due to rounding or slight variation in standard value used (sometimes -13.7 kcal/mol is used).\n\nLet’s check using kcal directly:\n\nStandard ΔH = -13.7 kcal/mol (commonly used value for strong acid-strong base)\n\nThen: \nΔH = 0.2 mol × (-13.7 kcal/mol) = -2.74 kcal\n\nAgain, rounds to -2.72 kcal if we use -13.6 kcal/mol:\n\n0.2 × 13.6 = 2.72 → so they likely used -13.6 kcal/mol\n\nThat’s acceptable; values vary slightly by source.\n\n---\n\n### Final Answer:\n\nThe correct answer is (C) -2.72 kcal",
"predictions": [
"A"
],
"references": [
"C"
],
"correct": [
false
]
}
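To check how many entries are affected, a small script along these lines may help. It is only a sketch: `stated_answer` and `find_mismatches` are hypothetical helpers, not part of AISBench, and it assumes the details are available as a dict that maps entry ids to records with `origin_prediction` and `predictions` fields, as in the entry above. It does not reproduce AISBench's own extraction logic; it only compares the letter the model states in the requested format against the extracted prediction.

```python
import re

# Matches the response format requested in the prompt:
# "The correct answer is (X)". The opening parenthesis is optional.
ANSWER_RE = re.compile(r"The correct answer is\s*\(?([A-D])\b")


def stated_answer(origin_prediction):
    """Return the last option letter the model explicitly states, or None."""
    matches = ANSWER_RE.findall(origin_prediction)
    return matches[-1] if matches else None


def find_mismatches(details):
    """Return (entry_id, stated, extracted) tuples where the two disagree.

    `details` is assumed to map entry ids to records shaped like the
    example in this issue; records without `origin_prediction` are skipped.
    """
    mismatches = []
    for entry_id, entry in details.items():
        if not isinstance(entry, dict) or "origin_prediction" not in entry:
            continue
        stated = stated_answer(entry["origin_prediction"])
        extracted = (entry.get("predictions") or [None])[0]
        # Only report entries where the model stated an answer in the
        # requested format and the extracted prediction differs from it.
        if stated is not None and stated != extracted:
            mismatches.append((entry_id, stated, extracted))
    return mismatches
```

The dict can be obtained by loading the results file with `json.load` and, if the per-sample records are nested under another key, passing that nested dict instead. Taking the last match rather than the first is a deliberate choice here, since a model may mention the phrase earlier while reasoning and only commit to an answer at the end.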
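Pre-submission checklist
- I have read the Quick Start guide in the main documentation, and it did not resolve the issue
- I have searched the FAQ and found no duplicate
- I have searched existing Issues and found no duplicate
- I have updated to the latest version, and the issue persists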