Reproducing the experimental results #52

@aWwwei

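How was the accuracy in Figure 3 of the paper evaluated? Is there a specific evaluation script or prompt? Using the stock Llama-3.2-3B on the test set provided by the authors, I cannot reproduce the accuracy of about 0.28 shown in the figure, so I would like to ask about the evaluation details.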
