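Error description
Step 1, Option 2: Upload Custom Local Datasets states that dataset, source, question, and answer are required fields. However, running scripts/data/upload_dataset.py on data that follows the officially provided template fails with an error.
Single record
{"dataset": "YourDataset", "source": "training_free_grpo", "question": "What is 2+2?", "answer": "4"}
Error output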
Traceback (most recent call last):
  File "C:\Users\IT_la\Desktop\mbpp_youtu\youtu-agent\scripts\data\upload_dataset.py", line 78, in <module>
    main()
    ~~~~^^
  File "C:\Users\IT_la\Desktop\mbpp_youtu\youtu-agent\scripts\data\upload_dataset.py", line 74, in main
    upload_dataset(args.file_path, args.dataset_name, data_format=args.data_format)
    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\IT_la\Desktop\mbpp_youtu\youtu-agent\scripts\data\upload_dataset.py", line 47, in upload_dataset
    dataset_sample = convert_format_llamafactory(data)
  File "C:\Users\IT_la\Desktop\mbpp_youtu\youtu-agent\scripts\data\upload_dataset.py", line 25, in convert_format_llamafactory
    assert len(question) > 0, "Either 'instruction' or 'input' must be provided."
           ^^^^^^^^^^^^^^^^^
AssertionError: Either 'instruction' or 'input' must be provided.
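Error analysis
I traced the error to scripts/data/upload_dataset.py.
In the function convert_format_llamafactory, the code that raises the AssertionError is:
question = [data.get("instruction", None), data.get("input", None)]
question = [s for s in question if s is not None and s.strip() != ""]
assert len(question) > 0, "Either 'instruction' or 'input' must be provided."
The function only reads the instruction and input keys, so a record that carries its text under question, as the documented template does, always fails the assertion. Renaming the question field in the data to instruction or input lets the upload succeed.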
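As a minimal sketch of the mismatch and the workaround, the snippet below re-creates only the three quoted lines as a standalone check and applies the rename to the template record. The names `has_llamafactory_question` and `rename_question_field` are hypothetical helpers written for this illustration; they are not part of upload_dataset.py, and the rest of convert_format_llamafactory is not reproduced here.

```python
import json


def has_llamafactory_question(data: dict) -> bool:
    """Standalone copy of the quoted check: True if 'instruction' or 'input' is non-empty."""
    question = [data.get("instruction", None), data.get("input", None)]
    question = [s for s in question if s is not None and s.strip() != ""]
    return len(question) > 0


def rename_question_field(record: dict, target: str = "instruction") -> dict:
    """Hypothetical helper: return a copy of the record with 'question' renamed to `target`."""
    fixed = dict(record)
    if "question" in fixed and target not in fixed:
        fixed[target] = fixed.pop("question")
    return fixed


# The single record from the documented template.
template_line = (
    '{"dataset": "YourDataset", "source": "training_free_grpo", '
    '"question": "What is 2+2?", "answer": "4"}'
)
record = json.loads(template_line)
fixed = rename_question_field(record)

print(has_llamafactory_question(record))  # the template record fails the check
print(has_llamafactory_question(fixed))   # the renamed record passes it
print(json.dumps(fixed, ensure_ascii=False))
```

Applying the same rename to every line of a JSONL file before calling the upload script is one way to use the template data without modifying the script itself.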
