Official codebase for the paper "ChinaTravel: A Real-World Benchmark for Language Agents in Chinese Travel Planning".
| Webpage | Paper | Dataset (Hugging Face), Dataset (ModelScope) |
- Create a conda environment and install dependencies:
conda create -n chinatravel python=3.9
conda activate chinatravel
pip install -r requirements.txt
- Download the database and unzip it to the chinatravel/environment/ directory
Download Links: Google Drive, NJU Drive
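The unzip step can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name `unpack_database` is hypothetical and not part of the repository, and the archive path is whatever file you downloaded from one of the links above.

```python
import zipfile
from pathlib import Path


def unpack_database(archive_path, target_dir="chinatravel/environment"):
    """Extract a downloaded database archive into the environment directory.

    `archive_path` is the zip file you downloaded; its name is not fixed here.
    Returns the sorted top-level entries found in the archive.
    """
    archive = Path(archive_path)
    target = Path(target_dir)
    if not archive.is_file():
        raise FileNotFoundError(f"archive not found: {archive}")
    # Create the target directory if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
        # Collect the top-level names so the caller can see what was unpacked.
        return sorted({name.split("/")[0] for name in zf.namelist()})
```

Calling `unpack_database("path/to/downloaded.zip")` from the project root should place the archive contents under chinatravel/environment/; the path shown is a placeholder.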
We support deepseek (the official DeepSeek API), gpt-4o (chatgpt-4o-latest), glm4-plus, and local inference with qwen (Qwen2.5-7B-Instruct).
export OPENAI_API_KEY=""
python run_exp.py --splits easy --agent LLMNeSy --llm deepseek --oracle_translation
python run_exp.py --splits medium --agent LLMNeSy --llm deepseek --oracle_translation
python run_exp.py --splits human --agent LLMNeSy --llm deepseek --oracle_translation
python run_exp.py --splits human --agent LLMNeSy --llm deepseek
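To run several splits in a row, the commands above can be assembled programmatically. This is a rough sketch, not part of the repository: the helper `build_run_command` is hypothetical, and it only uses the flags that appear in the commands above (`--splits`, `--agent`, `--llm`, `--oracle_translation`).

```python
import shlex


def build_run_command(split, agent="LLMNeSy", llm="deepseek", oracle_translation=False):
    """Build the argument list for one run_exp.py invocation."""
    cmd = ["python", "run_exp.py", "--splits", split, "--agent", agent, "--llm", llm]
    if oracle_translation:
        # Flag taken verbatim from the README commands.
        cmd.append("--oracle_translation")
    return cmd


# Print the oracle-translation commands for the three splits shown above.
for split in ("easy", "medium", "human"):
    print(shlex.join(build_run_command(split, oracle_translation=True)))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the project root to execute the run instead of printing it.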
Note: for local inference with qwen, please download the model weights to "project_root_path/chinatravel/open_source_llm/Qwen2.5-7B-Instruct/".
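Before starting a local run, it may help to confirm that the weights are where the note above expects them. The sketch below is an assumption-laden convenience check, not repository code: the helper names are hypothetical, and it only verifies that the directory exists and is non-empty, not that the weights are complete.

```python
from pathlib import Path


def qwen_weights_dir(project_root):
    """Return the expected weights directory under the given project root."""
    return Path(project_root) / "chinatravel" / "open_source_llm" / "Qwen2.5-7B-Instruct"


def weights_present(project_root):
    """True if the weights directory exists and contains at least one entry."""
    weights_dir = qwen_weights_dir(project_root)
    return weights_dir.is_dir() and any(weights_dir.iterdir())
```

For example, `weights_present(".")` run from the project root returns `False` until the model files have been downloaded.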
python eval_exp.py --splits human --method LLMNeSy_deepseek_oracletranslation
python eval_exp.py --splits human --method LLMNeSy_deepseek
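The two `--method` values above appear to follow the pattern `<agent>_<llm>`, with `_oracletranslation` appended for runs that used `--oracle_translation`. That pattern is inferred from these two examples only and is not confirmed for other configurations. Under that assumption, a hypothetical helper could build the matching evaluation command:

```python
def method_name(agent, llm, oracle_translation=False):
    """Derive the --method value, assuming the naming seen in the README examples."""
    name = f"{agent}_{llm}"
    if oracle_translation:
        name += "_oracletranslation"
    return name


def build_eval_command(split, agent="LLMNeSy", llm="deepseek", oracle_translation=False):
    """Build the argument list for one eval_exp.py invocation."""
    return [
        "python", "eval_exp.py",
        "--splits", split,
        "--method", method_name(agent, llm, oracle_translation),
    ]
```

If a run was produced with different settings, check the name of its results directory rather than relying on this inferred pattern.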
If you have any problems, please contact Jie-Jing Shao, Bo-Wen Zhang, or Xiao-Wen Yang.
If our paper or related resources prove valuable to your research, we kindly ask that you cite our paper:
@misc{shao2024chinatravelrealworldbenchmarklanguage,
title={ChinaTravel: A Real-World Benchmark for Language Agents in Chinese Travel Planning},
author={Jie-Jing Shao and Xiao-Wen Yang and Bo-Wen Zhang and Baizhi Chen and Wen-Da Wei and Guohao Cai and Zhenhua Dong and Lan-Zhe Guo and Yu-feng Li},
year={2024},
eprint={2412.13682},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2412.13682},
}