LAMDASZ-ML/ChinaTravel


ChinaTravel: A Real-World Benchmark for Language Agents in Chinese Travel Planning

Official codebase for the paper "ChinaTravel: A Real-World Benchmark for Language Agents in Chinese Travel Planning".

| Webpage | Paper | Dataset(Huggingface), Dataset(ModelScope) |

Overview

Quick Start

Setup

  1. Create a conda environment and install dependencies:
conda create -n chinatravel python=3.9
conda activate chinatravel
pip install -r requirements.txt
  2. Download the database and unzip it to the chinatravel/environment/ directory.

Download Links: Google Drive, NJU Drive
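The unzip step can also be scripted. The following is a minimal sketch, not part of the official codebase: the helper name `extract_database` and the archive path passed to it are assumptions, since the name of the downloaded file is not stated here. Only the target directory `chinatravel/environment/` comes from the instructions above.

```python
import zipfile
from pathlib import Path


def extract_database(archive_path, target_dir="chinatravel/environment"):
    """Unzip a downloaded database archive into the environment directory.

    `archive_path` is wherever you saved the download (hypothetical; the
    actual file name depends on the download link you used).
    Returns the sorted list of archive member names for a quick sanity check.
    """
    archive = Path(archive_path)
    target = Path(target_dir)
    if not archive.is_file():
        raise FileNotFoundError(f"Archive not found: {archive}")
    # Create the target directory if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
        return sorted(zf.namelist())
```

Run from the repository root, a call such as `extract_database("path/to/your_download.zip")` would place the archive contents under `chinatravel/environment/`.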

Running

We support deepseek (the official DeepSeek API), gpt-4o (chatgpt-4o-latest), glm4-plus, and local inference with qwen (Qwen2.5-7B-Instruct).

export OPENAI_API_KEY=""

python run_exp.py --splits easy --agent LLMNeSy --llm deepseek --oracle_translation
python run_exp.py --splits medium --agent LLMNeSy --llm deepseek --oracle_translation
python run_exp.py --splits human --agent LLMNeSy --llm deepseek --oracle_translation


python run_exp.py --splits human --agent LLMNeSy --llm deepseek 

Note: for local inference with qwen, please download the model weights to "project_root_path/chinatravel/open_source_llm/Qwen2.5-7B-Instruct/".
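The run commands above differ only in the split and in whether `--oracle_translation` is passed, so they can be generated programmatically. This is a rough sketch under that assumption: `build_run_command` is a hypothetical helper, and it uses only the flags that appear in the commands above.

```python
def build_run_command(split, agent="LLMNeSy", llm="deepseek", oracle_translation=False):
    """Build the argument list for one run_exp.py invocation.

    Uses only the flags shown in the README commands: --splits, --agent,
    --llm, and the optional --oracle_translation switch.
    """
    cmd = ["python", "run_exp.py", "--splits", split, "--agent", agent, "--llm", llm]
    if oracle_translation:
        cmd.append("--oracle_translation")
    return cmd


def build_all_commands(splits=("easy", "medium", "human"), **kwargs):
    """Build one command per split, in the order given."""
    return [build_run_command(split, **kwargs) for split in splits]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, with the API key exported beforehand as shown above.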

Evaluation

python eval_exp.py --splits human --method LLMNeSy_deepseek_oracletranslation
python eval_exp.py --splits human --method LLMNeSy_deepseek
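Judging only from the two examples above, the `--method` value appears to be the agent name and the LLM name joined by an underscore, with `_oracletranslation` appended for runs that used `--oracle_translation`. The sketch below encodes that inferred pattern. `method_name` is a hypothetical helper, and the pattern is an assumption that may not hold for other agents or options.

```python
def method_name(agent, llm, oracle_translation=False):
    """Compose the --method string for eval_exp.py.

    The naming pattern is inferred from the two README examples only
    (an assumption, not a documented rule).
    """
    name = f"{agent}_{llm}"
    if oracle_translation:
        name += "_oracletranslation"
    return name


def build_eval_command(split, agent="LLMNeSy", llm="deepseek", oracle_translation=False):
    """Build the argument list for one eval_exp.py invocation."""
    method = method_name(agent, llm, oracle_translation)
    return ["python", "eval_exp.py", "--splits", split, "--method", method]
```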

Docs

Environment

Contact

If you have any problems, please contact Jie-Jing Shao, Bo-Wen Zhang, Xiao-Wen Yang.

Citation

If our paper or related resources prove valuable to your research, we kindly ask that you cite it:

@misc{shao2024chinatravelrealworldbenchmarklanguage,
      title={ChinaTravel: A Real-World Benchmark for Language Agents in Chinese Travel Planning}, 
      author={Jie-Jing Shao and Xiao-Wen Yang and Bo-Wen Zhang and Baizhi Chen and Wen-Da Wei and Guohao Cai and Zhenhua Dong and Lan-Zhe Guo and Yu-feng Li},
      year={2024},
      eprint={2412.13682},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2412.13682}, 
}
