Temporal-Spatial Object Relations Modeling for Vision-and-Language Navigation

Prerequisites

Our model was trained and evaluated using the following package dependencies:

PyTorch 2.1.1
Python 3.9.18
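
A quick way to see whether a local environment is close to the versions listed above is to compare them programmatically. The following is a minimal sketch, not part of the repository: the helper names (`major_minor`, `installed_versions`, `report`) are hypothetical, and comparing only the major and minor components is an assumption about how strict the match needs to be.

```python
import sys
from importlib import metadata

# Versions listed in the README, treated here as the reference environment.
EXPECTED = {"python": "3.9.18", "torch": "2.1.1"}


def major_minor(version: str) -> tuple:
    """Return (major, minor) as ints, ignoring patch and local tags such as '+cu121'."""
    parts = version.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])


def installed_versions() -> dict:
    """Collect the running Python version and the installed torch version (None if absent)."""
    found = {"python": ".".join(map(str, sys.version_info[:3]))}
    try:
        found["torch"] = metadata.version("torch")
    except metadata.PackageNotFoundError:
        found["torch"] = None
    return found


def report(found: dict, expected: dict = EXPECTED) -> list:
    """Return warnings for packages that are missing or differ in (major, minor)."""
    warnings = []
    for name, want in expected.items():
        have = found.get(name)
        if have is None:
            warnings.append(f"{name} is not installed (README lists {want})")
        elif major_minor(have) != major_minor(want):
            warnings.append(f"{name} {have} differs from README version {want}")
    return warnings


if __name__ == "__main__":
    for line in report(installed_versions()) or ["environment matches the README versions"]:
        print(line)
```

Other versions may work as well; the README only states which ones were used for training and evaluation.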

Install the Matterport3D simulator: follow the instructions here.
Download the data from here and put it in the datasets directory.
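
Before starting a long training run, it can help to confirm that the data actually landed in the datasets directory. The sketch below is an illustration only: `dataset_status` is a hypothetical helper, and it assumes the `datasets` directory sits in the current working directory. The README does not describe the expected contents, so the check only distinguishes a missing, empty, or non-empty directory.

```python
from pathlib import Path


def dataset_status(root: str = "datasets") -> str:
    """Return 'missing', 'empty', or 'ok' for the given data directory."""
    path = Path(root)
    if not path.is_dir():
        return "missing"
    # any() stops at the first entry, so large directories are not fully listed.
    if not any(path.iterdir()):
        return "empty"
    return "ok"


if __name__ == "__main__":
    print(f"datasets directory: {dataset_status()}")
```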

Pre-training

cd pretrain_src
bash pretrain.sh

Fine-tuning

cd main_src
bash scripts/train.sh 8001
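
The two stages above can also be chained from a single script. This is a minimal sketch under stated assumptions: `STAGES`, `plan`, and `run_all` are hypothetical names, the script is assumed to run from the repository root, and the argument `8001` is passed through unchanged because the README does not explain it. Whether fine-tuning picks up the pre-trained weights automatically is not stated in the README, so check the scripts before relying on this.

```python
import subprocess
from pathlib import Path

# (working directory, command) pairs copied from the README, in order.
STAGES = [
    ("pretrain_src", ["bash", "pretrain.sh"]),
    ("main_src", ["bash", "scripts/train.sh", "8001"]),
]


def plan(repo_root: str = ".") -> list:
    """Resolve each stage's working directory against the repository root."""
    root = Path(repo_root)
    return [(str(root / cwd), cmd) for cwd, cmd in STAGES]


def run_all(repo_root: str = ".", dry_run: bool = True) -> None:
    """Print each stage and, unless dry_run is set, execute it in its directory."""
    for cwd, cmd in plan(repo_root):
        print(f"[{cwd}] {' '.join(cmd)}")
        if not dry_run:
            # check=True stops the sequence if a stage exits with a non-zero status.
            subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

With `dry_run=True` the script only prints the commands; calling `run_all(dry_run=False)` executes them in order.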

Citation

If you find this work useful in your research, please cite the following paper:

# BibTeX
@ARTICLE{11141526,
  author={Huang, Bowen and Zheng, Yanwei and Sui, Dongchen and Lan, Chuanlin and Zhao, Xinpeng and Zhang, Xiao and Meng, Jingke and Xiao, Mengbai and Zou, Yifei and Yu, Dongxiao},
  journal={IEEE Transactions on Intelligent Transportation Systems}, 
  title={Temporal-Spatial Object Relations Modeling for Vision-and-Language Navigation}, 
  year={2025},
  volume={},
  number={},
  pages={1-15},
  keywords={Navigation;Training;Feature extraction;Visualization;Transformers;Trajectory;Turning;Planning;Reinforcement learning;Large language models;Vision-and-language navigation;temporal object relations;spatial object relations;turning back penalty},
  doi={10.1109/TITS.2025.3599649}}

# GB/T 7714
[1] Huang B, Zheng Y, Lan C, et al. Temporal-Spatial Object Relations Modeling for Vision-and-Language Navigation[J]. 2024. DOI:10.1109/TITS.2025.3599649.

# MLA
[1] Huang, Bowen, et al. "Temporal-Spatial Object Relations Modeling for Vision-and-Language Navigation." (2024).

# APA
[1] Huang, B., Zheng, Y., Lan, C., Zhao, X., Zou, Y., & Yu, D. (2024). Temporal-spatial object relations modeling for vision-and-language navigation.

Acknowledgement

The codebase is based on DUET.
