BMIP: Bi-directional Modality Interaction Prompt Learning for VLM
Song-Lin Lv, Yu-Yang Chen, Zhi Zhou, Ming Yang, Lan-Zhe Guo
Official implementation of the paper "BMIP: Bi-directional Modality Interaction Prompt Learning for VLM".
- (Aug 16, 2025) Paper accepted at IJCAI 2025 🎉
- (Aug 14, 2025) The repository also supports CoOp, Co-CoOp, Deep Vision Prompting, Deep Language Prompting, and Independent V-L Prompting architectures.
Abstract: Vision-language models (VLMs) have exhibited remarkable generalization capabilities, and prompt learning for VLMs has attracted great attention for its ability to adapt pre-trained VLMs to specific downstream tasks. However, existing studies mainly focus on single-modal prompts or uni-directional modality interaction, overlooking the powerful alignment effects resulting from the interaction between the vision and language modalities. To this end, we propose a novel prompt learning method called Bi-directional Modality Interaction Prompt (BMIP), which dynamically weights bi-modal information by learning information from the attention layer, enhancing trainability and inter-modal consistency compared to simple information aggregation methods. To evaluate the effectiveness of prompt learning methods, we propose a more realistic evaluation paradigm called open-world generalization, complementing the widely adopted cross-dataset transfer and domain generalization tasks. Comprehensive experiments on various datasets reveal that BMIP not only outperforms current state-of-the-art methods across all three evaluation paradigms but is also flexible enough to be combined with other prompt-based methods for consistent performance enhancement.
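To give a flavor of the idea, below is a minimal, hypothetical PyTorch sketch of attention-conditioned bi-directional prompt aggregation. All names, shapes, and the gating function are illustrative assumptions rather than the authors' implementation; please refer to the paper and the trainer code in this repository for the actual method.

```python
import torch
import torch.nn as nn

class BiModalPromptInteraction(nn.Module):
    """Illustrative sketch only (not the official BMIP module).

    Couples vision and language prompts in both directions and aggregates
    them with weights predicted from attention-layer information, instead
    of a fixed or uni-directional coupling. Assumes the same number of
    prompt tokens `n` in both modalities.
    """

    def __init__(self, dim_v: int = 768, dim_t: int = 512):
        super().__init__()
        self.t2v = nn.Linear(dim_t, dim_v)  # language -> vision projection
        self.v2t = nn.Linear(dim_v, dim_t)  # vision -> language projection
        # Tiny gates mapping a per-prompt attention statistic to a weight in (0, 1).
        self.gate_v = nn.Sequential(nn.Linear(1, 1), nn.Sigmoid())
        self.gate_t = nn.Sequential(nn.Linear(1, 1), nn.Sigmoid())

    def forward(self, vis_prompts, txt_prompts, vis_attn, txt_attn):
        # vis_prompts: (n, dim_v); txt_prompts: (n, dim_t)
        # vis_attn / txt_attn: (n, 1) per-prompt attention statistics taken from
        # the current encoder layer (e.g. mean attention mass on each prompt token).
        w_v = self.gate_v(vis_attn)  # (n, 1) learned aggregation weights
        w_t = self.gate_t(txt_attn)
        new_vis = w_v * vis_prompts + (1.0 - w_v) * self.t2v(txt_prompts)
        new_txt = w_t * txt_prompts + (1.0 - w_t) * self.v2t(vis_prompts)
        return new_vis, new_txt

# Toy usage with random tensors.
layer = BiModalPromptInteraction()
v, t = layer(torch.randn(2, 768), torch.randn(2, 512),
             torch.rand(2, 1), torch.rand(2, 1))
```

For comparison, MaPLe couples prompts only from language to vision; the point of the sketch is simply to show where learned, attention-conditioned weights replace a fixed one-way coupling.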
- Novel Bi-directional Modality Interaction Technique
  - Enhances cross-modality alignment and paves the way for further exploration of information aggregation in other multi-modal models.
- New Evaluation Paradigm: Open-World Generalization
  - Facilitates more realistic evaluations and promotes related research.
- Flexible Integration with Other Methods
  - BMIP is flexible enough to combine with other prompt learning methods, consistently boosting their performance.
- State-of-the-Art Performance
  - BMIP achieves SOTA performance across all three evaluation paradigms.
Method | Paper | Configs | Training Scripts |
---|---|---|---|
BMIP | IJCAI 2025 | link | link |
MaPLe | CVPR 2023 | link | link |
CoOp | IJCV 2022 | link | link |
Co-CoOp | CVPR 2022 | link | link |
Deep Vision Prompting | - | link | link |
Deep Language Prompting | - | link | link |
Independent V-L Prompting | - | link | link |
Results reported below show accuracy on base and novel classes across 11 recognition datasets, averaged over 3 seeds. HM denotes the harmonic mean of base and novel accuracy.
Name | Base Acc. | Novel Acc. | HM | Epochs |
---|---|---|---|---|
CLIP | 69.34 | 74.22 | 71.70 | - |
CoOp | 82.69 | 63.22 | 71.66 | 200 |
CoCoOp | 80.47 | 71.69 | 75.83 | 10 |
MaPLe | 82.28 | 75.14 | 78.55 | 5 |
BMIP | 83.47 | 76.69 | 79.04 | 10 |
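As a quick sanity check on the HM column, the harmonic mean can be computed as follows:

```python
def harmonic_mean(base_acc: float, novel_acc: float) -> float:
    """Harmonic mean (HM) of base-class and novel-class accuracy."""
    return 2 * base_acc * novel_acc / (base_acc + novel_acc)

# Example: the MaPLe row above.
print(round(harmonic_mean(82.28, 75.14), 2))  # 78.55
```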
For installation and other package requirements, please follow the instructions detailed in INSTALL.md.
Please follow the instructions at DATASETS.md to prepare all datasets.
Please refer to RUN.md for detailed instructions on training, evaluating, and reproducing the results using our pre-trained models.
If you use our work, please consider citing:
@misc{lv2025bmipbidirectionalmodalityinteraction,
  title={BMIP: Bi-directional Modality Interaction Prompt Learning for VLM},
  author={Song-Lin Lv and Yu-Yang Chen and Zhi Zhou and Ming Yang and Lan-Zhe Guo},
  year={2025},
  eprint={2501.07769},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2501.07769},
}
If you have any questions, please create an issue on this repository or contact us at [email protected].
Our code is based on the Co-CoOp and CoOp repositories. We thank the authors for releasing their code. If you use our model and code, please consider citing these works as well.