Code release for Proto-CLIP [ arXiv | Project Page ]
- To download the datasets, please follow the details in DATASET.md.
- To download the FewSOL dataset variants [52 | 198], please use this link.
- Note: please make sure to place all the datasets in the `DATA/` directory (a sketch of the layout follows).
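For reference, each dataset simply lives in its own folder under `DATA/`. The layout below is illustrative only; the exact folder names come from DATASET.md:

```
DATA/
├── fewsol/   # FewSOL variant [52 | 198]
└── ...       # remaining datasets listed in DATASET.md
```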
# create conda environment
conda create -n proto-clip python=3.9
# activate the environment
conda activate proto-clip
# install dependencies
pip install -r requirements.txt
# Install the according versions of torch and torchvision
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
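An optional sanity check that the environment sees PyTorch and the GPU:

```bash
# verify the torch install and CUDA availability
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```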
| Adapter | Adapter-Alias |
| ------- | ------------- |
| 3xConv  | conv-3x       |
| 2xConv  | conv-2x       |
| MLP     | fc            |

- For details about adapter aliases, please check the supplementary material.
- For dataset aliases, please check `datasets/__init__.py`.
CUDA_VISIBLE_DEVICES=<GPU_ID> \
python main.py \
--config <configs-file> \
--dataset <dataset-alias> \
--logs tb_logs \
--alpha <alpha> \
--beta <beta> \
--adapter <adapter-alias> \
<vl-flag> \
<test-flag>
- `config-file`: configuration file path for the experiment. Default config files are in the `configs/` directory.
- `dataset-alias`: alias of the dataset to be used for the experiment.
- `alpha`: alpha hyperparameter for the selected dataset.
- `beta`: beta hyperparameter for the selected dataset.
- `adapter-alias`: adapter alias for the experiment.
- `vl-flag`: to train the text memory as well, use `""`; to train only the visual memory, use `"--train_vis_memory_only"`.
- `test-flag`: to train, use `""`; to only run testing, use `"--only_test"`.
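An illustrative invocation is shown below; the config path, dataset alias, and alpha/beta values are placeholders, so substitute the ones appropriate for your experiment:

```bash
# hypothetical example: the config file and dataset alias here are assumptions,
# not guaranteed to exist in the repository
CUDA_VISIBLE_DEVICES=0 \
python main.py \
    --config configs/fewsol.yaml \
    --dataset fewsol \
    --logs tb_logs \
    --alpha 0.5 \
    --beta 1.0 \
    --adapter conv-3x
# append --train_vis_memory_only to train only the visual memory,
# and/or --only_test to skip training and only evaluate
```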
Note: please use `main.qt.py` for experiments involving Proto-CLIP-F-QT.
tensorboard --logdir tb_logs
Demo: user-command-oriented (Fetch) robot grasping using Proto-CLIP predictions.
For the real-world demo, please use the proto-clip-toolkit (sample code). Please check the PyPI package here.
Please use the pretrained checkpoints to work with the proto-clip-toolkit.
NOTE: use the dataset that corresponds to the checkpoint.
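To work with the toolkit locally, it can presumably be installed from PyPI; the package name below is an assumption, so confirm it on the PyPI page linked above:

```bash
# assumed package name; check the linked PyPI page for the exact one
pip install proto-clip-toolkit
```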
- Project Page
- Please check the FAQs here
- Real World Demo | Playlist
- Results for Joint Object Segmentation and Few-Shot Classification in the Real World
- CLIP vs Proto-CLIP t-SNE visualization
- Barnes-Hut t-SNE visualization using Proto-CLIP-F trained on FewSOL [198 classes] dataset
The following options are available for any clarifications, comments, or suggestions:
- Join the discussion forum.
- File an issue.
- Contact Jishnu.
Please cite Proto-CLIP if it helps your research:
@article{padalunkal2023protoclip,
title={Proto-CLIP: Vision-Language Prototypical Network for Few-Shot Learning},
author={Jishnu Jaykumar P and Kamalesh Palanisamy and Yu-Wei Chao and Xinya Du and Yu Xiang},
archivePrefix={arXiv},
eprint={2307.03073},
year={2023}
}