
Official Code for "LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models"


[arXiv] [Project Page]

Haiwen Huang, Anpei Chen, Volodymyr Havrylov, Andreas Geiger, Dan Zhang

[Teaser figure]

TL;DR: LoftUp achieves the strongest feature upsampling performance at a comparable speed to bilinear upsampling.

[Qualitative results on DAVIS sequences: bike-packing, camel, horsejump]

Contents

- Install
- Inference with pretrained upsamplers
- Evaluation on Downstream Tasks
- Training LoftUp
- Citation

Install

In general, LoftUp can run with most recent PyTorch environments. We encourage users to try out LoftUp in their existing environment first.

We also provide two YAML files for installation. To use them, simply run:

conda env create -f environment_cuda11.yaml

or

conda env create -f environment.yaml

Inference with pretrained upsamplers

All pre-trained upsamplers are available on the 🤗 Hugging Face Hub: https://huggingface.co/models?search=loftup.

We provide example code for using LoftUp in example_usage.py. Currently we provide:

| Backbone Name | Featurizer Class | HF Hub | Torch Hub Repo | Torch Hub Name |
| --- | --- | --- | --- | --- |
| DINOv2 S/14 | dinov2 | haiwen/loftup-dinov2s | andrehuang/loftup | loftup_dinov2s |
| DINOv2 S/14 + Reg | dinov2s_reg | haiwen/loftup-dinov2s_reg | andrehuang/loftup | loftup_dinov2s_reg |
| DINOv2 B/14 | dinov2b | haiwen/loftup-dinov2b | andrehuang/loftup | loftup_dinov2b |
| DINOv2 B/14 + Reg | dinov2b_reg | haiwen/loftup-dinov2b_reg | andrehuang/loftup | loftup_dinov2b_reg |
| CLIP ViT B/16 | clip | haiwen/loftup-clip | andrehuang/loftup | loftup_clip |
| SigLIP ViT B/16 | siglip | haiwen/loftup-siglip | andrehuang/loftup | loftup_siglip |
| SigLIP2 ViT B/16 | siglip2 | haiwen/loftup-siglip2 | andrehuang/loftup | loftup_siglip2 |

To use Torch Hub checkpoints, simply run:

upsampler = torch.hub.load('andrehuang/loftup', model_torch_hub_name, pretrained=True)

For example, upsampler = torch.hub.load('andrehuang/loftup', 'loftup_dinov2s', pretrained=True).

The upsampler class is defined at UpsamplerwithChannelNorm.
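Below is a minimal end-to-end sketch of how the loaded upsampler might be used. It assumes a DINOv2 S/14 backbone from the official facebookresearch/dinov2 Torch Hub repo, and that the upsampler's forward pass takes the low-resolution features plus the pixel-resolution image, as in example_usage.py; check that file for the exact interface.

```python
import torch
from PIL import Image
from torchvision import transforms

# Load a pretrained LoftUp upsampler from Torch Hub (names from the table above).
upsampler = torch.hub.load('andrehuang/loftup', 'loftup_dinov2s', pretrained=True).eval()

# Load the matching backbone, DINOv2 S/14 (official DINOv2 Torch Hub entry point).
backbone = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14').eval()

# Standard ImageNet preprocessing at a resolution divisible by the patch size (14).
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
img = transform(Image.open('example.jpg').convert('RGB')).unsqueeze(0)  # (1, 3, 224, 224)

with torch.no_grad():
    # Low-resolution patch features: (1, 256, 384) -> (1, 384, 16, 16) for ViT-S/14 at 224x224.
    tokens = backbone.forward_features(img)['x_norm_patchtokens']
    lr_feats = tokens.permute(0, 2, 1).reshape(1, -1, 16, 16)
    # Forward signature assumed from example_usage.py: (low-res features, input image).
    hr_feats = upsampler(lr_feats, img)

print(hr_feats.shape)  # expected: features at (or near) the input image resolution
```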

Evaluation on Downstream Tasks

Dataset Preparation

See Preparing Datasets for Evaluation.

Semantic Segmentation

For semantic segmentation, our implementation is adapted from FeatUp. You can use eval_seg.py by running:

python eval_seg.py  ++upsampler_path=/path/to/your/upsampler

You can also configure other hyper-parameters such as output_dir and dataset directory. The config file is configs/eval_seg.yaml.
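For instance, the output directory can be overridden on the command line in the same ++key=value style (the override below uses the output_dir option mentioned above; verify the exact key names in configs/eval_seg.yaml):

python eval_seg.py ++upsampler_path=/path/to/your/upsampler ++output_dir=/path/to/outputs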

Video Object Segmentation

For video object segmentation on DAVIS, our code is modified from the implementation in LiFT. Specifically, we first extract segmentation results by running:

python eval_davis.py --dataroot your_davis_data_dir --model_type "dinov2" --output_dir your_output_dir --imsize 224 --upsampler_path=your_upsampler_path

Then run the following to get evaluation results:

python davis2017-evaluation/evaluation_method.py --davis_path /your_davis_data_dir --task semi-supervised --results_path your_output_dir/davis_vidseg_224 --imsize 224

Others

For interactive segmentation, please check out iSegProbe.

For open-vocabulary segmentation, please check out ProxyCLIP.

For depth and normal estimation, please check out Probe3D.

Training LoftUp

We will provide training code soon.

Citation

If you find our work helpful, please cite:

@misc{huang2025loftuplearningcoordinatebasedfeature,
      title={LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models}, 
      author={Haiwen Huang and Anpei Chen and Volodymyr Havrylov and Andreas Geiger and Dan Zhang},
      year={2025},
      eprint={2504.14032},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2504.14032}, 
}
