【ICME2025】Official PyTorch Code for "Fraesormer: Learning Adaptive Sparse Transformer for Efficient Food Recognition"

zs1314/Fraesormer

Fraesormer: Learning Adaptive Sparse Transformer for Efficient Food Recognition

paper supplement project

🔥🔥🔥 News

  • 2025-01-22: This repo is released (Private).

Abstract: In recent years, Transformers have made significant progress in food recognition. However, most existing approaches still face two critical challenges in lightweight food recognition: (1) the quadratic complexity and redundant feature representation that result from interactions with irrelevant tokens; (2) static feature recognition and single-scale representation, which overlook the unstructured, non-fixed nature of food images and the need for multi-scale features. To address these, we propose an adaptive and efficient sparse Transformer architecture (Fraesormer) with two core designs: Adaptive Top-k Sparse Partial Attention (ATK-SPA) and Hierarchical Scale-Sensitive Feature Gating Network (HSSFGN). ATK-SPA uses a learnable Gated Dynamic Top-K Operator (GDTKO) to retain critical attention scores, filtering out low query-key matches that hinder feature aggregation. It also introduces a partial channel mechanism to reduce redundancy and promote expert information flow, enabling local-global collaborative modeling. HSSFGN employs a gating mechanism to achieve multi-scale feature representation, enhancing contextual semantic information. Extensive experiments show that Fraesormer outperforms state-of-the-art methods.
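
To make the idea of retaining only the strongest query-key matches concrete, here is a minimal pure-Python sketch of plain top-k attention masking. It is not the authors' implementation: `k` is a fixed, hypothetical parameter, whereas the abstract describes a learnable gated operator (GDTKO), and the partial channel mechanism is not reproduced. The function names `topk_sparse_attention` and `attend` are illustrative only.

```python
import math

def topk_sparse_attention(scores, k):
    """Row-wise softmax over only the k largest scores in each row.

    scores: list of rows, each a list of raw query-key similarity scores.
    k: number of keys each query keeps (fixed here; an assumption of this sketch).
    """
    weights = []
    for row in scores:
        kk = max(1, min(k, len(row)))
        # Indices of the kk largest scores in this row.
        order = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        keep = set(order[:kk])
        # Subtract the row maximum for numerical stability.
        m = max(row)
        exps = [math.exp(s - m) if j in keep else 0.0 for j, s in enumerate(row)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

def attend(weights, values):
    """Weighted sum of value vectors for each query."""
    dim = len(values[0])
    return [
        [sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        for row in weights
    ]

# Two queries, four keys, two-dimensional values (toy numbers).
scores = [[2.0, 0.1, -1.0, 1.5], [0.0, 3.0, 2.5, -2.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
weights = topk_sparse_attention(scores, k=2)
output = attend(weights, values)
```

Each row of `weights` still sums to one, but only the two best-matching keys per query receive non-zero weight, so low-scoring tokens no longer contribute to the aggregated output.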


Dependencies

  • Python 3.8
  • PyTorch 1.11.0+cu113

Setup

conda create -n fraesormer python=3.8
conda activate fraesormer
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
pip install -r requirements.txt
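
After running the setup commands, a quick check can confirm that the installed packages match the versions listed above. The following is a hedged sketch using only the standard library; the helper name `check_environment` is hypothetical, and it assumes the packages expose their metadata under the distribution names `torch`, `torchvision` and `torchaudio`.

```python
import sys
from importlib import metadata

# Versions taken from the Dependencies and Setup sections above.
EXPECTED = {"torch": "1.11.0", "torchvision": "0.12.0", "torchaudio": "0.11.0"}

def check_environment(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Local build tags such as "+cu113" are ignored in the comparison.
        ok = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, ok)
    return report

if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])))
    for name, (installed, ok) in check_environment().items():
        status = "ok" if ok else "differs"
        print(f"{name}: {installed or 'not installed'} ({status})")
```

A "differs" line does not necessarily mean the code will fail; it only flags a deviation from the versions this README lists.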

Contents

  1. Datasets
  2. Training
  3. Testing
  4. Results
  5. Citation
  6. Acknowledgements

Datasets

The training and testing sets used in this work can be downloaded as follows:

| Dataset | Content | Link |
| --- | --- | --- |
| ETHZ Food-101 | ETHZ Food-101 contains 101 categories with a total of 101,000 images | baidu |
| Vireo Food-172 | Vireo Food-172 consists of 172 categories with 67,288 images | baidu |
| UEC Food-256 | UEC Food-256 includes 256 categories with 31,395 images | baidu |
| SuShi-50 | SuShi-50 contains 50 categories with 3,963 images | baidu |

Download the training and testing datasets and place them in the corresponding folders under datasets/.

Training

Download all of the food datasets and structure the data as follows:

/path/to/datasets/
  ETHZ Food-101/
    train/
      class1/
        img1.jpeg
      class2/
        img2.jpeg
    validation/
      class1/
        img3.jpeg
      class2/
        img4.jpeg
  Vireo Food-172/
    train/
      class1/
        img1.jpeg
      class2/
        img2.jpeg
    validation/
      class1/
        img3.jpeg
      class2/
        img4.jpeg
  UEC Food-256/        
  SuShi-50/
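
Before training, it can help to verify that a dataset folder follows the layout above. The sketch below is an illustration under stated assumptions, not part of the repository: the helper names `summarize_dataset` and `check_dataset` are hypothetical, and the set of accepted image extensions is a guess.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of image extensions

def summarize_dataset(root, splits=("train", "validation")):
    """Count images per class folder for each split under one dataset directory."""
    root = Path(root)
    summary = {}
    for split in splits:
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split folder: {split_dir}")
        counts = {}
        for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
        summary[split] = counts
    return summary

def check_dataset(root):
    """Return a list of readable problems; an empty list means the layout looks fine."""
    summary = summarize_dataset(root)
    problems = []
    if set(summary["train"]) != set(summary["validation"]):
        problems.append("train/ and validation/ contain different class folders")
    for split, counts in summary.items():
        problems += [f"{split}/{name} has no images"
                     for name, n in counts.items() if n == 0]
    return problems
```

For example, `check_dataset("/path/to/datasets/ETHZ Food-101")` would return an empty list when both splits contain the same non-empty class folders.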

To train a Fraesormer model, use the corresponding command below:

ETHZ Food-101

Fraesormer-Tiny
python main.py --model Fraesormer-Tiny --data-set ETHZ_Food-101 --data-path $PATH_TO_ETHZ_Food-101 --output_dir $PATH_Result_ETHZ_Food-101
Fraesormer-Base
python main.py --model Fraesormer-Base --data-set ETHZ_Food-101 --data-path $PATH_TO_ETHZ_Food-101 --output_dir $PATH_Result_ETHZ_Food-101
Fraesormer-Large
python main.py --model Fraesormer-Large --data-set ETHZ_Food-101 --data-path $PATH_TO_ETHZ_Food-101 --output_dir $PATH_Result_ETHZ_Food-101

Vireo Food-172

Fraesormer-Tiny
python main.py --model Fraesormer-Tiny --data-set Vireo_Food-172 --data-path $PATH_TO_Vireo_Food-172 --output_dir $PATH_Result_Vireo_Food-172
Fraesormer-Base
python main.py --model Fraesormer-Base --data-set Vireo_Food-172 --data-path $PATH_TO_Vireo_Food-172 --output_dir $PATH_Result_Vireo_Food-172
Fraesormer-Large
python main.py --model Fraesormer-Large --data-set Vireo_Food-172 --data-path $PATH_TO_Vireo_Food-172 --output_dir $PATH_Result_Vireo_Food-172

UEC Food-256

Fraesormer-Tiny
python main.py --model Fraesormer-Tiny --data-set UEC_Food-256 --data-path $PATH_TO_UEC_Food-256 --output_dir $PATH_Result_UEC_Food-256
Fraesormer-Base
python main.py --model Fraesormer-Base --data-set UEC_Food-256 --data-path $PATH_TO_UEC_Food-256 --output_dir $PATH_Result_UEC_Food-256
Fraesormer-Large
python main.py --model Fraesormer-Large --data-set UEC_Food-256 --data-path $PATH_TO_UEC_Food-256 --output_dir $PATH_Result_UEC_Food-256

SuShi-50

Fraesormer-Tiny
python main.py --model Fraesormer-Tiny --data-set SuShi-50 --data-path $PATH_TO_SuShi-50 --output_dir $PATH_Result_SuShi-50
Fraesormer-Base
python main.py --model Fraesormer-Base --data-set SuShi-50 --data-path $PATH_TO_SuShi-50 --output_dir $PATH_Result_SuShi-50
Fraesormer-Large
python main.py --model Fraesormer-Large --data-set SuShi-50 --data-path $PATH_TO_SuShi-50 --output_dir $PATH_Result_SuShi-50
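
The twelve commands above differ only in the model and dataset names, so they can be generated rather than typed. The sketch below builds them with the flags shown above (`--model`, `--data-set`, `--data-path`, `--output_dir`). The dataset and result paths are hypothetical placeholders, and the per-model result subfolder is an assumption of this sketch. Note that strings such as `$PATH_TO_ETHZ_Food-101` are placeholders to be replaced with real paths: a shell variable name cannot contain a hyphen, so they will not expand as written.

```python
import shlex

MODELS = ["Fraesormer-Tiny", "Fraesormer-Base", "Fraesormer-Large"]

# --data-set value -> dataset folder (folder names follow the tree above;
# the /path/to/datasets prefix is a placeholder).
DATA_PATHS = {
    "ETHZ_Food-101": "/path/to/datasets/ETHZ Food-101",
    "Vireo_Food-172": "/path/to/datasets/Vireo Food-172",
    "UEC_Food-256": "/path/to/datasets/UEC Food-256",
    "SuShi-50": "/path/to/datasets/SuShi-50",
}

def build_train_command(model, dataset, data_path, result_root):
    """Build one training command as an argument list for main.py."""
    return [
        "python", "main.py",
        "--model", model,
        "--data-set", dataset,
        "--data-path", data_path,
        # Assumed layout: one result folder per dataset and model.
        "--output_dir", f"{result_root}/{dataset}/{model}",
    ]

commands = [
    build_train_command(model, dataset, path, "/path/to/results")
    for dataset, path in DATA_PATHS.items()
    for model in MODELS
]

for cmd in commands:
    # shlex.join quotes arguments that contain spaces.
    print(shlex.join(cmd))
```

Each argument list can also be passed directly to `subprocess.run(cmd, check=True)` to launch the runs one after another.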

Testing

Run the following command to evaluate a pre-trained Fraesormer-Tiny on the UEC Food-256 validation set with a single GPU:

python main.py --eval --model Fraesormer-Tiny --resume ./Fraesormer-Tiny.pth --data-path $PATH_TO_UEC_Food_256

Results

We achieve state-of-the-art performance. Detailed results can be found in the paper.

Quantitative Comparisons
  • Results in Table 1 (main paper)
  • Results in Figure 1 (main paper)

Citation

If you find the code helpful in your research or work, please cite the following paper(s).

@article{zou2025fraesormer,
  title={Fraesormer: Learning Adaptive Sparse Transformer for Efficient Food Recognition},
  author={Zou, Shun and Zou, Yi and Zhang, Mingya and Luo, Shipeng and Chen, Zhihao and Gao, Guangwei},
  journal={ICME},
  year={2025}
}

Acknowledgements

We sincerely appreciate SHViT, Swin Transformer, LeViT, pytorch-image-models, EfficientViT and PyTorch for their wonderful implementations.
