riotu-lab/MAG-ViT-Super-Resolution

MAG-ViT: Multi-Attention Grid Vision Transformer for High-Fidelity Super-Resolution in Remote Sensing

Official PyTorch implementation of the paper "MAG-ViT: Multi-Attention Grid Vision Transformer for High-Fidelity Super-Resolution in Remote Sensing".

Remote sensing applications need high-resolution imagery, but hardware and acquisition constraints often limit image quality. While Vision Transformers (ViTs) have advanced remote sensing image super-resolution (RSISR), they suffer from high computational cost and limited contextual understanding. MAG-ViT addresses both problems by combining local and global self-attention at linear complexity. At the heart of MAG-ViT is the HaloMBConv module, which integrates halo-based attention and mobile bottleneck convolutions to enhance spatial detail while reducing redundant computation. The model uses a dual-attention strategy: fixed windows for local features and grid windows for broader context, strengthened by residual connections. Experiments on the UCMerced and AID datasets show that MAG-ViT achieves improvements of up to 1.1 dB PSNR and 0.03 SSIM over state-of-the-art methods, while offering faster inference than diffusion-based models, making it well suited to practical remote sensing tasks.
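To make the dual-attention idea concrete, the sketch below shows one common way to form the two kinds of token groups: contiguous local windows and strided grid groups, as in MaxViT-style block and grid attention. It is a plain-Python illustration under that assumption, not the repository's implementation; the function names `window_groups` and `grid_groups` are hypothetical.

```python
def window_groups(height, width, window):
    """Group pixel coordinates into non-overlapping window x window blocks.

    Each group contains spatially adjacent pixels (local context).
    """
    if height % window or width % window:
        raise ValueError("height and width must be divisible by window")
    groups = {}
    for y in range(height):
        for x in range(width):
            groups.setdefault((y // window, x // window), []).append((y, x))
    return list(groups.values())


def grid_groups(height, width, grid):
    """Group pixel coordinates into strided grid x grid sets.

    Each group samples pixels spread evenly over the whole image
    (sparse global context).
    """
    if height % grid or width % grid:
        raise ValueError("height and width must be divisible by grid")
    stride_y, stride_x = height // grid, width // grid
    groups = {}
    for y in range(height):
        for x in range(width):
            groups.setdefault((y % stride_y, x % stride_x), []).append((y, x))
    return list(groups.values())


if __name__ == "__main__":
    # On a 4x4 image with group size 2, compare the first group of each kind.
    print(window_groups(4, 4, 2)[0])  # neighbouring pixels
    print(grid_groups(4, 4, 2)[0])    # pixels two steps apart
```

Because every group holds a fixed number of pixels, attending within groups costs time proportional to the number of pixels, which is consistent with the linear complexity described above.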

Requirements

  • Python 3.6+
  • PyTorch>=1.6
  • torchvision>=0.7.0
  • einops
  • matplotlib
  • opencv-python (imported as cv2)
  • scipy
  • tqdm
  • scikit

Installation

Clone or download this repository, install the requirements listed above, and change into the codes directory:

cd codes

Dataset Preparation

Download the UCMerced and AID datasets from the following links:

The datasets are already split into train, validation, and test sets.
The original images serve as the high-resolution (HR) references, and the corresponding low-resolution (LR) images are generated by bicubic downsampling.
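If you need to regenerate the LR images yourself, the following is a minimal sketch of bicubic downsampling with OpenCV. Several details are assumptions: the README does not say which bicubic implementation was used (OpenCV's `INTER_CUBIC` does not match MATLAB's antialiased `imresize` exactly), nor whether HR images are cropped when their size is not divisible by the scale. The helper names `lr_shape` and `make_lr` are hypothetical.

```python
from pathlib import Path


def lr_shape(hr_height, hr_width, scale):
    """Return ((cropped HR h, w), (LR h, w)) for an integer scale factor.

    The HR size is cropped down to a multiple of `scale` (an assumption)
    so that the LR size is an exact integer fraction of it.
    """
    h = hr_height - hr_height % scale
    w = hr_width - hr_width % scale
    return (h, w), (h // scale, w // scale)


def make_lr(hr_path, lr_path, scale):
    """Write a bicubic-downsampled copy of hr_path to lr_path."""
    import cv2  # imported lazily so lr_shape works without OpenCV installed

    image = cv2.imread(str(hr_path))
    if image is None:
        raise FileNotFoundError(f"could not read image: {hr_path}")
    (h, w), (lr_h, lr_w) = lr_shape(image.shape[0], image.shape[1], scale)
    image = image[:h, :w]
    # cv2.resize expects the target size as (width, height).
    low_res = cv2.resize(image, (lr_w, lr_h), interpolation=cv2.INTER_CUBIC)
    Path(lr_path).parent.mkdir(parents=True, exist_ok=True)
    if not cv2.imwrite(str(lr_path), low_res):
        raise OSError(f"could not write image: {lr_path}")
```

Results computed on LR images produced this way may differ slightly from those obtained with the provided splits, so prefer the provided LR folders when reproducing reported numbers.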

Important:
When preparing the datasets, make sure the folder structure matches the expected format used in the code.
The datasets should be organized as follows:

For AID:

  • /data/Image_restoration/Datasets/AID-dataset/
    • train/
      • HR/
      • LR_x2/
      • LR_x3/
      • LR_x4/
    • val/
      • HR/
      • LR_x2/
      • LR_x3/
      • LR_x4/
    • test/
      • HR/
      • LR_x2/
      • LR_x3/
      • LR_x4/

For UCMerced:

  • /data/Image_restoration/Datasets/UCMerced-dataset/
    • train/
      • HR/
      • LR_x2/
      • LR_x3/
      • LR_x4/
    • val/
      • HR/
      • LR_x2/
      • LR_x3/
      • LR_x4/
    • test/
      • HR/
      • LR_x2/
      • LR_x3/
      • LR_x4/
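A quick way to confirm that a dataset root matches the layout above is to list any missing subfolders before training. The sketch below uses only the standard library; the function names `expected_dirs` and `missing_dirs` are hypothetical and not part of the repository.

```python
from pathlib import Path

SPLITS = ("train", "val", "test")
SUBDIRS = ("HR", "LR_x2", "LR_x3", "LR_x4")


def expected_dirs(root):
    """Return every split/subfolder path the layout above requires."""
    root = Path(root)
    return [root / split / sub for split in SPLITS for sub in SUBDIRS]


def missing_dirs(root):
    """Return the required folders that do not exist under root."""
    return [path for path in expected_dirs(root) if not path.is_dir()]


if __name__ == "__main__":
    for dataset_root in (
        "/data/Image_restoration/Datasets/AID-dataset",
        "/data/Image_restoration/Datasets/UCMerced-dataset",
    ):
        missing = missing_dirs(dataset_root)
        status = "OK" if not missing else f"{len(missing)} folder(s) missing"
        print(f"{dataset_root}: {status}")
        for path in missing:
            print(f"  missing: {path}")
```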

Training

# x4
python demo_train.py --model=MAGVIT --dataset=UCMerced --scale=4 --patch_size=192 --loss 1*L1 --lr 1e-4 --ext=img --epochs 2500 --batch_size 8 --n_GPUs 1 --save=MAGVITx4_UCMerced
# x3
python demo_train.py --model=MAGVIT --dataset=UCMerced --scale=3 --patch_size=144 --loss 1*L1 --lr 1e-4 --ext=img --epochs 2500 --batch_size 8 --save=MAGVITx3_UCMerced
# x2
python demo_train.py --model=MAGVIT --dataset=UCMerced --scale=2 --patch_size=96 --loss 1*L1 --lr 1e-4 --ext=img --epochs 2500 --batch_size 8 --save=MAGVITx2_UCMerced
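The three commands differ only in the scale, the patch size, and the save name, and the patch size is 48 times the scale in each case (192, 144, 96). The sketch below generates the same argument lists from the scale alone. `build_train_cmd` is a hypothetical helper that uses only flags shown above; it omits `--n_GPUs 1`, which appears in the x4 command only.

```python
import sys

# patch_size / scale is 48 in all three commands above (192/4, 144/3, 96/2).
PATCH_PER_SCALE = 48


def build_train_cmd(scale):
    """Build the demo_train.py argument list for one upscaling factor."""
    if scale not in (2, 3, 4):
        raise ValueError("the README only documents scales 2, 3 and 4")
    return [
        sys.executable, "demo_train.py",
        "--model=MAGVIT",
        "--dataset=UCMerced",
        f"--scale={scale}",
        f"--patch_size={PATCH_PER_SCALE * scale}",
        "--loss", "1*L1",
        "--lr", "1e-4",
        "--ext=img",
        "--epochs", "2500",
        "--batch_size", "8",
        f"--save=MAGVITx{scale}_UCMerced",
    ]


if __name__ == "__main__":
    for scale in (4, 3, 2):
        # Print the command; pass the list to subprocess.run(..., check=True)
        # from the codes directory to actually launch training.
        print(" ".join(build_train_cmd(scale)))
```

Passing the arguments as a list also avoids shell expansion of the `*` in `1*L1`.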

The train/val data paths are set in data/__init__.py

Testing

Pre-trained MAG-ViT models for the UCMerced and AID datasets are available here:
Baidu Drive (Password: w7ct) | Google Drive

Before running the test, you need to manually set the input and output paths inside the demo_deploy.py file:

args.dir_data = '/path/to/your/LR_x1'  # Path to the low-resolution input images
args.dir_out = '/path/to/save/output'  # Path where the output results will be saved

Then run the deployment script for the desired scale:

# x4
python demo_deploy.py --model=MAGVIT --scale=4
# x3
python demo_deploy.py --model=MAGVIT --scale=3
# x2
python demo_deploy.py --model=MAGVIT --scale=2

Results

The output images generated by the trained models on the UCMerced and AID datasets can be downloaded here:

These folders contain the visual results obtained after running the testing phase using the pre-trained models.

Evaluation

To reproduce the evaluation results (PSNR, SSIM, and LPIPS metrics) on the UCMerced and AID datasets:

  1. Download the predicted output images from the results links:

  2. Open and run the notebook evaluation.ipynb.

  3. In the notebook, set the paths to:

    • Ground-truth (HR) images
    • Predicted output images
  4. The notebook will automatically calculate and print the average PSNR, SSIM, and LPIPS scores.

Note: Make sure you install the required libraries before running the evaluation:

pip install basicsr lpips

The evaluation code uses metrics from BasicSR for accurate computation.
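For orientation, PSNR is computed from the mean squared error between the HR reference and the prediction. The following plain-Python sketch shows that definition only; it is not the BasicSR code, and it ignores options such as border cropping or evaluation on the luminance channel, whose settings in the notebook are not stated here. The function name `psnr` and its flat-sequence input are assumptions made for illustration.

```python
import math


def psnr(reference, prediction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Images should be flattened to sequences of numbers before calling this.
    """
    ref = list(reference)
    pred = list(prediction)
    if not ref or len(ref) != len(pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - p) ** 2 for r, p in zip(ref, pred)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Every pixel differs by 1, so MSE = 1 and PSNR = 20 * log10(255).
    print(round(psnr([10, 20, 30], [11, 21, 31]), 2))
```

Higher values mean the prediction is closer to the reference, which is why the improvements above are reported in dB.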

Citation

If you find this code useful for your research, please cite our paper:

@article{ali2024magvit,
  title     = {MAG-ViT: Multi-Attention Grid Vision Transformer for High-Fidelity Super-Resolution in Remote Sensing},
  journal   = {Under Review / Preprint},
  year      = {2026},
}

Acknowledgements

This code is built on TransENet (PyTorch) and BasicSR. We thank the authors for sharing their code.
