Official implementation of "InterLCM: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration" (ICLR'25)
Senmao Li1,2*, Kai Wang2†, Joost van de Weijer2, Fahad Shahbaz Khan3,4, Chun-Le Guo1, Shiqi Yang5, Yaxing Wang1, Jian Yang1, Ming-Ming Cheng1
1. VCIP, CS, Nankai University  2. Computer Vision Center, Universitat Autònoma de Barcelona  3. Mohamed bin Zayed University of AI  4. Linköping University  5. SB Intuitions, SoftBank
*Work done during a research stay at Computer Vision Center, Universitat Autònoma de Barcelona
†Corresponding author.
⭐ If InterLCM is helpful to your images or projects, please help star this repo. Thanks! 🤗
- 2025.01.15: Release the pre-trained models and inference code. 😀
- 2024.12.24: This repo is created.
# git clone this repository
git clone https://github.com/sen-mao/InterLCM.git
cd InterLCM
# create new anaconda env
conda create -n interlcm python=3.8 -y
conda activate interlcm
# install python dependencies
pip3 install -r requirements.txt
python basicsr/setup.py develop
# install dlib (only needed for face detection or cropping with dlib)
conda install -c conda-forge dlib
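# (optional, not part of the original instructions) quick sanity check that the
# core dependencies were installed correctly
python -c "import torch, basicsr; print(torch.__version__)"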
Download the InterLCM pretrained models (Visual Encoder and Spatial Encoder) from [Releases|Google Drive] to the weights/InterLCM folder.
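For example, assuming the checkpoints are downloaded manually, the expected layout is roughly the following (a minimal sketch; the 1-step checkpoints are only needed for the 1-step command below):

# create the weights folder and place the downloaded checkpoints inside
mkdir -p weights/InterLCM
# expected files:
#   weights/InterLCM/visual_encoder.pth
#   weights/InterLCM/spatial_encoder.pth
#   weights/InterLCM/visual_encoder_1step.pth   (only for 1-step inference)
#   weights/InterLCM/spatial_encoder_1step.pth  (only for 1-step inference)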
# For cropped and aligned faces (512x512) (3-step InterLCM reconstruction)
python inference_InterLCM.py --has_aligned --num_inference_steps 4 \
--input_path inputs/cropped_faces \
--output_path results/cropped_faces
# For cropped and aligned faces (512x512) (1-step InterLCM reconstruction)
python inference_InterLCM.py --has_aligned --num_inference_steps 2 \
--visual_encoder_path weights/InterLCM/visual_encoder_1step.pth \
--spatial_encoder_path weights/InterLCM/spatial_encoder_1step.pth \
--input_path inputs/cropped_faces \
--output_path results/cropped_faces
# For whole image
# Add '--bg_upsampler realesrgan' to enhance the background regions with Real-ESRGAN
# Add '--face_upsample' to further upsample the restored faces with Real-ESRGAN
python inference_InterLCM.py --num_inference_steps 4 \
--input_path inputs/whole_imgs \
--output_path results/whole_imgs \
--bg_upsampler realesrgan
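# (optional, not from the original README) sketch for restoring several input
# folders in one run; the folder names below are placeholders
for d in inputs/batch1 inputs/batch2; do
    python inference_InterLCM.py --num_inference_steps 4 \
        --input_path "$d" \
        --output_path "results/$(basename "$d")" \
        --bg_upsampler realesrgan
done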
- Download training dataset: FFHQ
- Resize the images to 512×512 resolution (see the resize sketch after the training command below)
- Train the Visual Encoder and Spatial Encoder:
python -m torch.distributed.launch --nproc_per_node=gpu_num --master_port=4323 basicsr/train.py -opt options/interlcm.yml --launcher pytorch
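The resize step from the dataset preparation above can be done with a short Pillow snippet, for example (a minimal sketch; the source and target paths are placeholders for wherever FFHQ is stored):

# resize FFHQ images to 512x512 (paths are placeholders)
python -c "
from pathlib import Path
from PIL import Image
src, dst = Path('datasets/FFHQ/images1024x1024'), Path('datasets/FFHQ512')
dst.mkdir(parents=True, exist_ok=True)
for p in sorted(src.glob('*.png')):
    Image.open(p).resize((512, 512), Image.LANCZOS).save(dst / p.name)
"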
- The pre-trained Visual Encoder (visual_encoder.pth) and Spatial Encoder (spatial_encoder.pth) can be found under Releases v0.1.0: https://github.com/sen-mao/InterLCM/releases/tag/v0.1.0
This project is licensed under a Creative Commons Attribution-NonCommercial 4.0 International license, for non-commercial use only. Any commercial use requires formal permission first.
This project is based on LCM and CodeFormer. Some code is borrowed from StableSR. Thanks for their awesome work.
If you have any questions, please feel free to reach out to me at [email protected].