Skip to content

Official repository for CVPR 2025 paper: OpenSDI: Spotting Diffusion-Generated Images in the Open World

Notifications You must be signed in to change notification settings

whuhxb/OpenSDI

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LMM-R1 logo

OpenSDI: Spotting Diffusion-Generated Images in the Open World

🤗 HF Dataset 🤗 HF Model 📄 Paper 🌐 Project Page

News

  • [2025/3/10] We release dataset and code!

Introduce OpenSDI challenge

As text-to-image (T2I) diffusion models rapidly advance, distinguishing between real and AI-generated images becomes increasingly challenging. This repository introduces OpenSDID (Open-world Spotting of Diffusion Images Dataset), a novel benchmark designed to address the OpenSDI challenge: spotting diffusion-generated images in realistic, open-world scenarios.

OpenSDID Dataset Highlights:

  • User Diversity: Simulates the wide range of user intentions and creative styles. We use diverse text prompts generated by VLMs to capture varied real-world editing scenarios.
  • Model Innovation: Accounts for the rapid evolution of diffusion models. OpenSDID includes images from multiple state-of-the-art diffusion models (SD1.5, SD2.1, SDXL, SD3, Flux.1) released after 2023.
  • Manipulation Scope: Covers the full spectrum of diffusion-based manipulation, from global image synthesis to precise local edits, object insertions, and background changes.

Scheme

Model Training Set Test Set Total
Real Fake Real Fake Images
SD1.5 100K 100K 10K 10K 220K
SD2.1 - - 10K 10K 20K
SDXL - - 10K 10K 20K
SD3 - - 10K 10K 20K
Flux.1 - - 10K 10K 20K
Total 100K 100K 50K 50K 300K

Dataset Overview.

pipeline

Samples.

Download:

We have packaged the dataset into Hugging Face datasets for convenient distribution in the following link: OpenSDI Training Dataset OpenSDI Testing Dataset

We distribute this dataset under the CC BY-SA 4.0 license. Real images come from megalith-10m. This dataset is for academic use. While it has undergone ethical review by the University of Southampton, it is provided 'as is' without warranties. Users assume all risks and liabilities associated with its use; the providers accept no responsibility for any consequences arising from its use.

The original data (indiviual image files) will be uploaded to cloud storage later.

For dataset utilization, we recommend using IMDLBenCo, which offers many methods. And you can use our hf dataset to load the data.

Benchmark and Leaderboard: Join the OpenSDI Evaluation!

We maintain a public leaderboard at OpenSDI Leaderboard. We warmly welcome researchers to evaluate their detection and localization models using our OpenSDID dataset and submit their results to contribute to this growing benchmark.

Quick Start with MaskCLIP

To address the OpenSDI challenge, we introduce MaskCLIP, a novel method based on our Synergizing Pretrained Models (SPM) framework. MaskCLIP effectively enhances the generalization strengths of two powerful pre-trained models at image and pixel levels:

  • CLIP (Contrastive Language-Image Pre-training): Provides robust image-level semantic understanding, crucial for generalizable detection.
  • MAE (Masked Autoencoder): Offers fine-grained spatial representation learning, essential for accurate forgery localization at pixel level. pipeline

MaskCLIP overview.

Installation

conda create -n opensdi python=3.12 -y
conda activate opensdi
pip install -r requirements.txt
wget -P weights https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_base.pth

Train

train.sh

Test

test.sh

Citation

If you find OpenSDI useful for your research and applications, please cite using this BibTeX:

@article{wang2025opensdi,
  title         = {OpenSDI: Spotting Diffusion-Generated Images in the Open World},
  author        = {Wang, Yabin and Huang, Zhiwu and Hong, Xiaopeng},
  year          = {2025},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/abs/2503.19653},
}

References & Acknowledgements

We sincerely thank IML-ViT and IMDLBenCo for their exploration and support.

About

Official repository for CVPR 2025 paper: OpenSDI: Spotting Diffusion-Generated Images in the Open World

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 99.0%
  • Shell 1.0%