OF-Diff: Object Fidelity Diffusion for Remote Sensing Image Generation

  • We summarize four common failure modes in remote sensing image generation: control leakage, structural distortion, dense generation collapse, and feature-level mismatch. OF-Diff outperforms current state-of-the-art methods in all four aspects.

Fig1

Abstract

High-precision controllable remote sensing image generation is both meaningful and challenging. Existing diffusion models often produce low-fidelity images because they fail to adequately capture morphological details, which may harm the robustness and reliability of object detection models. To enhance the accuracy and fidelity of generated objects in remote sensing, this paper proposes Object Fidelity Diffusion (OF-Diff), which effectively improves the fidelity of generated objects. Specifically, we are the first to extract the prior shapes of objects from the layout for diffusion models in remote sensing. Then, we introduce a dual-branch diffusion model with a diffusion consistency loss, which can generate high-fidelity remote sensing images without requiring real images during the sampling phase. Furthermore, we introduce DDPO to fine-tune the diffusion process, making the generated remote sensing images more diverse and semantically consistent. Comprehensive experiments demonstrate that OF-Diff outperforms state-of-the-art methods in remote sensing across key quality metrics. Notably, several polymorphic and small object classes show significant improvement: mAP increases by 8.3%, 7.7%, and 4.0% for airplanes, ships, and vehicles, respectively.
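The paper does not spell out the exact form of the diffusion consistency loss in this README; a common choice for coupling two denoising branches is a mean-squared penalty between their noise predictions. The sketch below illustrates that idea only; the function name and the assumption that both branches predict noise ε of the same shape are ours, not taken from the OF-Diff code.

```python
import numpy as np

def diffusion_consistency_loss(eps_main, eps_aux):
    """MSE between the noise predictions of two diffusion branches.

    eps_main: noise predicted by the layout/shape-conditioned branch.
    eps_aux:  noise predicted by the auxiliary branch.
    Both are arrays of shape (B, C, H, W). A hypothetical sketch --
    see the paper for the actual loss used by OF-Diff.
    """
    return float(np.mean((eps_main - eps_aux) ** 2))

# Identical predictions incur zero penalty:
a = np.zeros((1, 4, 8, 8))
b = np.ones((1, 4, 8, 8))
zero_case = diffusion_consistency_loss(a, a)  # 0.0
unit_case = diffusion_consistency_loss(a, b)  # 1.0
```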

Overview

  • Comparison of OF-Diff with Mainstream Methods.

Fig1

  • An Overview of OF-Diff and Pipeline.

arch

Main Results

  • Comparison of the Generation Results of OF-Diff with Other Methods.

arch

  • Diversity Results and Style Preference Results.

dual resampler cond gen

  • Quantitative Comparison with Other Methods on DIOR and DOTA.

arch

  • Trainability Comparison Results, and Results on an Unknown Layout Dataset during Training.

dual resampler cond gen

Getting Started

1. Conda environment setup

conda env create -f environment.yaml
conda activate aerogen

2. Data Preparation

2.1 Dataset and structure

You need to download the dataset. Taking DIOR as an example, the dataset needs to be processed (see the data_process.md) to form the following format.

DIOR-R-train
├── images
│   ├── 00001.jpg
│   ├── ...
│   └── 05862.jpg
├── labels
│   ├── 00001.jpg
│   ├── ...
│   └── 05862.jpg
└── prompt.json
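The `prompt.json` file pairs each label image with its target image and a caption. The record layout below (`source`/`target`/`prompt` keys, one JSON object per line) follows the ControlNet data convention that this codebase builds on; OF-Diff's exact schema may differ, so check `data_process.md` before relying on it.

```python
import json

def make_prompt_records(names, caption="a remote sensing image"):
    """Build ControlNet-style prompt records for a list of image filenames.

    A sketch under the assumption that labels and images share filenames;
    the caption here is a placeholder, not a real DIOR annotation.
    """
    records = []
    for name in names:
        records.append({
            "source": f"labels/{name}",   # layout / condition image
            "target": f"images/{name}",   # real remote sensing image
            "prompt": caption,            # per-image caption
        })
    return records

# Serialize one record per line, as ControlNet-style loaders expect:
lines = [json.dumps(r) for r in make_prompt_records(["00001.jpg"])]
```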

2.2 Weights

Initialize the ControlNet model using the pretrained UNet encoder weights obtained from Stable Diffusion, and subsequently merge these weights with the Stable Diffusion model weights, saving the result as ./model/control_sd15_ini.ckpt.

python ./tools/add_control.py
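Conceptually, this initialisation copies each Stable Diffusion UNet encoder weight into the matching ControlNet parameter while keeping the original checkpoint intact. The sketch below illustrates that key-mapping idea on plain dictionaries; the parameter prefixes (`model.diffusion_model.`, `control_model.`) follow the usual ControlNet naming and are assumptions here, not a transcript of `./tools/add_control.py`.

```python
def init_controlnet_weights(sd_state, control_prefix="control_model."):
    """Seed ControlNet parameters from a Stable Diffusion state dict.

    For every UNet weight named `model.diffusion_model.<suffix>`, create a
    copy under `control_model.<suffix>`; all original SD weights are kept
    so the merged dict can be saved as a single checkpoint.
    """
    unet_prefix = "model.diffusion_model."
    merged = dict(sd_state)  # keep the original SD weights untouched
    for name, weight in sd_state.items():
        if name.startswith(unet_prefix):
            merged[control_prefix + name[len(unet_prefix):]] = weight
    return merged
```

In the real script the values are tensors loaded with `torch.load` and the result is saved to `./model/control_sd15_ini.ckpt`; the dict-level logic is the same.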

3. Training

python train.py

4. Sampling

python merge_weights.py ./path/to/checkpoints
python inference.py

TODOs

  • Release the paper on arXiv.
  • Release the initial code.
  • Release the complete code.
  • Release synthetic images by OF-Diff.

Contact

If you have any questions about this paper or the code, feel free to email me at [email protected] so that I can notice and respond promptly.

Acknowledgements

Our work builds on Stable Diffusion, ControlNet, and RemoteSAM; we appreciate their outstanding contributions. We are also grateful to AeroGen and CC-Diff for their excellent work on remote sensing image generation, whose experiments have advanced the field.

Citation

@misc{ye2025objectfidelitydiffusionremote,
      title={Object Fidelity Diffusion for Remote Sensing Image Generation}, 
      author={Ziqi Ye and Shuran Ma and Jie Yang and Xiaoyi Yang and Ziyang Gong and Xue Yang and Haipeng Wang},
      year={2025},
      eprint={2508.10801},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2508.10801}, 
}
