OF-Diff: Object Fidelity Diffusion for Remote Sensing Image Generation

  • We summarize four common failure modes in remote sensing image generation: control leakage, structural distortion, dense generation collapse, and feature-level mismatch. OF-Diff outperforms current state-of-the-art methods in all four aspects.

Fig1

Abstract

High-precision controllable remote sensing image generation is both meaningful and challenging. Existing diffusion models often produce low-fidelity images because they fail to adequately capture morphological details, which may harm the robustness and reliability of object detection models. To enhance the accuracy and fidelity of generated objects in remote sensing, this paper proposes Object Fidelity Diffusion (OF-Diff), which effectively improves the fidelity of generated objects. Specifically, we are the first to extract the prior shapes of objects from the layout for diffusion models in remote sensing. Then, we introduce a dual-branch diffusion model with a diffusion consistency loss, which can generate high-fidelity remote sensing images without requiring real images during the sampling phase. Furthermore, we introduce DDPO to fine-tune the diffusion process, making the generated remote sensing images more diverse and semantically consistent. Comprehensive experiments demonstrate that OF-Diff outperforms state-of-the-art methods in remote sensing across key quality metrics. Notably, several polymorphic and small object classes show significant improvement: mAP increases by 8.3%, 7.7%, and 4.0% for airplanes, ships, and vehicles, respectively.
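The paper does not spell out the exact form of the diffusion consistency loss in this README; a common choice for coupling two denoising branches is a mean-squared penalty between their noise predictions. The sketch below illustrates that idea only; the function name and the assumption that both branches predict noise ε of the same shape are ours, not taken from the OF-Diff code.

```python
import numpy as np

def diffusion_consistency_loss(eps_main, eps_aux):
    """MSE between the noise predictions of two diffusion branches.

    eps_main: noise predicted by the layout/shape-conditioned branch.
    eps_aux:  noise predicted by the auxiliary branch.
    Both are arrays of shape (B, C, H, W). A hypothetical sketch --
    see the paper for the actual loss used by OF-Diff.
    """
    return float(np.mean((eps_main - eps_aux) ** 2))

# Identical predictions incur zero penalty:
a = np.zeros((1, 4, 8, 8))
b = np.ones((1, 4, 8, 8))
zero_case = diffusion_consistency_loss(a, a)  # 0.0
unit_case = diffusion_consistency_loss(a, b)  # 1.0
```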

Overview

  • Comparison of OF-Diff with Mainstream Methods.

Fig1

  • An Overview of OF-Diff and Pipeline.

arch

Main Results

  • Comparison of the Generation Results of OF-Diff with Other Methods.

arch

  • Diversity Results and Style Preference Results.

dual resampler cond gen

  • Quantitative Comparison with Other Methods on DIOR and DOTA.

arch

  • Trainability Comparison Results, and Results on an Unknown Layout Dataset during Training.

dual resampler cond gen

Getting Started

1. Conda environment setup

conda env create -f environment.yaml
conda activate aerogen

2. Data Preparation

2.1 Dataset and structure

You need to download the dataset. Taking DIOR as an example, the dataset needs to be processed (see the data_process.md) to form the following format.

DIOR-R-train
├── images
│   ├── 00001.jpg
│   ├── ...
│   └── 05862.jpg
├── labels
│   ├── 00001.jpg
│   ├── ...
│   └── 05862.jpg
└── prompt.json
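The `prompt.json` file pairs each label image with its target image and a caption. The record layout below (`source`/`target`/`prompt` keys, one JSON object per line) follows the ControlNet data convention that this codebase builds on; OF-Diff's exact schema may differ, so check `data_process.md` before relying on it.

```python
import json

def make_prompt_records(names, caption="a remote sensing image"):
    """Build ControlNet-style prompt records for a list of image filenames.

    A sketch under the assumption that labels and images share filenames;
    the caption here is a placeholder, not a real DIOR annotation.
    """
    records = []
    for name in names:
        records.append({
            "source": f"labels/{name}",   # layout / condition image
            "target": f"images/{name}",   # real remote sensing image
            "prompt": caption,            # per-image caption
        })
    return records

# Serialize one record per line, as ControlNet-style loaders expect:
lines = [json.dumps(r) for r in make_prompt_records(["00001.jpg"])]
```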

2.2 Weights

Initialize the ControlNet model using the pretrained UNet encoder weights obtained from Stable Diffusion, and subsequently merge these weights with the Stable Diffusion model weights, saving the result as ./model/control_sd15_ini.ckpt.

python ./tools/add_control.py
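Conceptually, this initialisation copies each Stable Diffusion UNet encoder weight into the matching ControlNet parameter while keeping the original checkpoint intact. The sketch below illustrates that key-mapping idea on plain dictionaries; the parameter prefixes (`model.diffusion_model.`, `control_model.`) follow the usual ControlNet naming and are assumptions here, not a transcript of `./tools/add_control.py`.

```python
def init_controlnet_weights(sd_state, control_prefix="control_model."):
    """Seed ControlNet parameters from a Stable Diffusion state dict.

    For every UNet weight named `model.diffusion_model.<suffix>`, create a
    copy under `control_model.<suffix>`; all original SD weights are kept
    so the merged dict can be saved as a single checkpoint.
    """
    unet_prefix = "model.diffusion_model."
    merged = dict(sd_state)  # keep the original SD weights untouched
    for name, weight in sd_state.items():
        if name.startswith(unet_prefix):
            merged[control_prefix + name[len(unet_prefix):]] = weight
    return merged
```

In the real script the values are tensors loaded with `torch.load` and the result is saved to `./model/control_sd15_ini.ckpt`; the dict-level logic is the same.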

3. Training

python train.py

4. Sampling

python merge_weights.py ./path/to/checkpoints
python inference.py

TODOs

  • Release the paper on arXiv.
  • Release the initial code.
  • Release the complete code.
  • Release synthetic images by OF-Diff.

Contact

If you have any questions about this paper or the code, feel free to email me at [email protected] so that I can notice and respond promptly.

Acknowledgements

Our work builds on Stable Diffusion, ControlNet, and RemoteSAM; we appreciate their outstanding contributions. We are also grateful to AeroGen and CC-Diff for their excellent work on remote sensing image generation, whose experiments have advanced the field.

Citation

@misc{ye2025objectfidelitydiffusionremote,
      title={Object Fidelity Diffusion for Remote Sensing Image Generation}, 
      author={Ziqi Ye and Shuran Ma and Jie Yang and Xiaoyi Yang and Ziyang Gong and Xue Yang and Haipeng Wang},
      year={2025},
      eprint={2508.10801},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2508.10801}, 
}
