- We summarize four common failure modes in remote sensing image generation: control leakage, structural distortion, dense generation collapse, and feature-level mismatch. OF-Diff outperforms current state-of-the-art methods on all four.
- Comparison of OF-Diff with Mainstream Methods.
- An Overview of OF-Diff and Pipeline.
- Comparison of the Generation Results of OF-Diff with Other Methods.
- Diversity Results and Style Preference Results.
- Quantitative Comparison with Other Methods on DIOR and DOTA.
- Trainability Comparison Results, and Results on a Dataset with Layouts Unseen during Training.
conda env create -f environment.yaml
conda activate aerogen
2.1 Dataset and structure
Download the dataset first. Taking DIOR as an example, process it as described in data_process.md so that it has the following structure.
DIOR-R-train
├── images
│   ├── 00001.jpg
│   ├── ...
│   └── 05862.jpg
├── labels
│   ├── 00001.jpg
│   ├── ...
│   └── 05862.jpg
└── prompt.json
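Before training, it can help to confirm that the processed dataset matches the layout above. The following is a minimal sketch, not part of the repository: the function name `check_dataset_layout` is hypothetical, and it assumes (as the tree suggests) that every file in `images` has a file with the same name in `labels`. It only checks that `prompt.json` exists and does not assume anything about its contents.

```python
from pathlib import Path


def check_dataset_layout(root):
    """Return a list of human-readable problems found under `root` (empty if none)."""
    root = Path(root)
    problems = []

    images_dir = root / "images"
    labels_dir = root / "labels"
    for required in (images_dir, labels_dir):
        if not required.is_dir():
            problems.append(f"missing directory: {required.name}")
    if not (root / "prompt.json").is_file():
        problems.append("missing file: prompt.json")
    if not images_dir.is_dir() or not labels_dir.is_dir():
        # Without both folders there is nothing to pair up.
        return problems

    # Assumption: images and labels are paired by identical file names.
    image_names = {p.name for p in images_dir.iterdir() if p.is_file()}
    label_names = {p.name for p in labels_dir.iterdir() if p.is_file()}
    for name in sorted(image_names - label_names):
        problems.append(f"image without label: {name}")
    for name in sorted(label_names - image_names):
        problems.append(f"label without image: {name}")
    return problems
```

Calling `check_dataset_layout("DIOR-R-train")` returns an empty list when the directory matches the expected structure, and one message per missing or unpaired item otherwise.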
2.2 Weights
Initialize the ControlNet model using the pretrained UNet encoder weights obtained from Stable Diffusion, and subsequently merge these weights with the Stable Diffusion model weights, saving the result as ./model/control_sd15_ini.ckpt.
python ./tools/add_control.py
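The initialization step described above can be illustrated with plain dictionaries standing in for state dicts. This is a sketch of the general ControlNet-style pattern, not the actual contents of `add_control.py`: the function name `init_control_weights` is hypothetical, and the key prefixes `control_model.` and `model.diffusion_model.` are assumptions based on the public ControlNet checkpoint layout.

```python
def init_control_weights(pretrained, scratch,
                         control_prefix="control_model.",
                         source_prefix="model.diffusion_model."):
    """Build an initial state dict for a ControlNet-style model.

    pretrained: state dict of the Stable Diffusion checkpoint (name -> weights).
    scratch:    state dict of the freshly constructed model with a control branch.
    """
    merged = {}
    for key, fresh_value in scratch.items():
        if key.startswith(control_prefix):
            # Control branch: look up the matching UNet weight in the SD checkpoint.
            source_key = source_prefix + key[len(control_prefix):]
        else:
            # Every other module is expected under the same name in the checkpoint.
            source_key = key
        # Layers with no pretrained counterpart keep their fresh initialization.
        merged[key] = pretrained.get(source_key, fresh_value)
    return merged
```

With real checkpoints the values would be tensors, which would normally be cloned before reuse, and the merged dictionary would be saved as `./model/control_sd15_ini.ckpt`.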
python train.py
python merge_weights.py ./path/to/checkpoints
python inference.py
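The four commands above are meant to run in the order shown. As a convenience, they could be chained from a small driver script. The sketch below is an assumption about how one might do that, not something shipped with the repository: `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical names, and `./path/to/checkpoints` is the README's placeholder and must be replaced with a real path.

```python
import subprocess
import sys

# Commands copied from the steps above; the checkpoint path is a placeholder.
PIPELINE = [
    ["./tools/add_control.py"],
    ["train.py"],
    ["merge_weights.py", "./path/to/checkpoints"],
    ["inference.py"],
]


def build_commands(python=sys.executable):
    """Prefix every step with the interpreter that should run it."""
    return [[python, *step] for step in PIPELINE]


def run_pipeline(dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for command in build_commands():
        print(" ".join(command))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(command, check=True)
```

Calling `run_pipeline()` only prints the commands; `run_pipeline(dry_run=False)` executes them from the repository root inside the activated conda environment.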
- Release the paper on arXiv.
- Release the initial code.
- Release the complete code.
- Release synthetic images by OF-Diff.
If you have any questions about the paper or the code, feel free to email me at [email protected]. Email is the quickest way for me to notice and respond.
Our work builds on Stable Diffusion, ControlNet, and RemoteSAM, and we appreciate their outstanding contributions. We are also grateful to AeroGen and CC-Diff for their contributions to remote sensing image generation; their experiments have helped advance the field.
@misc{ye2025objectfidelitydiffusionremote,
title={Object Fidelity Diffusion for Remote Sensing Image Generation},
author={Ziqi Ye and Shuran Ma and Jie Yang and Xiaoyi Yang and Ziyang Gong and Xue Yang and Haipeng Wang},
year={2025},
eprint={2508.10801},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2508.10801},
}