Official PyTorch implementation of "Learned Image Compression for Both Humans and Machines via Dynamic Adaptation".
Lingyu Zhu, Binzhe Li, Riyu Lu, Peilin Chen, Qi Mao, Zhao Wang, Wenhan Yang, Shiqi Wang
[PDF] [Presentation PPT]
Recent advances in neural image compression have shown great potential to outperform conventional standard codecs in both rate-distortion and rate-analysis performance. However, compression for humans and compression for machines have divergent preferences about which information to preserve or reconstruct: compression for humans tends to retain signal fidelity or the perceptual quality of the visual appearance, whereas compression for machines must preserve critical semantic information. As a result, a bitstream typically supports only one of these requirements. To bridge this gap, we propose a dynamic adaptation approach that generates a single bitstream serving both humans and machines. The approach mitigates the domain gap among tasks, which helps maintain performance on out-of-scope tasks. Specifically, the proposed method learns a dynamic adaptation process, i.e., it optimizes the latent representation in the compressed domain in an end-to-end manner while adhering to a rate-performance constraint. Extensive results show that our paradigm significantly reduces the domain gap and surpasses existing codecs.
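To give a feel for the dynamic adaptation idea, the toy sketch below performs gradient descent on a latent vector in the compressed domain, minimizing a joint rate + human-distortion + machine-task objective. All functions here (`joint_loss`, `adapt_latent`) and the quadratic surrogate losses are hypothetical stand-ins for illustration only; the paper's actual networks, losses, and hyper-parameters are not reproduced.

```python
# Hypothetical sketch: adapt a latent y by descending a joint objective
#   L(y) = R(y) + lam_h * D_human(y) + lam_m * D_machine(y)
# with simple quadratic surrogates standing in for the real rate and
# distortion terms (a real implementation would backprop through the
# decoder and downstream task networks).

def joint_loss(y, lam_h=1.0, lam_m=1.0):
    """Joint objective over the latent: surrogate rate plus weighted
    surrogate human-distortion and machine-task losses."""
    rate = sum(v * v for v in y)              # surrogate for bitrate R(y)
    d_human = sum((v - 1.0) ** 2 for v in y)  # surrogate signal-fidelity loss
    d_machine = sum((v - 2.0) ** 2 for v in y)  # surrogate semantic-task loss
    return rate + lam_h * d_human + lam_m * d_machine

def adapt_latent(y, steps=100, lr=0.05, lam_h=1.0, lam_m=1.0):
    """Gradient descent on the joint objective, using the analytic
    gradient of the quadratic surrogates."""
    y = list(y)
    for _ in range(steps):
        grad = [2 * v + lam_h * 2 * (v - 1.0) + lam_m * 2 * (v - 2.0)
                for v in y]
        y = [v - lr * g for v, g in zip(y, grad)]
    return y

y0 = [0.0, 0.5, 3.0]          # latent produced by the (frozen) encoder
y1 = adapt_latent(y0)         # latent after dynamic adaptation
print(joint_loss(y0), joint_loss(y1))  # joint loss drops after adaptation
```

With these surrogates the optimum balances rate against both task losses (each coordinate settles at 1.0), mirroring how a single adapted bitstream can trade off human- and machine-oriented objectives rather than optimizing only one.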
We use RGB images from the COCO 2017 dataset.
If you find the project useful, please cite:
@inproceedings{zhu2024learned,
title={Learned Image Compression for Both Humans and Machines via Dynamic Adaptation},
author={Zhu, Lingyu and Li, Binzhe and Lu, Riyu and Chen, Peilin and Mao, Qi and Wang, Zhao and Yang, Wenhan and Wang, Shiqi},
booktitle={2024 IEEE International Conference on Image Processing (ICIP)},
pages={1788--1794},
year={2024},
organization={IEEE}
}