
Wang, Peng; Ma, Zhe; Dong, Bo; Liu, Xiuhua; Ding, Jishiyu; Sun, Kewu and Chen, Ying (ORCID: https://orcid.org/0009-0008-8183-6729) (2024): Generative data augmentation by conditional inpainting for multi-class object detection in infrared images. In: Pattern Recognition, Vol. 153, 110501 [PDF, 2MB]

Abstract

Multi-class object detection in infrared images is important for both military and civilian applications. Deep learning methods achieve high accuracy but require large-scale datasets. We propose DOCI-GAN, a generative data augmentation framework for infrared multi-class object detection with limited data. The contributions of this paper are fourfold. First, DOCI-GAN is designed as a conditional image inpainting framework that yields paired infrared multi-class object images and annotations. Second, a text-to-image converter is formulated to transform text-format object annotations into bounding-box mask images, which turns the augmentation into a mask-image-to-raw-image translation task. Third, a multiscale morphological erosion-based loss is introduced to alleviate the intensity inconsistency between the inpainted local backgrounds and the global background. Finally, to generate diverse images, artificial multi-class object annotations are combined with real ones during augmentation. Experimental results demonstrate that DOCI-GAN augments the dataset with high-quality infrared multi-class object images and consequently improves the accuracy of object detection baselines.
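
The text-to-image conversion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the annotation format or the mask encoding, so everything below is an assumption made for illustration: each annotation line is taken to be `<class_id> <cx> <cy> <w> <h>` with coordinates normalized to [0, 1], and the hypothetical helper `annotations_to_mask` stores class `k` as pixel value `k + 1` on a zero background. It is not the authors' implementation.

```python
from typing import List


def annotations_to_mask(lines: List[str], width: int, height: int) -> List[List[int]]:
    """Rasterize text-format box annotations into a single-channel label mask.

    Assumed (hypothetical) line format: "<class_id> <cx> <cy> <w> <h>",
    with box centre and size normalized to [0, 1].
    Background pixels are 0; a box of class k is filled with k + 1.
    """
    mask = [[0] * width for _ in range(height)]
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        cx, cy, w, h = (float(p) for p in parts[1:])
        # Convert the normalized centre/size box to clipped pixel corners.
        x0 = max(0, int(round((cx - w / 2) * width)))
        x1 = min(width, int(round((cx + w / 2) * width)))
        y0 = max(0, int(round((cy - h / 2) * height)))
        y1 = min(height, int(round((cy + h / 2) * height)))
        for y in range(y0, y1):
            for x in range(x0, x1):
                mask[y][x] = class_id + 1  # later boxes overwrite earlier ones
    return mask


if __name__ == "__main__":
    example = ["0 0.5 0.5 0.5 0.5", "2 0.125 0.125 0.25 0.25"]
    for row in annotations_to_mask(example, width=8, height=8):
        print("".join(str(v) for v in row))
```

A mask of this kind could serve as the conditioning input of a mask-image-to-raw-image generator. Mixing real annotation lines with artificially generated ones, as the abstract describes, would then amount to passing a combined list of lines to the same function.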
