Abstract
Multi-class object detection in infrared images is important for both military and civilian applications. Deep learning methods can achieve high accuracy but require large-scale datasets. We propose DOCI-GAN, a generative data augmentation framework for infrared multi-class object detection with limited data. The contributions of this paper are fourfold. First, DOCI-GAN is designed as a conditional image inpainting framework that yields paired infrared multi-class object images and annotations. Second, a text-to-image converter is formulated to transform text-format object annotations into bounding box mask images, so that augmentation becomes a mask-image-to-raw-image translation. Third, a multiscale morphological erosion-based loss is introduced to alleviate the intensity inconsistency between the inpainted local backgrounds and the global background. Finally, to generate diverse images, artificial multi-class object annotations are combined with real ones during augmentation. Experimental results demonstrate that DOCI-GAN augments the dataset with high-quality infrared multi-class object images and consequently improves the accuracy of object detection baselines.
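The text-to-image conversion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the annotation format or the mask encoding, so everything here is an assumption: the function name `annotations_to_mask` is hypothetical, each annotation line is assumed to be YOLO-style (`class_id cx cy w h`, with coordinates normalized to [0, 1]), and each box is filled with `class_id + 1` so that 0 is reserved for background.

```python
import numpy as np


def annotations_to_mask(lines, height, width):
    """Rasterize text annotations into a bounding-box mask image.

    Assumption: each line is "class_id cx cy w h" with normalized
    center coordinates and sizes (YOLO-style). This is an illustrative
    encoding, not necessarily the one used by DOCI-GAN.
    """
    mask = np.zeros((height, width), dtype=np.uint8)  # 0 = background
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        cx, cy, w, h = (float(v) for v in parts[1:])
        # Convert normalized center/size to pixel corners, clipped to the image.
        x0 = max(int(round((cx - w / 2) * width)), 0)
        x1 = min(int(round((cx + w / 2) * width)), width)
        y0 = max(int(round((cy - h / 2) * height)), 0)
        y1 = min(int(round((cy + h / 2) * height)), height)
        # Fill the box region with a per-class label (later boxes overwrite earlier ones).
        mask[y0:y1, x0:x1] = class_id + 1
    return mask


example_lines = ["0 0.5 0.5 0.5 0.5", "2 0.125 0.125 0.25 0.25"]
example_mask = annotations_to_mask(example_lines, height=8, width=8)
```

A mask of this kind could then serve as the conditioning input to an image-to-image generator; how DOCI-GAN actually encodes classes and feeds the mask to the network is not stated in the abstract.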
Document type: | Journal article |
---|---|
Faculty: | Medicine > Institute for Stroke and Dementia Research (ISD) |
Subjects: | 600 Technology, medicine, applied sciences > 610 Medicine and health |
URN: | urn:nbn:de:bvb:19-epub-120218-3 |
ISSN: | 00313203 |
Language: | English |
Document ID: | 120218 |
Date of publication on Open Access LMU: | 27 Aug. 2024, 13:49 |
Last modified: | 27 Aug. 2024, 13:49 |