ORCID: https://orcid.org/0000-0002-5012-0874 and Ommer, Björn
ORCID: https://orcid.org/0000-0003-0766-120X
(2023):
Cross-Image-Attention for Conditional Embeddings in Deep Metric Learning.
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17-24 June 2023.
Institute of Electrical and Electronics Engineers (IEEE) (ed.),
In: CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability,
Piscataway, NJ: IEEE. pp. 1170-1181
Abstract
Learning compact image embeddings that yield semantic similarities between images and that generalize to unseen test classes is at the core of deep metric learning (DML). Finding a mapping from a rich, localized image feature map onto a compact embedding vector is challenging: although similarity emerges between tuples of images, DML approaches marginalize out information in an individual image before considering another image to which similarity is to be computed. Instead, we propose during training to condition the embedding of an image on the image we want to compare it to. Rather than embedding by a simple pooling as in standard DML, we use cross-attention so that one image can identify relevant features in the other image. Consequently, the attention mechanism establishes a hierarchy of conditional embeddings that gradually incorporates information about the tuple to steer the representation of an individual image. The cross-attention layers bridge the gap between the original unconditional embedding and the final similarity and allow backpropagation to update encodings more directly than through a lossy pooling layer. At test time we use the resulting improved unconditional embeddings, thus requiring no additional parameters or computational overhead. Experiments on established DML benchmarks show that our cross-attention conditional embedding during training improves the underlying standard DML pipeline significantly so that it outperforms the state-of-the-art.
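The contrast the abstract draws between pooling and conditioning can be illustrated with a toy sketch. This is not the authors' architecture: it assumes a single attention step with no learned projections, uses the mean-pooled embedding of the second image as the query, and represents each image as a short list of local feature vectors. All function names and the toy data are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mean_pool(features):
    """Unconditional embedding: average the local feature vectors."""
    n = len(features)
    dim = len(features[0])
    return [sum(f[k] for f in features) / n for k in range(dim)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def conditional_embedding(features_a, features_b):
    """Embed image A conditioned on image B.

    Assumption for illustration: the query is B's pooled embedding and
    A's local features act as both keys and values (no learned weights).
    """
    query = mean_pool(features_b)
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, f) / scale for f in features_a])
    dim = len(features_a[0])
    return [sum(w * f[k] for w, f in zip(weights, features_a))
            for k in range(dim)]

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

# Toy data: image A has two distinct local features, image B only one kind.
image_a = [[1.0, 0.0], [0.0, 1.0]]
image_b = [[4.0, 0.0], [4.0, 0.0]]

uncond_a = mean_pool(image_a)                     # ignores image B
cond_a = conditional_embedding(image_a, image_b)  # weighted toward B's content
embed_b = mean_pool(image_b)

print("unconditional similarity:", cosine(uncond_a, embed_b))
print("conditional similarity:  ", cosine(cond_a, embed_b))
```

In this toy case the conditional embedding of A puts more weight on the local feature that also occurs in B, so its similarity to B is higher than that of the pooled embedding. The abstract states that the conditioning is applied only during training and that the unconditional embeddings are used at test time; the sketch covers only the conditioning step, not training.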
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Faculties: | Mathematics, Computer Science and Statistics; Mathematics, Computer Science and Statistics > Computer Science |
Subjects: | 000 Computer science, information and general works > 004 Data processing computer science; 700 Arts and recreation > 770 Photography and computer art |
Place of Publication: | Piscataway, NJ |
Annotation: | ISBN 979-8-3503-0129-8 ; 979-8-3503-0130-4 (ISBN of the print edition) |
Language: | English |
Item ID: | 121304 |
Date Deposited: | 13. Sep 2024 12:07 |
Last Modified: | 08. Oct 2024 15:22 |