Kotovenko, Dmytro; Ma, Pingchuan; Milbich, Timo (ORCID: https://orcid.org/0000-0002-5012-0874) and Ommer, Björn (ORCID: https://orcid.org/0000-0003-0766-120X) (2023): Cross-Image-Attention for Conditional Embeddings in Deep Metric Learning. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17-24 June 2023. Institute of Electrical and Electronics Engineers (IEEE) (Ed.), In: CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability, Piscataway, NJ: IEEE. pp. 1170-1181

Full text not available on 'Open Access LMU'.

Abstract

Learning compact image embeddings that yield semantic similarities between images and that generalize to unseen test classes is at the core of deep metric learning (DML). Finding a mapping from a rich, localized image feature map onto a compact embedding vector is challenging: although similarity emerges between tuples of images, DML approaches marginalize out information in an individual image before considering another image to which similarity is to be computed. Instead, we propose to condition, during training, the embedding of an image on the image we want to compare it to. Rather than embedding by a simple pooling as in standard DML, we use cross-attention so that one image can identify relevant features in the other image. Consequently, the attention mechanism establishes a hierarchy of conditional embeddings that gradually incorporates information about the tuple to steer the representation of an individual image. The cross-attention layers bridge the gap between the original unconditional embedding and the final similarity and allow backpropagation to update encodings more directly than through a lossy pooling layer. At test time we use the resulting improved unconditional embeddings, thus requiring no additional parameters or computational overhead. Experiments on established DML benchmarks show that our cross-attention conditional embedding during training improves the underlying standard DML pipeline significantly so that it outperforms the state-of-the-art.
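
The contrast the abstract draws between pooled (unconditional) and cross-attended (conditional) embeddings can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention step with no learned query, key, or value projections, a residual update, and mean pooling, and every function name here is hypothetical. The paper describes a hierarchy of cross-attention layers, which this toy version does not reproduce.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    # Subtract the max for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mean_pool(feats):
    # Collapse a list of local feature vectors into one vector.
    n = len(feats)
    return [sum(col) / n for col in zip(*feats)]


def l2_normalize(v):
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]


def cross_attend(feats_a, feats_b):
    # Each location of image A attends over all locations of image B.
    # Assumption: identity projections and a residual connection.
    d = len(feats_a[0])
    out = []
    for q in feats_a:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in feats_b])
        ctx = [sum(w * k[j] for w, k in zip(weights, feats_b)) for j in range(d)]
        out.append([qi + ci for qi, ci in zip(q, ctx)])
    return out


def unconditional_embedding(feats):
    # Standard DML style: pool one image without looking at the other.
    return l2_normalize(mean_pool(feats))


def conditional_embedding(feats_a, feats_b):
    # Embedding of image A conditioned on image B (training-time only).
    return l2_normalize(mean_pool(cross_attend(feats_a, feats_b)))


# Toy feature maps: 6 and 5 spatial locations, 4 channels each.
rng = random.Random(0)
feats_a = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]
feats_b = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]

emb_a_given_b = conditional_embedding(feats_a, feats_b)
emb_b_given_a = conditional_embedding(feats_b, feats_a)
train_similarity = dot(emb_a_given_b, emb_b_given_a)
test_similarity = dot(unconditional_embedding(feats_a),
                      unconditional_embedding(feats_b))
```

In this sketch, `train_similarity` stands in for the similarity a training loss would see, while `test_similarity` uses only the unconditional embeddings, mirroring the abstract's statement that no additional parameters or computation are needed at test time.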
