Deep Metric Learning: Loss Functions Comparison

Vasilev, R. L.; D’yakonov, A. G.

doi:10.1134/S1064562423701053

Deep Metric Learning: Loss Functions Comparison

Published: 09 February 2024

Volume 108, pages S215–S225, (2023)
Cite this article

Doklady Mathematics Aims and scope Submit manuscript

R. L. Vasilev¹ &
A. G. D’yakonov¹

31 Accesses
Explore all metrics

Abstract

An overview of deep metric learning methods is presented. Although they have appeared in recent years, these methods were compared only with their predecessors, with neural networks of outdated architectures used for representation learning (representations on which the metric is calculated). The described methods were compared on different datasets from several domains, using pre-trained neural networks comparable in performance to SotA (state of the art): ConvNeXt for images and DistilBERT for texts. Labeled datasets were used, divided into two parts (train and test) so that the classes did not overlap (i.e., for each class its objects are fully in train or fully in test). Such a large-scale honest comparison was made for the first time and led to unexpected conclusions, viz. some “old” methods, for example, Tuplet Margin Loss, are superior in performance to their modern modifications and methods proposed in very recent works.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1.

Fig. 2.

Fig. 3.

Fig. 4.

A Metric Learning Reality Check

Deep Metric Learning Using Triplet Network

Deep Metric Learning with Hierarchical Triplet Loss

REFERENCES

W. Chen, Y. Liu, W. Wang, E. M. Bakker, T. K. Georgiou, P. Fieguth, L. Liu, and M. S. K. Lew, “Deep image retrieval: A survey” (2021). https://arxiv.org/abs/2101.11282
N. Reimers and I. Gurevych, “Sentence-bert: Sentence embeddings using Siamese bert-networks” (2019). arXiv:1908.10084
I. Masi, Y. Wu, T. Hassner, and P. Natarajan, “Deep face recognition: A survey,” in 2018 31st SIBGRAPI Conference on Graphics, Patterns and Images (IEEE, 2018), pp. 471–478.
M. Je, J. Shen, G. Lin, T. Xiang, L. Shao, and C. H. S. Hoi, “Deep learning for person re-identification: A survey and outlook,” IEEE Transactions on Pattern Analysis and Machine Intelligence (2021).
K. Musgrave, S. Belongie, and S.-N. Lim, “A metric learning reality check,” in European Conference on Computer Vision (Springer, Berlin, 2020), pp. 681–699.
J. Johnson, M. Douze, and H. Jégou, “Billion-scale similarity search with GPUs,” IEEE Trans. Big Data 7 (3), 535–547 (2019).
Article Google Scholar
S. Chopra, R. Hadsell, and Y. LeCun, “Learning a similarity metric discriminatively, with application to face verification,” in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2005), Vol. 1, pp. 539–546.
F. Schroff, D. Kalenichenko, and J. Philbin, “Facenet: A unified embedding for face recognition and clustering,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015), pp. 815–823.
F. Cakir, K. He, X. Xia, B. Kulis, and S. Sclaroff, “Deep metric learning to rank,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2019), pp. 1861–1870.
E. Ustinova and V. Lempitsky, “Learning deep embeddings with histogram loss,” Advances in Neural Information Processing Systems (2016), Vol. 29.
V. Wieczorek, B. Rychalska, and J. Dąbrowski, “On the unreasonable effectiveness of centroids in image retrieval,” in International Conference on Neural Information Processing (Springer, Berlin, 2021), pp. 212–223.
W. Chao-Yuan, R. Manmatha, A. J. Smola, and P. Krahenbuhl, “Sampling matters in deep embedding learning,” in Proceedings of the IEEE International Conference on Computer Vision (2017), pp. 2840–2848.
W. Xun, H. Xintong, H. Weilin, D. Dengke, and M. R. Scott, “Multisimilarity loss with general pair weighting for deep metric learning,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2019), pp. 5022–5030.
N. Frosst, N. Papernot, and G. Hinton, “Analyzing and improving representations with the soft nearest neighbor loss,” Proc. Mach. Learn. Res. 97, 2012–2020 (2019).
Google Scholar
P. Khosla, P. Teterwak, C. Wang, A. Sarna, Y. Tian, P. Isola, A. Maschinot, C. Liu, and D. Krishnan, “Supervised contrastive learning,” Adv. Neural Inf. Process. Syst. 33, 18661–18673 (2020).
Google Scholar
T. Yuan, W. Deng, J. Tang, Y. Tang, and B. Chen, “Signal-to-noise ratio: A robust distance metric for deep metric learning,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2019), pp. 4815–4824.
B. Yu and D. Tao, “Deep metric learning with tuplet margin loss,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (2019), pp. 6490–6499.
Y. Sun, C. Cheng, Y. Zhang, C. Zhang, L. Zheng, Z. Wang, and Y. Wei, “Circle loss: A unified perspective of pair similarity optimization,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2020), pp. 6398–6407.
J. Deng, J. Guo, N. Xue, and S. Zafeiriou, “Arcface: Additive angular margin loss for deep face recognition,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2019), pp. 4690–4699.
H. Wang, Y. Wang, Z. Zhou, X. Ji, D. Gong, J. Zhou, Z. Li, and W. Liu, “Cosface: Large margin cosine loss for deep face recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018), pp. 5265–5274.
J. Deng, J. Guo, T. Liu, M. Gong, and S. Zafeiriou, “Subcenter arcface: Boosting face recognition by large-scale noisy web faces,” in European Conference on Computer Vision (Springer, Berlin, 2020), pp. 741–757.
Q. Qian, L. Shang, B. Sun, J. Hu, H. Li, and R. Jin, “Softtriple loss: Deep metric learning without triplet sampling,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (2019), pp. 6450–6458.
S. Kim, D. Kim, M. Cho, and S. Kwak, “Proxy anchor loss for deep metric learning,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2020), pp. 3238–3247.
J. Krause, M. Stark, J. Deng, and L. Fei-Fei, “3D object representations for fine-grained categorization,” in The 4th International IEEE Workshop on 3D Representation and Recognition, Sydney, Australia (2013).
C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie, Caltech-UCSD Birds-200-2011 Dataset (2011).
H. O. Song, Y. Xiang, S. Jegelka, and S. Savarese, “Deep metric learning via lifted structured feature embedding,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016).
D.-N. Zou, S.-H. Zhang, T.-J. Mu, and M. Zhang, “A new dataset of dog breed images and a benchmark for fine-grained classification,” Comput. Visual Media 6 (4), 477–487 (2020).
Article Google Scholar
K. Lang, “Newsweeder: Learning to filter netnews,” in Machine Learning Proceedings (Elsevier, 1995), pp. 331–339.
Google Scholar
K. Kowsari, D. E. Brown, M. Heidarysafa, K. J. Meimandi, M. S. Gerber, and L. E. Barnes, “Hdltex: Hierarchical deep learning for text classification,” in 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA) (IEEE, 2017), pp. 364–371.
K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 770–778.
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015). pp. 1–9.
Z. Liu, H. Mao, C.-Y. Wu, C. Feichtenhofer, T. Darrell, and S. Xiell, “A ConvNet for the 2020s,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022).
V. Sanh, L. Debut, J. Chaumond, and T. Wolf, “DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter” (2019). https://arxiv.org/abs/1910.01108
J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding” (2018). https://arxiv.org/abs/1810.04805
K. Musgrave, S. Belongie, and S.-N. Lim, “PyTorch metric learning” (2020). https://arxiv.org/abs/2008.09164
D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization” (2014). https://arxiv.org/abs/1412.6980
J. Wohlwend, E. R. Elenberg, S. Altschul, S. Henry, and T. Lei, “Metric learning for dynamic text classification” (2019). https://arxiv.org/abs/1911.01026

Download references

Funding

This work was supported by the ongoing institutional funding. No additional grants to carry out or direct this particular research were obtained.

Author information

Authors and Affiliations

Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Moscow, Russia
R. L. Vasilev & A. G. D’yakonov

Authors

R. L. Vasilev
View author publications
You can also search for this author in PubMed Google Scholar
A. G. D’yakonov
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to R. L. Vasilev or A. G. D’yakonov.

Ethics declarations

The authors of this work declare that they have no conflicts of interest.

Additional information

Publisher’s Note.

Pleiades Publishing remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Vasilev, R.L., D’yakonov, A.G. Deep Metric Learning: Loss Functions Comparison. Dokl. Math. 108 (Suppl 2), S215–S225 (2023). https://doi.org/10.1134/S1064562423701053

Download citation

Received: 30 June 2023
Revised: 19 September 2023
Accepted: 15 October 2023
Published: 09 February 2024
Issue Date: December 2023
DOI: https://doi.org/10.1134/S1064562423701053

Keywords:

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions