Abstract
The metric learning area studies methodologies for finding the most appropriate distance function for a given dataset. Dimensionality reduction algorithms have been shown to be closely related to metric learning because, in addition to producing a more compact representation of the data, such methods also implicitly derive a distance function that best represents the similarity between pairs of objects in the collection. Principal Component Analysis is a traditional linear dimensionality reduction algorithm that is still widely used by researchers. However, it faithfully represents outliers in the generated space, which can be an undesirable characteristic in pattern recognition applications. With this in mind, it was proposed to replace the traditional pointwise approach with a contextual one based on the neighborhoods of the data samples. This approach implements a mapping from the usual feature space to a parametric feature space, where the difference between two samples is defined by the vector whose scalar coordinates are given by the statistical divergence between two probability distributions. For some divergences, the new approach has been shown to outperform several existing dimensionality reduction algorithms on a wide range of datasets. Nevertheless, it is important to investigate the framework's sensitivity to the choice of divergence. This paper presents experiments using the Total Variation, Rényi, Sharma-Mittal, and Tsallis divergences, and the results evidence the method's robustness.
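To make the contextual idea concrete, the following is a minimal sketch of a divergence-based embedding, assuming Gaussian neighborhood models: each sample is summarized by the per-feature mean and variance of its k nearest neighbors, pairwise dissimilarities are obtained as symmetrized Kullback-Leibler divergences between these Gaussians (summed over features), and classical MDS is applied to the resulting matrix. This is not the paper's exact formulation (the published method builds a parametric feature space rather than applying MDS to a divergence matrix); the function names and parameters here are illustrative only.

```python
import numpy as np

def kl_gauss(mu_p, var_p, mu_q, var_q):
    """Closed-form KL divergence between univariate Gaussians, per feature."""
    return 0.5 * np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / (2 * var_q) - 0.5

def parametric_embedding(X, k=10, n_components=2):
    n, d = X.shape
    # k nearest neighbors of each sample (index 0 is the sample itself)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    idx = np.argsort(D, axis=1)[:, 1:k + 1]
    # Gaussian model of each neighborhood: per-feature mean and variance
    mu = np.array([X[idx[i]].mean(axis=0) for i in range(n)])
    var = np.array([X[idx[i]].var(axis=0) + 1e-6 for i in range(n)])
    # symmetrized KL between neighborhood models, summed over features
    Dkl = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            kl = (kl_gauss(mu[i], var[i], mu[j], var[j])
                  + kl_gauss(mu[j], var[j], mu[i], var[i]))
            Dkl[i, j] = Dkl[j, i] = kl.sum()
    # classical MDS: double-center the squared dissimilarities, eigendecompose
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (Dkl ** 2) @ J
    w, V = np.linalg.eigh(B)
    order = np.argsort(w)[::-1][:n_components]
    return V[:, order] * np.sqrt(np.maximum(w[order], 0))
```

Swapping `kl_gauss` for a closed-form Total Variation, Rényi, Sharma-Mittal, or Tsallis divergence between Gaussians is the kind of substitution whose effect the paper's experiments measure.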
Availability of data and materials
The datasets analyzed during the current study are available at www.openml.org.
Code availability
Code is available from the corresponding author on reasonable request.
Funding
No funds, grants, or other support were received.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Nakao, E.K., Levada, A.L.M. Information theory divergences in principal component analysis. Pattern Anal Applic 27, 19 (2024). https://doi.org/10.1007/s10044-024-01215-w