
Information theory divergences in principal component analysis

  • Short Paper
  • Published:
Pattern Analysis and Applications

Abstract

The metric learning area studies methodologies to find the most appropriate distance function for a given dataset. It has been shown that dimensionality reduction algorithms are closely related to metric learning because, in addition to producing a more compact representation of the data, such methods also implicitly derive a distance function that best represents the similarity between pairs of objects in the collection. Principal Component Analysis is a traditional linear dimensionality reduction algorithm that is still widely used by researchers. However, its procedure faithfully represents outliers in the generated space, which can be an undesirable characteristic in pattern recognition applications. With this in mind, the replacement of the traditional pointwise approach by a contextual one, based on the neighborhoods of the data samples, was proposed. This approach implements a mapping from the usual feature space to a parametric feature space, in which the difference between two samples is defined by the vector whose scalar coordinates are given by the statistical divergence between two probability distributions. For some divergences, it has been demonstrated that the new approach outperforms several existing dimensionality reduction algorithms on a wide range of datasets. However, it is important to investigate the sensitivity of the framework to the choice of divergence. Experiments using the Total Variation, Rényi, Sharma–Mittal and Tsallis divergences are reported in this paper, and the results evidence the robustness of the method.

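To make the contextual idea concrete, the sketch below illustrates (but does not reproduce) the kind of pipeline the abstract describes. It is a minimal sketch under several assumptions that go beyond the text: k-nearest-neighbor neighborhoods, a univariate Gaussian model per feature, the closed-form Rényi divergence of order α between Gaussians, and a PCA-style eigendecomposition of a surrogate scatter matrix built from divergence-valued difference vectors taken against an average reference distribution. The function names (`renyi_divergence_gauss`, `parametric_pca`), the choice of reference distribution and the symmetrization are illustrative, not the authors' formulation.

```python
import numpy as np
from scipy.spatial.distance import cdist

def renyi_divergence_gauss(mu1, var1, mu2, var2, alpha=0.5, eps=1e-10):
    """Renyi divergence of order alpha between N(mu1, var1) and N(mu2, var2).

    Closed form for univariate Gaussians; valid for 0 < alpha < 1 (and for
    alpha > 1 whenever the mixed variance below stays positive).
    """
    var1, var2 = var1 + eps, var2 + eps                      # guard against zero variance
    var_star = alpha * var2 + (1.0 - alpha) * var1           # mixed variance
    mean_term = alpha * (mu1 - mu2) ** 2 / (2.0 * var_star)
    var_term = ((1.0 - alpha) * np.log(var1) + alpha * np.log(var2)
                - np.log(var_star)) / (2.0 * (alpha - 1.0))
    return mean_term + var_term

def parametric_pca(X, k=10, alpha=0.5, n_components=2):
    """Sketch of a contextual, divergence-based PCA in the spirit of the
    approach described in the abstract (illustrative assumptions only)."""
    n, m = X.shape
    # 1) k-NN neighborhoods in the original feature space (each includes the sample itself).
    idx = np.argsort(cdist(X, X), axis=1)[:, :k + 1]
    # 2) Parametric feature space: per-sample, per-feature Gaussian parameters.
    mu = np.array([X[nb].mean(axis=0) for nb in idx])        # shape (n, m)
    var = np.array([X[nb].var(axis=0) for nb in idx])        # shape (n, m)
    # 3) Difference vectors: coordinate-wise divergences against an average
    #    reference distribution, replacing the usual deviations from the mean.
    mu_bar, var_bar = mu.mean(axis=0), var.mean(axis=0)
    D = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            # Symmetrized so the surrogate scatter matrix is well behaved.
            D[i, j] = 0.5 * (renyi_divergence_gauss(mu[i, j], var[i, j],
                                                    mu_bar[j], var_bar[j], alpha)
                             + renyi_divergence_gauss(mu_bar[j], var_bar[j],
                                                      mu[i, j], var[i, j], alpha))
    # 4) Eigendecomposition of the surrogate scatter matrix, exactly as in PCA.
    S = np.cov(D.T)
    eigvals, eigvecs = np.linalg.eigh(S)                     # ascending eigenvalues
    W = eigvecs[:, ::-1][:, :n_components]                   # top components
    return D @ W

# Usage: X is an (n_samples, n_features) array; Y is the 2-D embedding.
# X = np.random.default_rng(0).normal(size=(200, 8))
# Y = parametric_pca(X, k=15, alpha=0.5)
```

The other divergences studied in the paper (Total Variation, Sharma–Mittal, Tsallis) would slot into the same place as `renyi_divergence_gauss`; only the closed-form expression for two univariate Gaussians changes.
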
Availability of data and materials

The datasets analyzed during the current study are available at www.openml.org.

Code availability

Code is available from the corresponding author on reasonable request.


Funding

No funds, grants, or other support were received.

Author information


Corresponding author

Correspondence to Eduardo K. Nakao.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nakao, E.K., Levada, A.L.M. Information theory divergences in principal component analysis. Pattern Anal Applic 27, 19 (2024). https://doi.org/10.1007/s10044-024-01215-w

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s10044-024-01215-w
