research-article

Parametric Information Geometry with the Package Geomstats

Published: 15 December 2023

Abstract

We introduce the information geometry module of the Python package Geomstats. The module first implements Fisher–Rao Riemannian manifolds of widely used parametric families of probability distributions, such as the normal, gamma, beta, and Dirichlet distributions, among others. The module further provides the Fisher–Rao Riemannian geometry of any parametric family of distributions of interest, given a parameterized probability density function as input. The implemented Riemannian geometry tools allow users to compare, average, and interpolate between distributions within a given family. Importantly, such capabilities open the door to statistics and machine learning on probability distributions. We present the object-oriented implementation of the module along with illustrative examples and show how it can be used to perform learning on manifolds of parametric probability distributions.
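
To make "comparing distributions inside a given family" concrete, here is a minimal sketch for the univariate normal family, where the Fisher–Rao distance has a well-known closed form. It deliberately does not use the Geomstats API: the function name `fisher_rao_distance_normal` is hypothetical and only the Python standard library is assumed. It relies on the standard fact that the Fisher–Rao metric of the family N(mu, sigma^2) is (dmu^2 + 2 dsigma^2) / sigma^2, which is twice the hyperbolic metric of the upper half-plane after rescaling the mean by 1/sqrt(2).

```python
import math


def fisher_rao_distance_normal(mu1, sigma1, mu2, sigma2):
    """Fisher-Rao distance between N(mu1, sigma1^2) and N(mu2, sigma2^2).

    Illustrative helper (hypothetical name, not part of Geomstats).
    """
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("standard deviations must be positive")
    # With x = mu / sqrt(2) and y = sigma, the Fisher-Rao metric equals
    # 2 * (dx^2 + dy^2) / y^2, i.e. twice the hyperbolic half-plane metric.
    squared_gap = 0.5 * (mu1 - mu2) ** 2 + (sigma1 - sigma2) ** 2
    # Hyperbolic half-plane distance, scaled by sqrt(2).
    return math.sqrt(2.0) * math.acosh(1.0 + squared_gap / (2.0 * sigma1 * sigma2))


if __name__ == "__main__":
    # Two normal distributions with the same mean but different spreads.
    print(fisher_rao_distance_normal(0.0, 1.0, 0.0, math.e))
    # Same spreads, shifted mean: the distance shrinks as sigma grows.
    print(fisher_rao_distance_normal(0.0, 1.0, 1.0, 1.0))
    print(fisher_rao_distance_normal(0.0, 5.0, 1.0, 5.0))
```

For equal means the formula reduces to sqrt(2) * |ln(sigma2 / sigma1)|, so the distance depends only on the ratio of the standard deviations. A fixed shift of the mean counts for less when both distributions are wide, which is the kind of behavior a Euclidean distance on the parameters (mu, sigma) would miss.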

Published in

ACM Transactions on Mathematical Software, Volume 49, Issue 4
December 2023
226 pages
ISSN: 0098-3500
EISSN: 1557-7295
DOI: 10.1145/3637452
Editors: Zhaojun Bai, Wolfgang Bangerth

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 15 December 2023
• Online AM: 13 October 2023
• Accepted: 25 September 2023
• Revised: 13 July 2023
• Received: 18 November 2022
