Abstract
A model of population genetics of the Lotka–Volterra type with mutations on a statistical manifold is introduced. Mutations in the model are described by diffusion on a statistical manifold whose generator is the Laplace–Beltrami operator for the Fisher–Rao metric; that is, the model combines population genetics and information geometry. The model generalizes the generative adversarial network (GAN) model of machine learning theory to the case of populations of generative adversarial networks, and it describes the control of overfitting for generative adversarial networks.
References
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in: 28th Annual Conference on Neural Information Processing Systems 2014 (Montreal, Canada, 8–13 December, 2014; Advances in Neural Information Processing Systems, Vol. 27; Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence, and K. Q. Weinberger, eds.), NIPS Foundation, La Jolla, CA (2014), pp. 2672–2680; arXiv: 1406.2661.
I. Goodfellow, “NIPS 2016 tutorial: Generative adversarial networks,” arXiv: 1701.00160.
S. V. Kozyrev, “Learning theory and population genetics,” Lobachevskii J. Math., 43, 1655–1662 (2022).
S. V. Kozyrev, “Learning by population genetics and matrix Riccati equation,” Entropy, 25, 348, 9 pp. (2023).
A. M. Turing, “Computing machinery and intelligence,” Mind, LIX, 433–460 (1950).
V. Vanchurin, Yu. I. Wolf, M. I. Katsnelson, and E. V. Koonin, “Towards a theory of evolution as multilevel learning,” Proc. Natl. Acad. Sci. USA, 119, e2120037119, 12 pp. (2022).
S. Nikolenko, A. Kadurin, and E. Arkhangelskaya, Deep Learning. Immersion in the World of Neural Networks [in Russian], Piter, St. Petersburg (2018).
M. Eigen, J. McCaskill, and P. Schuster, “Molecular quasi-species,” J. Phys. Chem., 92, 6881–6891 (1988).
E. A. Morozova and N. N. Chentsov, “Natural geometry of families of probability laws” [in Russian], in: Probability Theory – 8 (Itogi Nauki i Tekhniki. Ser. Sovrem. Probl. Mat. Fund. Napr., Vol. 83), VINITI, Moscow (1991), pp. 133–265.
N. N. Čencov, Statistical Decision Rules and Optimal Inferences (Translations of Mathematical Monographs, Vol. 53), AMS, Providence, RI (1972).
S. Amari, Differential-Geometrical Methods in Statistics (Lecture Notes in Statistics, Vol. 28), Springer, Berlin (1985).
S. Amari and H. Nagaoka, Methods of Information Geometry (Translations of Mathematical Monographs, Vol. 191), AMS, Providence, RI (2000).
P. Gibilisco, E. Riccomagno, M. P. Rogantin, and H. P. Wynn (eds.), Algebraic and Geometric Methods in Statistics, Cambridge Univ. Press, Cambridge (2010).
N. Combe, Yu. I. Manin, and M. Marcolli, “Geometry of information: Classical and quantum aspects,” Theor. Comput. Sci., 908, 2–27 (2022); arXiv: 2107.08006.
V. N. Vapnik, The Nature of Statistical Learning Theory, Springer, New York (2000).
S. Hochreiter and J. Schmidhuber, “Flat minima,” Neural Computation, 9, 1–42 (1997).
O. Bousquet and A. Elisseeff, “Stability and generalization,” J. Mach. Learn. Res., 2, 499–526 (2002).
S. Kutin and P. Niyogi, “Almost-everywhere algorithmic stability and generalization error,” in: Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI2002) (Alberta, Canada, August 1–4, 2002; A. Darwiche and N. Friedman, eds.), Morgan Kaufmann Publ., San Francisco, CA (2002), pp. 275–282; arXiv: 1301.0579.
T. Poggio, R. Rifkin, S. Mukherjee, and P. Niyogi, “General conditions for predictivity in learning theory,” Nature, 428, 419–422 (2004).
R. A. Fisher, The Genetical Theory of Natural Selection, Oxford Univ. Press, Oxford (1999).
J. M. Smith, Evolution and the Theory of Games, Cambridge Univ. Press, Cambridge (1982).
Funding
This work was supported by the Russian Science Foundation under grant No. 19-11-00320, https://rscf.ru/en/project/19-11-00320/.
Ethics declarations
The author of this work declares that he has no conflicts of interest.
Additional information
Translated from Teoreticheskaya i Matematicheskaya Fizika, 2024, Vol. 218, pp. 320–329 https://doi.org/10.4213/tmf10533.
Appendix
Maximum likelihood method
We consider a parametric family of probability distributions \(p(x,\theta)\), \(x\in X\), with a parameter \(\theta\). Let \(\{x_i\}\), \(i=1,\dots,N\), be a sample of independent trials in \(X\) (we assume the existence of a probability distribution \(q\) on \(X\) that generates such a sample). The likelihood function has the form of the product of probability densities for events from the sample,
\(L(\theta)=\prod_{i=1}^{N} p(x_i,\theta)\).
A maximum likelihood estimate for \(\theta\) is a point of maximum of the likelihood,
\(\theta^{*}=\arg\max_{\theta}L(\theta)=\arg\max_{\theta}\sum_{i=1}^{N}\log p(x_i,\theta)\).
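As a numerical illustration (a sketch, not taken from the paper; the Gaussian family \(p(x,\theta)\) with unit variance and the generated sample are hypothetical), the maximum likelihood estimate of a Gaussian mean is the sample mean, and the log-likelihood is indeed maximal there:

```python
import math
import random

def log_likelihood(sample, theta, sigma=1.0):
    """Log-likelihood of the Gaussian family p(x, theta) = N(theta, sigma^2)."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma**2) - (x - theta) ** 2 / (2 * sigma**2)
        for x in sample
    )

random.seed(0)
sample = [random.gauss(2.0, 1.0) for _ in range(1000)]  # hypothetical sample from q

# For a Gaussian family with known variance, the MLE of theta is the sample mean.
theta_mle = sum(sample) / len(sample)

# The log-likelihood at the MLE dominates nearby parameter values.
assert log_likelihood(sample, theta_mle) > log_likelihood(sample, theta_mle + 0.1)
assert log_likelihood(sample, theta_mle) > log_likelihood(sample, theta_mle - 0.1)
```

The Gaussian log-likelihood is an exactly quadratic function of \(\theta\) peaked at the sample mean, so the comparison holds for any sample.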
Overfitting is a well-known problem in learning theory: a distribution \(p(x,\theta)\) fitted to the training sample \(\{x_i\}\) can have a high likelihood on that sample but a low likelihood on a control sample \(\{x'_i\}\) (another sample generated by the same distribution \(q\)). The control of overfitting in learning theory usually reduces to regularization (terms added to the empirical sum that change the form of the likelihood function, or of another kind of empirical risk functional).
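A minimal numerical sketch of this effect (an illustration, not the paper's model; all distributions and parameters here are hypothetical): a mixture of narrow Gaussians centered at the training points "memorizes" the sample, giving a high likelihood on the training data and a low likelihood on the control data, while widening the kernels acts as a crude regularization and shrinks the gap:

```python
import math
import random

def kde_log_density(x, centers, h):
    """Log-density of an equal-weight Gaussian mixture centered at the
    training points (computed via log-sum-exp for numerical stability)."""
    terms = [-((x - c) ** 2) / (2 * h * h) for c in centers]
    m = max(terms)
    s = sum(math.exp(t - m) for t in terms)
    return m + math.log(s) - math.log(len(centers)) - 0.5 * math.log(2 * math.pi * h * h)

def avg_log_likelihood(sample, centers, h):
    return sum(kde_log_density(x, centers, h) for x in sample) / len(sample)

random.seed(1)
train = [random.gauss(0.0, 1.0) for _ in range(10)]
control = [random.gauss(0.0, 1.0) for _ in range(10)]  # same generating distribution q

# Narrow kernels memorize the training sample: large train/control likelihood gap.
gap_narrow = avg_log_likelihood(train, train, 0.05) - avg_log_likelihood(control, train, 0.05)
# Wide kernels smooth the estimate (regularization): the gap shrinks.
gap_wide = avg_log_likelihood(train, train, 1.0) - avg_log_likelihood(control, train, 1.0)

assert gap_narrow > 0          # the narrow model overfits
assert gap_narrow > gap_wide   # regularization reduces overfitting
```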
Game theory
A game is a function of the strategies of several players; its value is a set of real numbers, one payoff for each player:
\(u=(u_1,\dots,u_n)\colon S_1\times\dots\times S_n\to\mathbb{R}^n\),
where \(S_i\) is the set of strategies of the \(i\)th player.
Example: two players playing a zero-sum game (the sum of the players' payoffs is zero, \(u_1+u_2=0\)).
A mixed strategy for the \(i\)th player is a probability distribution \(p_i(s_j^i)\) over this player's strategies; the index \(j\) enumerates the strategies of the \(i\)th player. The payoff of the \(i\)th player for a set of mixed strategies is the expected payoff
\(u_i(p_1,\dots,p_n)=\sum_{j_1,\dots,j_n} p_1(s_{j_1}^1)\cdots p_n(s_{j_n}^n)\,u_i(s_{j_1}^1,\dots,s_{j_n}^n)\).
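For two players the expected payoff reduces to a double sum over pure strategies. A short sketch (matching pennies is used here as a hypothetical example; it is not discussed in the paper):

```python
def expected_payoff(payoff, p1, p2):
    """Expected payoff of the row player under mixed strategies p1, p2:
    sum over pure-strategy pairs weighted by their probabilities."""
    return sum(
        p1[j] * p2[k] * payoff[j][k]
        for j in range(len(p1))
        for k in range(len(p2))
    )

# Matching pennies (zero-sum): row player's payoff matrix.
A = [[1, -1], [-1, 1]]

# Uniform mixed strategies give the row player expected payoff 0.
assert expected_payoff(A, [0.5, 0.5], [0.5, 0.5]) == 0.0
# Pure strategies are recovered as degenerate mixed strategies.
assert expected_payoff(A, [1.0, 0.0], [1.0, 0.0]) == 1.0
```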
Nash equilibrium: a set of strategies in which no player can increase their payoff by changing their own strategy while the other players keep their strategies fixed. A Nash equilibrium exists in the class of mixed strategies (it may not exist in the class of pure strategies).
Maximin: the largest payoff that a given player can guarantee without knowing the actions of the other players,
\(\underline{v_i}=\max_{p_i}\min_{p_{-i}} u_i(p_i,p_{-i})\),
where \(p_{-i}\) denotes the set of strategies of all players other than the \(i\)th.
Minimax: the smallest payoff that the other players can force on the \(i\)th player without knowing this player's action,
\(\overline{v_i}=\min_{p_{-i}}\max_{p_i} u_i(p_i,p_{-i})\).
The minimax is not less than the maximin, \(\overline{v_i}\ge \underline{v_i}\).
For two-player zero-sum games, the minimax coincides with the maximin (the von Neumann minimax theorem), and the corresponding mixed strategies form a Nash equilibrium.
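A sketch of the maximin/minimax inequality over pure strategies, again with matching pennies as a hypothetical example (the gap \(\overline{v}>\underline{v}\) shows that this game has no pure-strategy equilibrium; its value in mixed strategies is 0):

```python
def maximin(payoff):
    """Largest payoff the row player can guarantee with a pure strategy."""
    return max(min(row) for row in payoff)

def minimax(payoff):
    """Smallest payoff the column player can force on the row player."""
    columns = list(zip(*payoff))
    return min(max(col) for col in columns)

A = [[1, -1], [-1, 1]]  # matching pennies, row player's payoffs

assert maximin(A) == -1
assert minimax(A) == 1
assert minimax(A) >= maximin(A)  # the minimax is never less than the maximin
# The strict gap means no pure-strategy equilibrium exists here.
```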
Lotka–Volterra model
This model describes the population dynamics of two species (prey \(x\) and predators \(y\)) by a system of two ODEs,
\(\dot{x}=\alpha x-\beta xy\), \(\dot{y}=-\gamma y+\delta xy\),
with positive coefficients \(\alpha,\beta,\gamma,\delta\).
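The classical dynamics can be checked numerically. The following sketch (an illustration, not from the paper) integrates the system \(\dot{x}=\alpha x-\beta xy\), \(\dot{y}=-\gamma y+\delta xy\) with a fourth-order Runge–Kutta step for \(\alpha=\beta=\gamma=\delta=1\) and verifies the conserved quantity \(V=x-\log x+y-\log y\) of the classical model:

```python
import math

def lotka_volterra_step(x, y, dt, a=1.0, b=1.0, g=1.0, d=1.0):
    """One RK4 step for the system x' = a*x - b*x*y, y' = -g*y + d*x*y."""
    def f(x, y):
        return a * x - b * x * y, -g * y + d * x * y
    k1 = f(x, y)
    k2 = f(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
    k3 = f(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
    k4 = f(x + dt * k3[0], y + dt * k3[1])
    x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, y

def invariant(x, y):
    """V = x - log x + y - log y is conserved when a = b = g = d = 1."""
    return x - math.log(x) + y - math.log(y)

x, y = 2.0, 1.0
v0 = invariant(x, y)
for _ in range(10000):  # integrate to t = 10 with dt = 0.001
    x, y = lotka_volterra_step(x, y, 0.001)

assert x > 0 and y > 0                      # populations stay positive
assert abs(invariant(x, y) - v0) < 1e-6     # V is conserved along orbits
```

Conservation of \(V\) reflects the well-known fact that the orbits of the classical Lotka–Volterra system are closed curves (periodic oscillations of both populations).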
The Lotka–Volterra model with mutations generalizes the above model as follows. There are prey \(x_i\) and predators \(y_m\), and the equations of population dynamics have the form
For a bounded ecological niche, the first equation takes the form
Information geometry [9]–[14]
A statistical manifold is a manifold of parameters of a parametric probability distribution, or a manifold whose points are probability distributions on \(X\) (a smooth dependence of the distribution on parameters is assumed).
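For reference (a standard definition, stated here for completeness), the Fisher–Rao metric mentioned in the abstract is the Riemannian metric on such a manifold given in local coordinates \(\theta=(\theta^1,\dots,\theta^n)\) by

```latex
g_{ij}(\theta)
  = \int_X p(x,\theta)\,
    \frac{\partial \log p(x,\theta)}{\partial\theta^i}\,
    \frac{\partial \log p(x,\theta)}{\partial\theta^j}\,dx .
```

This metric also arises as the quadratic term in the expansion of the Kullback–Leibler divergence between nearby distributions of the family.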
The (nonsymmetric) Kullback–Leibler distance between two probability distributions \(p\) and \(q\) on \(X\) is defined as
\(D(p\,\|\,q)=\int_X p(x)\log\frac{p(x)}{q(x)}\,dx\).
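For discrete distributions the integral becomes a sum. A short sketch illustrating positivity and nonsymmetry (the example distributions are hypothetical):

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions,
    with the convention 0 * log(0/q) = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]

assert kl_divergence(p, p) == 0.0   # zero only when the distributions coincide
assert kl_divergence(p, q) > 0      # nonnegative (Gibbs' inequality)
# Nonsymmetric: D(p || q) != D(q || p) in general.
assert abs(kl_divergence(p, q) - kl_divergence(q, p)) > 1e-6
```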
Cite this article
Kozyrev, S.V. Lotka–Volterra model with mutations and generative adversarial networks. Theor Math Phys 218, 276–284 (2024). https://doi.org/10.1134/S0040577924020077