Abstract
We analyze the consequences that the so-called turnpike property has on the longtime behavior of the value function corresponding to a finite-dimensional linear-quadratic optimal control problem with general terminal cost and constrained controls. We prove that, when the time horizon T tends to infinity, the value function asymptotically behaves as \(W(x) + c\, T + \lambda \), and we provide a control interpretation of each of these three terms, making clear the link with the turnpike property. As a by-product, we obtain the longtime behavior of the solution to the associated Hamilton–Jacobi–Bellman equation in a case where the Hamiltonian is not coercive in the momentum variable. As a result of independent interest, we show that linear-quadratic optimal control problems with constrained controls enjoy a turnpike property, in particular also when the steady optimum may saturate the control constraints.
Notes
Note that, as defined in (1.3), the value function V depends on the initial condition x and the time horizon T. We use the notation \(\nabla V\) for the derivative of V with respect to x, and \(\partial _{T}V\) for its derivative with respect to T. These derivatives are to be interpreted in the classical or in the viscosity sense, as specified in each case.
Indeed, consider the function \(f:\left[ 0,1\right] \longrightarrow {\mathbb {R}}\), defined as \(f\left( \delta \right) :=J_s\left( \left( \overline{u},\overline{y}\right) +\delta \left( u-\overline{u},y-\overline{y}\right) \right) \). Since \(\left( \overline{u},\overline{y}\right) \) minimizes \(J_s\), f achieves its minimum at \(\delta = 0\), whence \(f^{\prime }\left( 0\right) \ge 0\). By (1.4), \(f^{\prime }\left( 0\right) =\left( \overline{u},u -\overline{u}\right) _{{\mathbb {R}}^m}+\left( C \, \overline{y}-z,C\, \left( y-\overline{y}\right) \right) _{{\mathbb {R}}^n}\). The invertibility of A guarantees the existence of an adjoint state \(\overline{p}\) solving \(0=A^*\overline{p}+C^*(C\, \overline{y}-z)\). Since both pairs satisfy the steady state equation, we have \(A\left( y-\overline{y}\right) =-B\left( u-\overline{u}\right) \), and therefore \(\left( C \, \overline{y}-z,C\, \left( y-\overline{y}\right) \right) _{{\mathbb {R}}^n}=-\left( A^*\overline{p},y-\overline{y}\right) _{{\mathbb {R}}^n}=\left( \overline{p},B\left( u-\overline{u}\right) \right) _{{\mathbb {R}}^n}\). Then, we can rewrite \(f^{\prime }\left( 0\right) =\left( \overline{u} +B^*\overline{p},u-\overline{u}\right) _{{\mathbb {R}}^m}\), whence (recalling that \(f^{\prime }\left( 0\right) \ge 0\)) \(\left( \overline{p},B\left( u-\overline{u}\right) \right) _{{\mathbb {R}}^n}\ge -\left( \overline{u},u -\overline{u}\right) _{{\mathbb {R}}^m}\).
$$\begin{aligned} \Lambda ^{-1}=\begin{bmatrix} I_n+S\widehat{E}&{}-S\\ -\widehat{E}&{}I_n \end{bmatrix}. \end{aligned}$$
References
Abou-Kandil H, Freiling G, Ionescu V, Jank G (2012) Matrix Riccati equations in control and systems theory, systems and control: foundations and applications. Birkhäuser, Basel
Anderson BD, Moore JB (2007) Optimal control: linear quadratic methods. Courier Corporation, North Chelmsford
Angeli D, Amrit R, Rawlings JB (2011) On average performance and stability of economic model predictive control. IEEE Trans Autom Control 57:1615–1626
Arisawa M (1997) Ergodic problem for the Hamilton–Jacobi–Bellman equation. I. Existence of the ergodic attractor. In: Annales de l’Institut Henri Poincaré (C) Non Linear Analysis, vol 14, Elsevier, pp 415–438
Arisawa M (1998) Ergodic problem for the Hamilton–Jacobi–Bellman equation. II. In: Annales de l’Institut Henri Poincaré (C) Non Linear Analysis, vol 15, Elsevier, pp 1–24
Bardi M, Capuzzo-Dolcetta I (2008) Optimal control and viscosity solutions of Hamilton–Jacobi–Bellman equations. Springer, Berlin
Barles G, Ley O, Nguyen T-T, Phan TV (2019) Large time behavior of unbounded solutions of first-order Hamilton–Jacobi equations in \({\mathbb{R}}^N\). Asymptot Anal 112:1–22
Barles G, Roquejoffre J-M (2006) Ergodic type problems and large time behaviour of unbounded solutions of Hamilton–Jacobi equations. Commun Partial Differ Equ 31:1209–1225
Barles G, Souganidis PE (2000) On the large time behavior of solutions of Hamilton–Jacobi equations. SIAM J Math Anal 31:925–939
Bensoussan A, Da Prato G, Delfour M, Mitter S (2011) Representation and control of infinite dimensional systems, systems and control: foundations and applications. Birkhäuser, Boston
Bensoussan A, Frehse J, Yam SCP (2015) The master equation in mean field theory. J Mathématiques Pures Appliquées 103:1441–1474
Bensoussan A, Frehse J, Yam SCP (2017) On the interpretation of the master equation. Stochas Process Appl 127:2093–2137
Brammer RF (1972) Controllability in linear autonomous systems with positive controllers. SIAM J Control 10:339–353
Breiten T, Pfeiffer L (2020) On the turnpike property and the receding-horizon method for linear-quadratic optimal control problems. SIAM J Control Optim 58:1077–1102
Callier FM, Winkin J (1995) Convergence of the time-invariant Riccati differential equation towards its strong solution for stabilizable systems. J Math Anal Appl 192:230–257
Cannarsa P, Sinestrari C (2004) Semiconcave functions, Hamilton–Jacobi equations, and optimal control, vol 58. Springer, Berlin
Cardaliaguet P, Porretta A (2019) Long time behavior of the master equation in mean field game theory. Anal PDE 12:1397–1453
Crandall MG, Ishii H, Lions P-L (1992) User’s guide to viscosity solutions of second order partial differential equations. Bull Am Math Soc 27:1–67
Crandall MG, Lions P-L (1983) Viscosity solutions of Hamilton–Jacobi equations. Trans Am Math Soc 277:1–42
Esteve C, Geshkovski B, Pighin D, Zuazua E (2020) Turnpike in Lipschitz-nonlinear optimal control. arXiv:2011.11091
Evans LC (2010) Partial differential equations, vol 19. American Mathematical Soc, Providence
Faulwasser T, Kellett CM (2021) On continuous-time infinite horizon optimal control-dissipativity, stability, and transversality. Automatica 134:109907
Fujita Y, Ishii H, Loreti P (2006) Asymptotic solutions of Hamilton–Jacobi equations in Euclidean n space. Indiana Univ Math J 2006:1671–1700
Grüne L (2016) Approximation properties of receding horizon optimal control. Jahresber Deutsch Math Verein 118:3–37
Grüne L (2021) Dissipativity and optimal control. arXiv:2101.12606
Grüne L, Guglielmi R (2018) Turnpike properties and strict dissipativity for discrete time linear quadratic optimal control problems. SIAM J Control Optim 56:1282–1302
Grüne L, Guglielmi R (2021) On the relation between turnpike properties and dissipativity for continuous time linear quadratic optimal control problems. Math Control Rel Fields 11:169
Grüne L, Müller MA (2016) On the relation between strict dissipativity and turnpike properties. Syst Control Lett 90:45–53
Grüne L, Pannek J (2017) Nonlinear model predictive control. In: Nonlinear Model Predictive Control, Springer, pp 45–69
Ishii H (2006) Asymptotic solutions for large time of Hamilton–Jacobi equations. Int Congr Math 3:213–227
Ishii H (2008) Asymptotic solutions for large time of Hamilton–Jacobi equations in Euclidean \( n \) space. Ann l’IHP Analyse Non linéaire 25:231–266
Ishii H (2013) A short introduction to viscosity solutions and the large time behavior of solutions of Hamilton–Jacobi equations. In: Hamilton–Jacobi Equations: Approximations, Numerical Analysis and Applications, Springer, pp 111–249
Kouhkouh H (2018) Dynamic programming interpretation of turnpike and Hamilton–Jacobi–Bellman equation. Master thesis, Paris-Saclay University. http://bit.ly/2R7soRx
Kwakernaak H, Sivan R (1972) Linear optimal control systems, vol 1. Wiley, New York
Lee EB, Markus L (1967) Foundations of optimal control theory. Robert E. Krieger Publishing Company, Florida
Lions P-L (1982) Generalized solutions of Hamilton–Jacobi equations, vol 69. Pitman, London
Pighin D (2020) Nonuniqueness of minimizers for semilinear optimal control problems. arXiv:2002.04485
Pighin D (2020) The turnpike property in semilinear control. arXiv:2004.03269
Pighin D, Sakamoto N (2020) The turnpike with lack of observability. arXiv:2007.14081
Porretta A, Zuazua E (2013) Long time versus steady state optimal control. SIAM J Control Optim 51:4242–4273
Porretta A, Zuazua E (2016) Remarks on long time versus steady state optimal control. In: Mathematical paradigms of climate science, Springer, pp 67–89
Roquejoffre J-M (2001) Convergence to steady states or periodic solutions in a class of Hamilton–Jacobi equations. J Mathématiques pures et appliquées 80:85–104
Sakamoto N, van der Schaft AJ (2008) Analytical approximation methods for the stabilizing solution of the Hamilton–Jacobi equation. IEEE Trans Autom Control 53:2335–2350
Trélat E (2005) Contrôle optimal: théorie & applications, vol 36. Vuibert, Paris
Trélat E, Zhang C (2018) Integral and measure-turnpike properties for infinite-dimensional optimal control systems. Math Control Signals Syst 30:3
Trélat E, Zhang C, Zuazua E (2018) Steady-state and periodic exponential turnpike property for optimal control problems in Hilbert spaces. SIAM J Control Optim 56:1222–1252
Trélat E, Zuazua E (2015) The turnpike property in finite-dimensional nonlinear optimal control. J Differ Equ 258:81–114
Willems J (1971) Least squares stationary optimal control and the algebraic Riccati equation. IEEE Trans Autom Control 16:621–634
Zanon M, Faulwasser T (2018) Economic MPC without terminal constraints: gradient-correcting end penalties enforce asymptotic stability. J Process Control 63:1–14
Zanon M, Gros S, Diehl M (2016) A tracking MPC formulation that is locally equivalent to economic MPC. J Process Control 45:30–42
Acknowledgements
The authors are grateful to the referees for numerous remarks and suggestions which helped improve the first version of the manuscript.
This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant Agreement No. 694126-DyCon). The work of E.Z. is partially funded by the Alexander von Humboldt-Professorship program, the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie Grant Agreement No. 765579-ConFlex, the Grant MTM2017-92996-C2-1-R COSNET of MINECO (Spain), the Air Force Office of Scientific Research (AFOSR) under Award No. FA9550-18-1-0242, and the Transregio 154 Project “Mathematical Modelling, Simulation and Optimization Using the Example of Gas Networks” of the German DFG.
Appendices
Appendix A. Proof of the turnpike property
This appendix is devoted to the proof of the turnpike property stated in Theorem 1.2. The main difficulty is that we are considering the constrained control case. In addition, we do not assume that the steady optimal control \(\overline{u}\) lies in the interior of the control set U, an assumption which would make the proof much easier and would moreover allow us to prove turnpike with an exponential rate.
We start by proving the following crucial lemma, which is a direct consequence of [40, Remark 2.1]; we include its proof for the sake of completeness.
Lemma A.1
Assume (A, C) is detectable and take \(f\in L^2(0,T;{\mathbb {R}}^n)\). Then, there exists a constant \(K=K\left( A,C\right) \ge 0\), independent of T and f, such that for any \(T\ge 1\) and for any y solution to
we have
for any \(t\in [0,T]\).
Proof
In the present proof, K will denote a (sufficiently large) constant depending only on (A, C).
Step 1 Decomposition into stable and antistable part
Following the notation of [15], \({\mathscr {L}}^{-}(A)\) and \({\mathscr {L}}^{0+}(A)\) denote, respectively, the A-invariant subspaces of \({\mathbb {R}}^n\) spanned by the generalized eigenvectors of A corresponding to eigenvalues \(\lambda \) of A such that \(\text{ Re }(\lambda )<0\) and \(\text{ Re }(\lambda )\ge 0\). By linear algebra,
$$\begin{aligned} {\mathbb {R}}^n={\mathscr {L}}^{-}(A)\oplus {\mathscr {L}}^{0+}(A), \end{aligned}$$
where \(\oplus \) stands for the direct sum. Then, let y be a solution to (A.1). Denote by \(y_1\) and \(y_2\), respectively, the projections of y onto \({\mathscr {L}}^{-}(A)\) and \({\mathscr {L}}^{0+}(A)\). Then, \(y=y_1+y_2\) and, for \(i=1,2\),
where \(f_1\) and \(f_2\) stand for resp. the projection of f onto \({\mathscr {L}}^{-}(A)\) and \({\mathscr {L}}^{0+}(A)\).
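As a numerical illustration of the decomposition in Step 1, the following minimal sketch computes the projections onto \({\mathscr {L}}^{-}(A)\) and \({\mathscr {L}}^{0+}(A)\) for a small matrix. It assumes that A is diagonalizable, which is a simplification (in general, generalized eigenvectors require a Jordan or Schur-type computation). The function name `spectral_projections`, the test matrix and the vector `y` are hypothetical choices made for illustration only.

```python
import numpy as np

def spectral_projections(A):
    """Return (P_minus, P_plus), the projections onto the stable subspace
    (Re(lambda) < 0) and onto the complementary one (Re(lambda) >= 0).

    Assumption: A is diagonalizable, so eigenvectors span both subspaces.
    """
    eigvals, V = np.linalg.eig(A)
    V_inv = np.linalg.inv(V)
    stable = eigvals.real < 0
    # Sum of the rank-one spectral projectors attached to stable eigenvalues.
    P_minus = (V[:, stable] @ V_inv[stable, :]).real
    P_plus = np.eye(A.shape[0]) - P_minus
    return P_minus, P_plus

# Hypothetical example: eigenvalues -1 (stable) and 2 (antistable).
A = np.array([[-1.0, 1.0],
              [0.0, 2.0]])
P_minus, P_plus = spectral_projections(A)

y = np.array([2.0, 3.0])
y1, y2 = P_minus @ y, P_plus @ y   # y = y1 + y2, as in Step 1
```

Both projections commute with A, which is why each component \(y_i\) satisfies an equation of the same form as y, driven by the corresponding projection of f.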
Step 2 Estimate for the asymptotically stable part
We have
All the eigenvalues of \(L_A \restriction _{{\mathscr {L}}^{-}(A)}\) have strictly negative real part, where \(L_A\) denotes the linear operator associated with the matrix A. Then, we have, for any \(s\in [0,T]\)
the constant K depending only on A.
Step 3 An observability inequality in the time interval [0, 1]
To proceed with the antistable part, we shall first prove the existence of an observability constant \(K=K(A,C)\ge 0\), such that for any \(\tilde{y}\in H^1(0,T;{\mathscr {L}}^{0+}(A))\) solution to
we have
To that end, define
and
where \(\tilde{y}_x\) solves
Now, since (A, C) is detectable, all the modes in \({\mathscr {L}}^{0+}(A)\) are observable (see the definition of detectability in [15, bottom of page 232]). We deduce that both \(\left\| \cdot \right\| _{a}\) and \(\left\| \cdot \right\| _{b}\) are norms on the subspace \({\mathscr {L}}^{0+}(A)\). Since \({\mathscr {L}}^{0+}(A)\) is finite-dimensional, they are equivalent, whence (A.4) follows.
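The observability of the modes in \({\mathscr {L}}^{0+}(A)\) used in Step 3 can be checked numerically through the Hautus (PBH) rank test, a standard equivalent characterization of detectability. The sketch below is only illustrative: the helper `is_detectable`, its tolerance and the example matrices are assumptions, not objects taken from the paper.

```python
import numpy as np

def is_detectable(A, C, tol=1e-9):
    """Hautus (PBH) test: (A, C) is detectable if and only if
    rank [A - lam*I; C] = n for every eigenvalue lam of A with Re(lam) >= 0.

    Eigenvalues with real part within `tol` of zero are treated as
    antistable, which is the conservative choice.
    """
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        if lam.real >= -tol:
            M = np.vstack([A - lam * np.eye(n), C])
            if np.linalg.matrix_rank(M, tol=tol) < n:
                return False
    return True

# Hypothetical example with one stable mode (-1) and one unstable mode (+1).
A = np.diag([-1.0, 1.0])
C_sees_unstable = np.array([[0.0, 1.0]])   # observes the unstable mode
C_sees_stable = np.array([[1.0, 0.0]])     # observes only the stable mode

print(is_detectable(A, C_sees_unstable), is_detectable(A, C_sees_stable))
```

Only the first pair passes the test: the unstable mode must be visible through C, while the stable one need not be.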
Step 4 Estimate for the antistable part
By definition
Consider an arbitrary interval \([a,b]\subset [0,T]\), with length \(\left| b-a\right| =1\). By step 3, we have
On the one hand, due to the arbitrariness of [a, b], this yields
On the other hand,
Then, for any \(t\in [0,T]\), we have
where in the last inequality we have employed (A.3).
Step 5 Conclusion
Putting together (A.3) and (A.5), we conclude. \(\square \)
Remark A.2
Observe that, assuming that (A, C) is detectable, we have, for some \(\beta = \beta (A,C)>0\), the inequality
This is a consequence of inequality (A.2) applied to the trajectory \(\tilde{y}(t):=ty_s\) (see [40]). The steady inequality (A.6) yields strict convexity of \(J_s\) and hence uniqueness of the minimizer for the stationary optimal control problem (1.4).
Remark A.3
In Definition 1.1, using \(u-u_1\in L^1(0,+\infty ;{\mathbb {R}}^m)\) and \(y-y_1\in L^1(0,+\infty ;{\mathbb {R}}^n)\) together with \(\frac{d}{ds}\left( y\left( s\right) -y_1\left( s\right) \right) =A\left( y\left( s\right) -y_1\left( s\right) \right) +B\left( u\left( s\right) -u_1\left( s\right) \right) \) for \(s\in (0,+\infty )\), we see that the solution y stabilizes toward \(y_1\), i.e., \(y\left( t\right) -y_1\left( t\right) \underset{t\rightarrow +\infty }{\longrightarrow }0\).
Note also that U-stabilizability follows from exact controllability under the control constraint \(u\left( t\right) \in U\) (see, e.g., [13, 35, 44]). In case \(U={\mathbb {R}}^m\), the U-stabilizability is equivalent to (unconstrained) exponential stabilizability of (A, B) [10, Remark 2.2 page 24].
We now prove the following result, which provides an upper bound on \(\left\| y_{_{T}}\right\| \) that is uniform in T, and also gives the inequality (1.9) from Theorem 1.2.
Lemma A.4
There exists \(K=K(A,B,C,U,x,z,g)\) such that, for any \(T\ge 1\) and for every \(t\in [0,T]\), we have
and
Proof of Lemma A.4
By Lemma A.1 applied to \(y-\overline{y}\), we have
whence
where \(\alpha = \alpha (A,C)>0\) and \(K=K(A,B,C,x,z)\ge 0\). Using the above inequality, together with Lemma 2.2 (inequality (2.3)), yields
Now, by U-stabilizability, there exists a control \(\hat{u}\in L^2(0,+\infty ;U)\), such that
where \(\hat{y}\) is the solution to (1.1), with initial datum x and control \(\hat{u}\). Therefore
where \(K=K(A,B,C,U,x,z,g)\). Hence, putting together (A.9) and (A.10), we get
i.e., the desired boundedness for \(y_{_{T}}\).
Moreover, by inequality (2.3), the boundedness from below of g together with (A.10), we have
Then, inequality (A.8) follows from an application of Lemma A.1. This finishes the proof. \(\square \)
We now prove the validity of the turnpike property.
Proof of Theorem 1.2
Inequality (1.9) has already been proved in Lemma A.4. It remains to prove (1.8). Throughout this proof, K will always denote a (sufficiently large) constant depending only on A, B, C, U, x, z and g.
By (1.9), for any \(\eta \in (0,1)\), there exists \(\zeta =\zeta (A,B,C,U,x,z,g,\eta )>0\) such that, for all \(T>\zeta \), we have
and
By the integral mean value theorem, for any \(T\ge 1+2\zeta \), there exist \(t_{T,1}\in [0,\zeta ]\) and \(t_{T,2}\in \left[ T-\zeta ,T\right] \), such that
and
By U-stabilizability, there exists a control \(\tilde{u}\in L^2(t_{T,1},+\infty ;U)\), such that
and its associated trajectory \(\tilde{y}\), solution to (1.1) with initial condition \(y_{_T}(t_{T,1})\) satisfies
with estimates
the constant \(\gamma \) depending only on \(\left( A,B,U\right) \). Set
By using (A.13) and (A.14), we get
with \(\gamma =\gamma \left( A,B,U\right) \).
Now, let us define the functional
defined for any \(u\in L^2\left( t_{T,1},t_{T,2};U\right) \), where \(y(\cdot )\) is the solution to (1.1) with control u and initial condition \(y(t_{T,1}) = y_{_T} (t_{T,1})\).
Let us estimate from above the following quantity
By adapting the techniques of Lemma 2.2 to the functional Q in (A.16), we obtain the analogous version of (2.3), which reads as
Now, using the definition of \(J_{T,x}\) and Q, along with the fact that \(u_{_T}\) minimizes \(J_{T,x}\), we deduce
Using the definition of Q in (A.16) and the choice of the control \(\hat{u}\) in (A.14), we get
Now, noting that \(u_{_T}\) and \(\hat{u}\) coincide in the interval \((t_{T,2} , T)\), and that \(T-t_{T,2} \le \zeta \), we can use Gronwall’s inequality to estimate
where C is independent of T. Moreover, using the local Lipschitz continuity of g and (A.7), we obtain
Then, from the estimate (A.17), an analogous version of the identity (2.2) for the functional Q, combined with (A.17), (A.13) and (A.15), yields
where K is independent of T.
Then, by Lemma A.1, for any \(s\in \left[ t_{T,1},t_{T,2}\right] \),
Finally, for any \(\varepsilon >0\), setting \(\eta =\frac{\varepsilon ^2}{K}\) and \(\tau (A,B,C,U,x,z,g,\varepsilon ) =\zeta (A,B,C,U,x,z,g,\eta )\), we get the thesis. \(\square \)
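To visualize the statement just proved, the following sketch computes the optimal trajectory in a scalar, unconstrained example and measures its distance from the steady optimum at the midpoint of the horizon. Everything in it is an assumption made for illustration: the data `a, b, c, z, x, T`, the choices \(U={\mathbb {R}}\) and \(g=0\), and the running cost \(\frac{1}{2}\left( u^2+(cy-z)^2\right) \), which matches the steady functional used in the Notes. In this unconstrained setting the optimality system is linear, so the trajectory is obtained by shooting on the Hamiltonian flow rather than by the arguments of the proof above.

```python
import math

# Hypothetical scalar data: dy/dt = a*y + b*u, y(0) = x, horizon T,
# running cost 0.5*(u^2 + (c*y - z)^2), terminal cost g = 0, U = R.
a, b, c, z = 1.0, 1.0, 1.0, 1.0
x, T = 0.0, 20.0

mu = math.sqrt(a**2 + (b * c)**2)

# Steady optimum from 0 = a*y_bar - b^2*p_bar and 0 = a*p_bar + c*(c*y_bar - z).
p_bar = a * c * z / mu**2
y_bar = b**2 * c * z / mu**2
u_bar = -b * p_bar

def flow(t):
    """Entries (m11, m12, m21, m22) of exp(t*H), H = [[a, -b^2], [-c^2, -a]].

    Since H^2 = mu^2 * I, exp(t*H) = cosh(mu*t)*I + (sinh(mu*t)/mu)*H.
    """
    ch, sh = math.cosh(mu * t), math.sinh(mu * t) / mu
    return ch + sh * a, -sh * b**2, -sh * c**2, ch - sh * a

# The deviations (y - y_bar, p - p_bar) follow the linear flow of H.
# Shooting: choose p(0) so that the terminal condition p(T) = 0 holds.
y0_dev = x - y_bar
_, _, m21, m22 = flow(T)
p0_dev = (-p_bar - m21 * y0_dev) / m22

def optimal_state(t):
    """Optimal state at time t in [0, T] for the data above."""
    n11, n12, _, _ = flow(t)
    return y_bar + n11 * y0_dev + n12 * p0_dev

mid_gap = abs(optimal_state(T / 2) - y_bar)
print(f"steady state {y_bar:.3f}, distance at T/2: {mid_gap:.2e}")
```

The trajectory starts at distance \(|x-\overline{y}|\) from the steady state and is very close to it at \(t=T/2\). In this unconstrained example the convergence is exponential, whereas the theorem above does not provide a rate in the constrained case.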
Appendix B. Riccati theory and proof of Proposition 1.5
Although the proofs of our main results (Theorems 1.2 and 1.3) do not rely on the use of the classical Riccati theory, which is not applicable to our case due to the constraints on the control, we note that in the unconstrained case \(U={\mathbb {R}}^m\), we may use the Riccati theory to obtain the value function W(x) explicitly as a positive definite quadratic form. We recall that, following Theorem 1.3, the value function W(x) is the limiting profile of the asymptotic decomposition of the value function V(T, x).
The proof of Proposition 1.5 is based on the following well-known lemma, concerning the properties of the algebraic Riccati equation, and the corresponding Hamiltonian matrix
$$\begin{aligned} \text{ Ham }:=\begin{bmatrix} A&{}-BB^*\\ -C^*C&{}-A^* \end{bmatrix}. \end{aligned}$$
Note that \(\text{ Ham }\) is the matrix associated with the optimality system (B.5).
Lemma B.1
Assume (A, B) is stabilizable and (A, C) is detectable. Then,
(1) there exists a unique symmetric positive semidefinite solution to the Algebraic Riccati Equation
$$\begin{aligned} -\widehat{E}A-A^*\widehat{E}+\widehat{E}BB^*\widehat{E}=C^*C \qquad \text{(ARE) } \end{aligned} \tag{B.1}$$
such that \(A-BB^*\widehat{E}\) is stable, i.e., the real part of the spectrum \(\text{ Re }(\sigma (A-BB^*\widehat{E}))\subset (-\infty ,0)\);
(2) set
$$\begin{aligned} \Lambda :=\begin{bmatrix} I_n&{}S\\ \widehat{E}&{}\widehat{E}S+I_n \end{bmatrix}, \end{aligned} \tag{B.2}$$
where S is the solution to the Lyapunov equation
$$\begin{aligned} S(A-BB^*\widehat{E})^*+(A-BB^*\widehat{E})S=BB^*. \end{aligned}$$
Then, \(\Lambda \) is invertible and
$$\begin{aligned} \Lambda ^{-1}\text{ Ham } \ \Lambda =\begin{bmatrix} A-BB^*\widehat{E}&{}0\\ 0&{}-(A-BB^*\widehat{E})^* \end{bmatrix}. \end{aligned}$$
As a consequence, \(\text{ Ham }\) is invertible and its spectrum does not intersect the imaginary axis.
The first part of the above lemma is Riccati theory (see, for instance, [15, Fact 1-(a) and Fact 1-(f)] or [1]). The second part (see footnote 3) is taken from [43, subsection III.B]. We are now ready to prove Proposition 1.5.
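The statements of the lemma can be checked numerically on a small example. The sketch below is a sanity check under stated assumptions, not the method of [43]: the test system (a double integrator observed through its first component) is hypothetical, the Hamiltonian matrix is taken to be \(\begin{bmatrix} A&{}-BB^*\\ -C^*C&{}-A^* \end{bmatrix}\) (the form consistent with (ARE) and the block diagonalization), the stabilizing solution \(\widehat{E}\) is computed from the stable invariant subspace of that matrix, and the Lyapunov equation is solved by vectorization.

```python
import numpy as np

# Hypothetical test system: double integrator observed through its position.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
n = A.shape[0]
I = np.eye(n)

# Hamiltonian matrix (assumed form, consistent with (ARE) and the lemma).
Ham = np.block([[A, -B @ B.T], [-C.T @ C, -A.T]])

# Stabilizing ARE solution from the stable invariant subspace of Ham:
# if the columns of [X1; X2] span that subspace, then E = X2 X1^{-1}.
eigvals, V = np.linalg.eig(Ham)
X = V[:, eigvals.real < 0]
E = (X[n:, :] @ np.linalg.inv(X[:n, :])).real

A_E = A - B @ B.T @ E   # closed-loop matrix, expected to be stable

# Lyapunov equation A_E S + S A_E^T = B B^T, via row-major vectorization:
# vec(A_E S) = kron(A_E, I) vec(S) and vec(S A_E^T) = kron(I, A_E) vec(S).
L = np.kron(A_E, I) + np.kron(I, A_E)
S = np.linalg.solve(L, (B @ B.T).flatten()).reshape(n, n)

Lam = np.block([[I, S], [E, E @ S + I]])
Lam_inv = np.block([[I + S @ E, -S], [-E, I]])   # explicit inverse
D = Lam_inv @ Ham @ Lam   # expected: block diagonal with A_E and -A_E^T

print(np.round(E, 4))
print(np.round(D, 4))
```

For this system, the off-diagonal blocks of `D` vanish up to rounding, and the spectrum of `Ham` is the union of the spectra of \(A-BB^*\widehat{E}\) and \(-(A-BB^*\widehat{E})^*\), so it stays away from the imaginary axis.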
Proof of Proposition 1.5
First of all, let us show that the minimization of \(J_{\infty ,x}\) is equivalent to the minimization of
where y is the solution to (1.1), with initial datum x and control u. Let us show this by proceeding as in the proof of (2.3), and concluding with Lemma 2.4. We first consider the finite horizon cost functional with final cost \(g=0\), that is
Hence, one has
Then we focus on the term
We recall that the pair \((\overline{u},\overline{y})\) satisfies the steady optimality system which reads as
with \(\overline{u}=-B^* \overline{p}\). On the other hand, the pairs \((u(\cdot ),y(\cdot ))\) and \((\overline{u},\overline{y})\) satisfy the equation in (1.1). Hence, we have
Then, using (B.5) and (B.6) and taking into account that \(y(0)=x\) and \(\overline{u}=-B^* \overline{p}\), we can compute the term (B.4) as follows:
Finally, the conclusion follows by combining (B.3) and (B.7) and then letting \(T\rightarrow +\infty \) since from Lemma 2.4 one has \(u-\overline{u}\in L^{2}(0,+\infty ;{\mathbb {R}}^{m})\), \(y-\overline{y}\in L^{2}(0,+\infty ;{\mathbb {R}}^{n})\) and \(y(T) \rightarrow \overline{y}\).
By [34, Theorem 3.7 pages 237–238], there exists a unique minimizer \(u^{*}\) for \(\widehat{J}_{\infty ,x}\), given by (1.17) and
whence, by (1.10) and (2.15) which is now an equality,
as desired. \(\square \)
Esteve, C., Kouhkouh, H., Pighin, D. et al. The turnpike property and the longtime behavior of the Hamilton–Jacobi–Bellman equation for finite-dimensional LQ control problems. Math. Control Signals Syst. 34, 819–853 (2022). https://doi.org/10.1007/s00498-022-00325-2
DOI: https://doi.org/10.1007/s00498-022-00325-2
Keywords
- Optimal control problems
- Longtime behavior
- The turnpike property
- Hamilton–Jacobi–Bellman equations
- Linear-quadratic