Solving trust region subproblems using Riemannian optimization

Mor, Uria; Shustin, Boris; Avron, Haim

doi:10.1007/s00211-023-01360-0

Published: 23 June 2023

Numerische Mathematik, volume 154, pages 1–33 (2023)

Abstract

The Trust Region Subproblem is a fundamental optimization problem that plays a pivotal role in Trust Region Methods. However, the problem, and variants of it, also arise in quite a few other applications. In this article, we present a family of iterative Riemannian optimization algorithms for a variant of the Trust Region Subproblem in which the inequality constraint is replaced with an equality constraint; the algorithms converge to a global optimum. Our approach uses either a trivial or a non-trivial Riemannian geometry of the search space, and requires only minimal spectral information about the quadratic component of the objective function. We further show how the theory of Riemannian optimization promotes a deeper understanding of the Trust Region Subproblem and its difficulties, e.g., a deep connection between the Trust Region Subproblem and the problem of finding affine eigenvectors, and a new examination of the so-called hard case in light of the condition number of the Riemannian Hessian operator at a global optimum. Finally, we propose to incorporate preconditioning via a careful selection of a variable Riemannian metric, and establish bounds on the asymptotic convergence rate in terms of how well the preconditioner approximates the input matrix.

Notes

Standard in the sense that it uses the most natural choice of retraction and Riemannian metric.
We caution that [22] does not use the term “affine eigenvalues”.
This statement is a simple corollary of Lemma 2.2 in [20].
In spite of the method’s simplicity, we are not aware of any descriptions of this method earlier than Phan et al.’s (relatively) recent work.

References

Absil, P.-A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton (2008)
Adachi, S., Iwata, S., Nakatsukasa, Y., Takeda, A.: Solving the trust-region subproblem by a generalized eigenvalue problem. SIAM J. Optim. 27(1), 269–291 (2017)
Bakshi, A., Chepurko, N., Jayaram, R.: Testing positive semi-definiteness via random submatrices. In: 61st Annual IEEE Symposium on Foundations of Computer Science (2020)
Beck, A., Vaisbourd, Y.: Globally solving the trust region subproblem using simple first-order methods. SIAM J. Optim. 28(3), 1951–1967 (2018)
Boumal, N.: An Introduction to Optimization on Smooth Manifolds. To appear with Cambridge University Press (2022)
Boumal, N., Voroninski, V., Bandeira, A.S.: Deterministic guarantees for Burer–Monteiro factorizations of smooth semidefinite programs. Commun. Pure Appl. Math. 73(3), 581–608 (2019)
Carmon, Y., Duchi, J.C.: Analysis of Krylov subspace solutions of regularized non-convex quadratic problems. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 31, pp. 10705–10715. Curran Associates Inc., Red Hook (2018)
Chambers, L.G., Fletcher, R.: Practical methods of optimization. Math. Gaz. 85(504), 562 (2001)
Chapelle, O., Sindhwani, V., Keerthi, S.: Branch and bound for semi-supervised support vector machines. In: Proceedings of the 19th International Conference on Neural Information Processing Systems, NIPS’06, pp. 217–224. MIT Press, Cambridge (2006)
Cucuringu, M., Tyagi, H.: Provably robust estimation of modulo 1 samples of a smooth function with applications to phase unwrapping. arXiv e-prints arXiv:1803.03669 (2018)
Edelman, A., Arias, T.A., Smith, S.T.: The geometry of algorithms with orthogonality constraints. SIAM J. Matrix Anal. Appl. 20(2), 303–353 (1998)
Gander, W., Golub, G.H., von Matt, U.: A constrained eigenvalue problem. Linear Algebra Appl. 114–115, 815–839 (1989). (Special Issue Dedicated to Alan J. Hoffman)
Golub, G.H., von Matt, U.: Quadratically constrained least squares and quadratic problems. Numer. Math. 59(1), 561–580 (1991)
Gould, N., Lucidi, S., Roma, M., Toint, P.: Solving the trust-region subproblem using the Lanczos method. SIAM J. Optim. 9(2), 504–525 (1999)
Hager, W.W.: Minimizing a quadratic over a sphere. SIAM J. Optim. 12(1), 188–208 (2001)
Han, I., Malioutov, D., Avron, H., Shin, J.: Approximating spectral sums of large-scale matrices using stochastic Chebyshev approximations. SIAM J. Sci. Comput. 39(4), A1558–A1585 (2017)
Joachims, T.: Transductive learning via spectral graph partitioning. In: Proceedings of the Twentieth International Conference on International Conference on Machine Learning, ICML’03, pp. 290–297. AAAI Press (2003)
Lucidi, S., Palagi, L., Roma, M.: On some properties of quadratic programs with a convex quadratic constraint. SIAM J. Optim. 8(1), 105–122 (1998)
Luenberger, D.G.: Linear and nonlinear programming. Math. Comput. Simul. 28(1), 78 (1986)
Martínez, J.M.: Local minimizers of quadratic functions on Euclidean balls and spheres. SIAM J. Optim. 4(1), 159–176 (1994)
Mishra, B., Sepulchre, R.: Riemannian preconditioning. SIAM J. Optim. 26(1), 635–660 (2016)
Moré, J.J., Sorensen, D.C.: Computing a trust region step. SIAM J. Sci. Stat. Comput. 4(3), 553–572 (1983)
Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. (2006)
Phan, A.-H., Yamagishi, M., Mandic, D., Cichocki, A.: Quadratic programming over ellipsoids with applications to constrained linear regression and tensor decomposition. Neural Comput. Appl. 32(11), 7097–7120 (2020)
Shustin, B., Avron, H.: Randomized Riemannian preconditioning for orthogonality constrained problems (2020)
Tropp, J.A., Yurtsever, A., Udell, M., Cevher, V.: Practical sketching algorithms for low-rank matrix approximation. SIAM J. Matrix Anal. Appl. 38(4), 1454–1485 (2017)
Vandereycken, B., Vandewalle, S.: A Riemannian optimization approach for computing low-rank solutions of Lyapunov equations. SIAM J. Matrix Anal. Appl. 31(5), 2553–2579 (2010)

Author information

Authors and Affiliations

School of Mathematical Sciences, Tel Aviv University, 6997801, Tel Aviv, Israel
Uria Mor, Boris Shustin & Haim Avron

Corresponding author

Correspondence to Haim Avron.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by Israel Science Foundation Grant 1272/17.

Appendix A: Constructing $\phi $

We now show a simple way to construct a smooth $\phi (\cdot )$ fulfilling both requirements stated in Items 1 and 2 in Sect. 6.

First, we choose some small parameter $\epsilon > 0$ and set $ d:= \lambda _{\min }(\textbf{M}) - \epsilon $. We construct $\phi (\alpha )$ to be a smoothed-out over-estimate of $ \max (\alpha , -d) $. We first construct an under-estimate. Let $\gamma >1$ be another parameter, and define

$$\begin{aligned} \varphi (\alpha )=\frac{\alpha +d}{2}\left( 1-\tanh \left( -\gamma \left( \alpha +d\right) \right) \right) -d. \end{aligned}$$

Then $ \varphi (\alpha ) $ is a smooth function that approximates $\max (\alpha , -d) $ from below: $\varphi (\alpha ) \le \max (\alpha , -d) $.
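One way to see the under-estimation property is the following short sketch. The shorthand $x:=\alpha +d$ and the logistic function $\sigma $ are introduced here only for convenience and are not part of the original construction. Using $\tanh (-t)=-\tanh (t)$ and $\frac{1}{2}(1+\tanh (t))=\sigma (2t)$,

$$\begin{aligned} \varphi (\alpha )+d=\frac{x}{2}\left( 1+\tanh \left( \gamma x\right) \right) =x\,\sigma (2\gamma x),\qquad \sigma (t):=\frac{1}{1+e^{-t}}\in (0,1). \end{aligned}$$

Since $\max (\alpha ,-d)+d=\max (x,0)$, for $x\ge 0$ we get $x\,\sigma (2\gamma x)\le x$, and for $x<0$ we get $x\,\sigma (2\gamma x)<0$. In both cases $\varphi (\alpha )\le \max (\alpha ,-d)$.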

While the difference between $ \max (\alpha , -d) $ and $\varphi (\alpha )$ shrinks significantly as $ \gamma $ grows, we want to make sure that our approximation is greater than or equal to $ \max (\alpha , -d) $. To that end, let

$$\begin{aligned} \alpha _0:= - \frac{\textrm{W}\left[ 0,e^{-1}\right] +1}{2\gamma } - d, \end{aligned}$$

where $\textrm{W}\left[ 0,\cdot \right] $ is the zero branch of the Lambert-$ \textrm{W}$ function. Now, set:

$$\begin{aligned} \phi (\alpha ):=\varphi (\alpha )-\varphi (\alpha _{0})-d. \end{aligned} \tag{A.1}$$
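A brief sketch of where $\alpha _0$ appears to come from (our reading of the construction, not a proof taken from the article): $\alpha _0$ is the unique stationary point of $\varphi $, and hence its global minimizer. Writing $u:=2\gamma (\alpha +d)$ and $\sigma (t):=1/(1+e^{-t})$, both introduced here only as shorthand, we have $\varphi (\alpha )+d=\frac{u}{2\gamma }\sigma (u)$ and

$$\begin{aligned} \varphi '(\alpha )=\sigma (u)\left( 1+\frac{u}{1+e^{u}}\right) =0 \iff 1+e^{u}+u=0 \iff -(u+1)\,e^{-(u+1)}=e^{-1}. \end{aligned}$$

The last equation says $-(u+1)=\textrm{W}\left[ 0,e^{-1}\right] $, that is, $u=-\left( \textrm{W}\left[ 0,e^{-1}\right] +1\right) $, which gives the stated $\alpha _0=u/(2\gamma )-d$. By (A.1), $\phi (\alpha _0)=-d$, so the shift lifts the curve until its lowest point sits exactly at $-d$.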

See Fig. 5 for a graphical illustration of $\phi $. It is possible to show that

$$\begin{aligned} 0 \le \phi (\alpha ) - \max (\alpha , -\lambda _{\min }(\textbf{M})) \le \frac{\textrm{W}\big [0,e^{-1}\big ]+1}{2\gamma } \big (1 - \tanh \big (\big (\textrm{W}\big [0,e^{-1}\big ]+1\big )/2\big )\big )+ \epsilon , \end{aligned} \tag{A.2}$$

so Item 1 holds, and the approximation error (Item 2) is small if $\epsilon $ is sufficiently small and $\gamma $ is sufficiently large. The proof of Eq. (A.2) is rather technical and does not convey any additional insight into the BTRS and its solution, so we omit it.
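The construction above can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: the names `lambert_w0` and `make_phi` are hypothetical, the value of $\lambda _{\min }(\textbf{M})$ is simply passed in as a number, and the Lambert-W value is computed with a small Newton iteration so that only the standard library is needed.

```python
import math


def lambert_w0(z, tol=1e-15, max_iter=100):
    """Principal-branch Lambert W for z >= 0 (hypothetical helper).

    Solves w * exp(w) = z with Newton's method.
    """
    w = math.log1p(z)  # starting guess
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def make_phi(lam_min, eps=1e-2, gamma=10.0):
    """Build phi from Eq. (A.1) and the right-hand side of Eq. (A.2).

    lam_min stands in for lambda_min(M); eps and gamma are the two
    parameters of the construction (the defaults are arbitrary choices).
    """
    d = lam_min - eps
    c = lambert_w0(math.exp(-1.0)) + 1.0  # W[0, e^{-1}] + 1
    alpha0 = -c / (2.0 * gamma) - d

    def varphi(alpha):
        # Smooth under-estimate of max(alpha, -d).
        x = alpha + d
        return 0.5 * x * (1.0 - math.tanh(-gamma * x)) - d

    varphi_at_alpha0 = varphi(alpha0)

    def phi(alpha):
        # Eq. (A.1): shift varphi so that it never falls below the max.
        return varphi(alpha) - varphi_at_alpha0 - d

    bound = c / (2.0 * gamma) * (1.0 - math.tanh(c / 2.0)) + eps
    return phi, bound


if __name__ == "__main__":
    lam_min = 1.5  # assumed value, for illustration only
    phi, bound = make_phi(lam_min)
    grid = [k / 400.0 for k in range(-4000, 4001)]
    gaps = [phi(a) - max(a, -lam_min) for a in grid]
    print("smallest gap:", min(gaps))
    print("largest gap: ", max(gaps))
    print("bound (A.2): ", bound)
```

On a grid of $\alpha $ values the printed gaps should stay between zero (up to rounding) and the bound, and both the bound and the observed gap shrink as `gamma` grows and `eps` decreases.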

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mor, U., Shustin, B. & Avron, H. Solving trust region subproblems using Riemannian optimization. Numer. Math. 154, 1–33 (2023). https://doi.org/10.1007/s00211-023-01360-0

Received: 07 February 2022
Revised: 16 March 2023
Accepted: 23 May 2023
Published: 23 June 2023
Issue Date: June 2023
DOI: https://doi.org/10.1007/s00211-023-01360-0
