Avoiding breakdown in incomplete factorizations in low precision arithmetic

Online AM: 12 March 2024

Abstract

The emergence of low precision floating-point arithmetic in computer hardware has led to a resurgence of interest in the use of mixed precision numerical linear algebra. For linear systems of equations, there has been renewed enthusiasm for mixed precision variants of iterative refinement. We consider the iterative solution of large sparse systems using incomplete factorization preconditioners. The focus is on the robust computation of such preconditioners in half precision arithmetic and employing them to solve symmetric positive definite systems to higher precision accuracy; however, the proposed ideas can be applied more generally. Even for well-conditioned problems, incomplete factorizations can break down when small entries occur on the diagonal during the factorization. When using half precision arithmetic, overflows are an additional possible source of breakdown. We examine how breakdowns can be avoided and we implement our strategies within new half precision Fortran sparse incomplete Cholesky factorization software. Results are reported for a range of problems from practical applications. These demonstrate that, even for highly ill-conditioned problems, half precision preconditioners can potentially replace double precision preconditioners, although unsurprisingly this may be at the cost of additional iterations of a Krylov solver.
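
The two breakdown sources named in the abstract (overflow and small diagonal entries) can be illustrated with a small sketch. It is not the authors' Fortran software: it assumes dense storage for brevity, uses NumPy's `float16` to emulate half precision, and combines two common safeguards, symmetric diagonal scaling and a global diagonal shift that is doubled after each breakdown. The names `ic0_fp16`, `scaled_shifted_ic0`, `tau`, `alpha0` and `max_tries` are hypothetical.

```python
import numpy as np


def ic0_fp16(A, tau=1e-3):
    """Zero-fill incomplete Cholesky, IC(0), carried out in float16.

    Returns the lower-triangular factor, or None if breakdown is detected
    (a pivot below the hypothetical threshold tau, or a non-finite value
    caused by overflow).
    """
    n = A.shape[0]
    with np.errstate(over="ignore", invalid="ignore", divide="ignore"):
        # Casting to half precision can already overflow to inf.
        L = np.tril(A).astype(np.float16)
        pattern = L != 0  # fill-in outside this pattern is discarded
        for k in range(n):
            pivot = L[k, k]
            if not np.isfinite(pivot) or pivot < tau:
                return None  # small, negative, or overflowed pivot
            L[k, k] = np.sqrt(pivot)
            L[k + 1:, k] = L[k + 1:, k] / L[k, k]
            if not np.all(np.isfinite(L[k + 1:, k])):
                return None  # overflow in the scaled column
            for j in range(k + 1, n):
                if L[j, k] != 0:
                    update = L[j:, k] * L[j, k]
                    keep = pattern[j:, j]
                    L[j:, j] = np.where(keep, L[j:, j] - update, L[j:, j])
    return L


def scaled_shifted_ic0(A, alpha0=1e-3, max_tries=20):
    """Scale A to unit diagonal in double precision, then factorize
    As + alpha*I in float16, doubling alpha after each breakdown."""
    n = A.shape[0]
    d = 1.0 / np.sqrt(np.diag(A))
    As = A * np.outer(d, d)  # entries of a scaled SPD matrix lie in [-1, 1]
    alpha = 0.0
    for _ in range(max_tries):
        L = ic0_fp16(As + alpha * np.eye(n))
        if L is not None:
            return L, d, alpha
        alpha = alpha0 if alpha == 0.0 else 2.0 * alpha
    raise RuntimeError("no successful factorization within max_tries")


# A 1D Laplacian scaled by 1e6: its entries exceed the float16 range.
n = 50
A = 1e6 * (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
unscaled = ic0_fp16(A)            # breaks down because of overflow
L, d, alpha = scaled_shifted_ic0(A)

# A nearly singular SPD matrix: the second pivot cancels to zero in float16.
B = np.array([[1.0, 0.9999], [0.9999, 1.0]])
L_B, d_B, alpha_B = scaled_shifted_ic0(B)
```

In this sketch the scaling alone is enough for the Laplacian, whereas the second matrix needs a positive shift before the half precision factorization completes. The resulting factor would then be used as a preconditioner for a Krylov solver running in higher precision.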

        • Published in

          ACM Transactions on Mathematical Software Just Accepted
          ISSN:0098-3500
          EISSN:1557-7295

          Copyright © 2024 Copyright held by the owner/author(s).

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Online AM: 12 March 2024
          • Revised: 21 February 2024
          • Accepted: 21 February 2024
          • Received: 1 May 2023
          Qualifiers

          • research-article