Numerical stability and tensor nuclear norm

Numerische Mathematik

Abstract

We present a notion of bilinear stability, which is to numerical stability what bilinear complexity is to time complexity. In bilinear complexity, an algorithm for evaluating a bilinear operator \(\beta : {\mathbb {U}} \times {\mathbb {V}} \rightarrow {\mathbb {W}}\) is a decomposition \(\beta = \varphi _1 \otimes \psi _1 \otimes w_1 + \cdots + \varphi _r \otimes \psi _r \otimes w_r \); the number of terms \(r\) captures the speed of the algorithm; and its smallest possible value, i.e., the tensor rank of \(\beta \), quantifies the speed of a fastest algorithm. Bilinear stability introduces norms to the mix: the growth factor of the algorithm, \(\Vert \varphi _1 \Vert _* \Vert \psi _1 \Vert _* \Vert w_1 \Vert + \cdots + \Vert \varphi _r \Vert _* \Vert \psi _r \Vert _* \Vert w_r \Vert \), captures the accuracy of the algorithm; and its smallest possible value, i.e., the tensor nuclear norm of \(\beta \), quantifies the accuracy of a most stable algorithm. To substantiate this notion, we establish a bound for the forward error in terms of the growth factor and present numerical evidence comparing various fast algorithms for matrix and complex multiplication, showing that larger growth factors correlate with less accurate results. Compared to similar studies of numerical stability, bilinear stability is more general, applying to any bilinear operator and not just matrix or complex multiplication; is simpler, bounding the forward error in terms of a single (growth) factor; and is truly tensorial like bilinear complexity, invariant under any orthogonal change of coordinates. As an aside, we study a new algorithm for computing complex multiplication in terms of real multiplications, much like Gauss’s, but one that is optimally fast and stable in that it attains both the tensor rank and the nuclear norm.
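To make the two quantities in the abstract concrete, the following minimal sketch encodes a bilinear algorithm as a list of triples \((\varphi _i, \psi _i, w_i)\) and computes its number of terms and its growth factor for two classical ways of multiplying complex numbers: the regular four-multiplication formula and Gauss's three-multiplication formula. It is an illustration, not the authors' code. The names `evaluate`, `growth_factor`, `REGULAR`, and `GAUSS` are hypothetical, and the choice of the Euclidean norm on \({\mathbb {R}}^2\) (which is its own dual norm) is an assumption made here for simplicity.

```python
import math


def dot(x, y):
    """Standard inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def norm2(x):
    """Euclidean norm (assumed here; it coincides with its dual norm)."""
    return math.sqrt(dot(x, x))


def evaluate(terms, u, v):
    """Evaluate beta(u, v) = sum_i phi_i(u) * psi_i(v) * w_i."""
    out = [0.0] * len(terms[0][2])
    for phi, psi, w in terms:
        scalar = dot(phi, u) * dot(psi, v)  # one product of two linear forms
        for k, w_k in enumerate(w):
            out[k] += scalar * w_k
    return out


def growth_factor(terms):
    """Sum over terms of ||phi_i|| * ||psi_i|| * ||w_i||."""
    return sum(norm2(phi) * norm2(psi) * norm2(w) for phi, psi, w in terms)


# Complex numbers a + bi and c + di are stored as (a, b) and (c, d).
# Regular algorithm: (ac - bd) + (ad + bc) i, four real multiplications.
REGULAR = [
    ((1, 0), (1, 0), (1, 0)),    # a*c contributes to the real part
    ((0, 1), (0, 1), (-1, 0)),   # b*d is subtracted from the real part
    ((1, 0), (0, 1), (0, 1)),    # a*d contributes to the imaginary part
    ((0, 1), (1, 0), (0, 1)),    # b*c contributes to the imaginary part
]

# Gauss's algorithm: real = ac - bd, imag = (a + b)(c + d) - ac - bd.
GAUSS = [
    ((1, 0), (1, 0), (1, -1)),   # a*c: added to real, subtracted from imag
    ((0, 1), (0, 1), (-1, -1)),  # b*d: subtracted from both parts
    ((1, 1), (1, 1), (0, 1)),    # (a + b)(c + d): added to imag
]

if __name__ == "__main__":
    u, v = (1.0, 2.0), (3.0, 4.0)
    for name, algo in (("regular", REGULAR), ("Gauss", GAUSS)):
        print(name, len(algo), evaluate(algo, u, v), growth_factor(algo))
```

Under these assumptions, the regular algorithm has four terms and growth factor 4, while Gauss's has three terms and growth factor \(2 + 2\sqrt{2} \approx 4.83\): fewer multiplications at the price of a larger growth factor, which is the kind of trade-off between speed and accuracy that the abstract describes.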

Notes

  1. See [28, p. 37] and [29, p. 8], for example.

  2. Genuine examples to follow in Sects. 4 and 5.

References

  1. Aizenberg, I.: Complex-Valued Neural Networks with Multi-Valued Neurons. Studies in Computational Intelligence, vol. 353. Springer, Berlin (2011)

  2. Ballard, G., Benson, A.R., Druinsky, A., Lipshitz, B., Schwartz, O.: Improving the numerical stability of fast matrix multiplication. SIAM J. Matrix Anal. Appl. 37(4), 1382–1418 (2016)

  3. Bassey, J., Qian, L., Li, X.: A survey of complex-valued neural networks. arXiv:2101.12249 (2021)

  4. Bini, D., Lotti, G.: Stability of fast algorithms for matrix multiplication. Numer. Math. 36(1), 63–72 (1980)

  5. Bini, D., Lotti, G., Romani, F.: Approximate solutions for the bilinear form computational problem. SIAM J. Comput. 9(4), 692–697 (1980)

  6. Borodin, A., Munro, I.: The Computational Complexity of Algebraic and Numeric Problems. Elsevier Computer Science Library: Theory of Computation Series, No. 1. American Elsevier Publishing Co., Inc., New York-London-Amsterdam (1975)

  7. Brent, R.P.: Algorithms for matrix multiplication. March 1970. Report Stan-CS-70-157, Stanford University

  8. Bürgisser, P., Clausen, M., Shokrollahi, M.A.: Algebraic Complexity Theory. Grundlehren der Mathematischen Wissenschaften, vol. 315. Springer, Berlin (1997)

  9. Dash, B.N., Khare, N.: Deep complex neural network applications in remote sensing: an introductory review. In: Ranney, K.I., Raynal, A.M. (eds.) Radar Sensor Technology XXV. 11742, pp. 34–44. International Society for Optics and Photonics, SPIE, Bellingham (2021)

  10. Defant, A., Floret, K.: Tensor Norms and Operator Ideals. North-Holland Mathematics Studies, vol. 176. North-Holland Publishing Co., Amsterdam (1993)

  11. Derksen, H.: On the nuclear norm and the singular value decomposition of tensors. Found. Comput. Math. 16(3), 779–811 (2016)

  12. Diestel, J., Fourie, J.H., Swart, J.: The Metric Theory of Tensor Products. American Mathematical Society, Providence (2008)

  13. Fam, A.T.: Efficient complex matrix multiplication. IEEE Trans. Comput. 37(7), 877–879 (1988)

  14. Fawzi, A., Balog, M., Huang, A., Hubert, T., Romera-Paredes, B., Barekatain, M., Novikov, A., Ruiz, F.J.R., Schrittwieser, J., Swirszcz, G., Silver, D., Hassabis, D., Kohli, P.: Discovering faster matrix multiplication algorithms with reinforcement learning. Nature 610, 47–53 (2022)

  15. Friedland, S., Lim, L.-H.: Nuclear norm of higher-order tensors. Math. Comp. 87(311), 1255–1281 (2018)

  16. Higham, N.J.: Computing the polar decomposition–with applications. SIAM J. Sci. Stat. Comput. 7(4), 1160–1174 (1986)

  17. Higham, N.J.: Stability of a method for multiplying complex matrices with three real matrix multiplications. SIAM J. Matrix Anal. Appl. 13(3), 681–687 (1992)

  18. Higham, N.J.: The matrix sign decomposition and its relation to the polar decomposition. In: Proceedings of the 3rd ILAS Conference (Pensacola, FL, 1993) volume 212/213, pp. 3–20 (1994)

  19. Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. Society for Industrial and Applied Mathematics (SIAM), Philadelphia (2002)

  20. Higham, N.J.: Functions of Matrices. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA (2008)

  21. Karatsuba, A., Ofman, Y.: Multiplication of many-digital numbers by automatic computers. Dokl. Akad. Nauk SSSR 145, 293–294 (1962)

  22. Kenney, C., Laub, A.J.: On scaling Newton’s method for polar decomposition and the matrix sign function. SIAM J. Matrix Anal. Appl. 13(3), 698–706 (1992)

  23. Kljuev, V.V., Kokovkin-Ščerbak, N.I.: On the minimization of the number of arithmetic operations for solving linear algebraic systems of equations. Ž. Vyčisl. Mat i Mat. Fiz. 5, 21–33 (1965)

  24. Knuth, D.E.: The Art of Computer Programming, vol. 2, 3rd edn. Addison-Wesley, Reading (1998)

  25. Landsberg, J.M.: The border rank of the multiplication of \(2\times 2\) matrices is seven. J. Am. Math. Soc. 19(2), 447–459 (2006)

  26. Landsberg, J.M.: Geometry and Complexity Theory. Cambridge Studies in Advanced Mathematics, vol. 169. Cambridge University Press, Cambridge (2017)

  27. Lim, L.-H.: Tensors in computations. Acta Numer. 30, 555–764 (2021)

  28. Moore, C., Mertens, S.: The Nature of Computation. Oxford University Press, Oxford (2011)

  29. Rudich, S.: Complexity theory: from Gödel to Feynman. In: Computational Complexity Theory, volume 10 of IAS/Park City Math. Ser., pp. 5–87. Amer. Math. Soc., Providence, RI (2004)

  30. Ryan, R.A.: Introduction to Tensor Products of Banach Spaces. Springer Monographs in Mathematics. Springer, London (2002)

  31. Scardapane, S., Van Vaerenbergh, S., Hussain, A., Uncini, A.: Complex-valued neural networks with nonparametric activation functions. IEEE Trans. Emerg. Topics Comput. 4(2), 140–150 (2018)

  32. Strassen, V.: Gaussian elimination is not optimal. Numer. Math. 13, 354–356 (1969)

  33. Strassen, V.: Vermeidung von Divisionen. J. Reine Angew. Math. 264, 184–202 (1973)

  34. Strassen, V.: Relative bilinear complexity and matrix multiplication. J. Reine Angew. Math. 375/376, 406–443 (1987)

  35. Strassen, V.: Algebraic complexity theory. In: Handbook of Theoretical Computer Science, Vol. A. Elsevier, Amsterdam, pp. 633–672 (1990)

  36. Trabelsi, C., Bilaniuk, O., Zhang, Y., Serdyuk, D., Subramanian, S., Santos, J.F., Mehri, S., Rostamzadeh, N., Bengio, Y., Pal, C.J.: Deep complex networks. In: International Conference on Learning Representations (2018)

  37. Winograd, S.: On multiplication of \(2\times 2\) matrices. Linear Algebra Appl. 4, 381–388 (1971)

  38. Ye, K., Lim, L.-H.: Fast structured matrix computations: tensor rank and Cohn-Umans method. Found. Comput. Math. 18(1), 45–95 (2018)

  39. Zhang, H., Gu, M., Jiang, X., Thompson, J., Cai, H., Paesani, S., Santagati, R., Laing, A., Zhang, Y., Yung, M., et al.: An optical neural chip for implementing complex-valued neural network. Nat. Commun. 12(1), 1–11 (2021)

Acknowledgements

The authors would like to thank Nick Higham and Ke Ye for helpful discussions, the two anonymous reviewers for their very pertinent suggestions, and the University of Chicago’s Research Computing Center for its computing resources and services. ZD acknowledges the support of DARPA HR00112190040 and NSF ECCF 2216912. LHL acknowledges the support of DARPA HR00112190040, NSF DMS 1854831, and a Vannevar Bush Faculty Fellowship ONR N000142312863.

Author information

Corresponding author

Correspondence to Lek-Heng Lim.

About this article

Cite this article

Dai, Z., Lim, LH. Numerical stability and tensor nuclear norm. Numer. Math. 155, 345–376 (2023). https://doi.org/10.1007/s00211-023-01377-5

  • DOI: https://doi.org/10.1007/s00211-023-01377-5
