Abstract
We present a notion of bilinear stability, which is to numerical stability what bilinear complexity is to time complexity. In bilinear complexity, an algorithm for evaluating a bilinear operator \(\beta : {\mathbb {U}} \times {\mathbb {V}} \rightarrow {\mathbb {W}}\) is a decomposition \(\beta = \varphi _1 \otimes \psi _1 \otimes w_1 + \dots + \varphi _r \otimes \psi _r \otimes w_r \); the number of terms r captures the speed of the algorithm; and its smallest possible value, i.e., the tensor rank of \(\beta \), quantifies the speed of a fastest algorithm. Bilinear stability introduces norms to the mix: the growth factor of the algorithm \(\Vert \varphi _1 \Vert _* \Vert \psi _1 \Vert _* \Vert w_1 \Vert + \cdots + \Vert \varphi _r \Vert _* \Vert \psi _r \Vert _* \Vert w_r \Vert \) captures the accuracy of the algorithm; and its smallest possible value, i.e., the tensor nuclear norm of \(\beta \), quantifies the accuracy of a most stable algorithm. To substantiate this notion, we establish a bound for the forward error in terms of the growth factor and present numerical evidence comparing various fast algorithms for matrix and complex multiplications, showing that larger growth factors correlate with less accurate results. Compared to similar studies of numerical stability, bilinear stability is more general, applying to any bilinear operator and not just matrix or complex multiplication; is simpler, bounding forward error in terms of a single (growth) factor; and is truly tensorial like bilinear complexity, invariant under any orthogonal change of coordinates. As an aside, we study a new algorithm that computes complex multiplication in terms of real multiplications, much like Gauss’s, but is optimally fast and stable in that it attains both the tensor rank and the nuclear norm.
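To make the two quantities in the abstract concrete, the following minimal sketch computes the number of terms r and the growth factor for two decompositions of complex multiplication, viewed as a bilinear operator \({\mathbb {R}}^2 \times {\mathbb {R}}^2 \rightarrow {\mathbb {R}}^2\): the textbook four-multiplication formula and Gauss's three-multiplication formula. It is an illustration under stated assumptions, not the authors' code: all norms are taken to be Euclidean (so the dual norm \(\Vert \cdot \Vert _*\) is again Euclidean), and the names `growth_factor`, `evaluate`, `standard`, and `gauss` are hypothetical.

```python
import numpy as np

# Each algorithm is a list of terms (phi, psi, w) such that
#   beta(u, v) = sum_i phi_i(u) * psi_i(v) * w_i,
# where u = (a, b) stands for a + bi and v = (c, d) stands for c + di.

# Textbook formula: (ac - bd) + (ad + bc) i, four real multiplications.
standard = [
    ((1, 0), (1, 0), (1, 0)),    # a*c contributes to the real part
    ((0, 1), (0, 1), (-1, 0)),   # b*d is subtracted from the real part
    ((1, 0), (0, 1), (0, 1)),    # a*d contributes to the imaginary part
    ((0, 1), (1, 0), (0, 1)),    # b*c contributes to the imaginary part
]

# Gauss's formula with k1 = c(a + b), k2 = a(d - c), k3 = b(c + d):
# real part = k1 - k3, imaginary part = k1 + k2, three real multiplications.
gauss = [
    ((1, 1), (1, 0), (1, 1)),    # k1 enters both real and imaginary parts
    ((1, 0), (-1, 1), (0, 1)),   # k2 enters the imaginary part
    ((0, 1), (1, 1), (-1, 0)),   # k3 is subtracted from the real part
]

def growth_factor(terms):
    """Sum of ||phi_i|| * ||psi_i|| * ||w_i|| (Euclidean norms assumed)."""
    return sum(
        np.linalg.norm(phi) * np.linalg.norm(psi) * np.linalg.norm(w)
        for phi, psi, w in terms
    )

def evaluate(terms, u, v):
    """Evaluate the bilinear operator at (u, v) using the given decomposition."""
    return sum(
        np.dot(phi, u) * np.dot(psi, v) * np.asarray(w, dtype=float)
        for phi, psi, w in terms
    )

if __name__ == "__main__":
    for name, terms in [("standard", standard), ("gauss", gauss)]:
        print(name, "terms:", len(terms), "growth factor:", growth_factor(terms))
```

Under these assumptions the textbook formula has four terms and growth factor 4, while Gauss's formula has three terms and growth factor \(2 + 2\sqrt{2} \approx 4.83\). The faster decomposition therefore has the larger growth factor, which is the kind of trade-off between speed and accuracy that the abstract describes.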
Acknowledgements
The authors would like to thank Nick Higham and Ke Ye for helpful discussions, the two anonymous reviewers for their very pertinent suggestions, and the University of Chicago’s Research Computing Center for its computing resources and services. ZD acknowledges the support of DARPA HR00112190040 and NSF ECCF 2216912. LHL acknowledges the support of DARPA HR00112190040, NSF DMS 1854831, and a Vannevar Bush Faculty Fellowship ONR N000142312863.
Cite this article
Dai, Z., Lim, LH. Numerical stability and tensor nuclear norm. Numer. Math. 155, 345–376 (2023). https://doi.org/10.1007/s00211-023-01377-5
DOI: https://doi.org/10.1007/s00211-023-01377-5