Abstract
We analyze the forward error in the floating point summation of real numbers, for computations in low precision or extreme-scale problem dimensions that push the limits of the precision. We present a systematic recurrence for a martingale on a computational tree, which leads to explicit and interpretable bounds with nonlinear terms controlled explicitly rather than by big-O terms. Two probability parameters strengthen the precision-awareness of our bounds: one parameter controls the first order terms in the summation error, while the second one is designed for controlling higher order terms in low precision or extreme-scale problem dimensions. Our systematic approach yields new deterministic and probabilistic error bounds for three classes of mono-precision algorithms: general summation, shifted general summation, and compensated (sequential) summation. Extension of our systematic error analysis to mixed-precision summation algorithms that allow any number of precisions yields the first probabilistic bounds for the mixed-precision FABsum algorithm. Numerical experiments illustrate that the probabilistic bounds are accurate, and that among the three classes of mono-precision algorithms, compensated summation is generally the most accurate. As for mixed precision algorithms, our recommendation is to minimize the magnitude of intermediate partial sums relative to the precision in which they are computed.
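As a concrete illustration of the compensated (sequential) summation discussed in the abstract, the following sketch contrasts plain recursive summation with Kahan's compensated variant [31]. It is a minimal double-precision illustration only, not the mixed-precision setting analyzed in the paper; the function names are ours.

```python
def recursive_sum(xs):
    """Plain sequential (recursive) summation: the error grows with n."""
    s = 0.0
    for x in xs:
        s += x
    return s

def kahan_sum(xs):
    """Compensated (Kahan) summation: a correction term c estimates the
    rounding error of each addition and feeds it back into the next one."""
    s, c = 0.0, 0.0
    for x in xs:
        y = x - c        # apply the previous step's correction
        t = s + y        # low-order bits of y may be lost here
        c = (t - s) - y  # recover the lost part (exactly zero in exact arithmetic)
        s = t
    return s
```

Summing ten copies of 0.1 already separates the two: recursive summation misses 1.0 by a rounding error, while the compensated sum recovers 1.0 exactly, consistent with the abstract's observation that compensated summation is generally the most accurate of the three mono-precision classes.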

Notes
For simplicity, the conditioning also includes those \(\delta _{\ell }\), \(1\le \ell \le k-1\), that are not descendants in the partial order. With stochastic rounding such \(\delta _\ell \) are fully independent of \(\delta _k\).
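The independence of the \(\delta_\ell\) under stochastic rounding can be illustrated on a simplified grid: round each value up with probability equal to its fractional part, so that every rounding is unbiased and successive roundings are independent draws. This is a hypothetical integer-grid sketch, not the floating-point stochastic rounding analyzed in [15].

```python
import math
import random

def stochastic_round(x):
    """Round x to a neighboring integer, upward with probability equal to
    its fractional part, so that E[stochastic_round(x)] = x and each
    rounding decision is an independent draw."""
    lo = math.floor(x)
    return lo + (random.random() < x - lo)
```

Averaging many independent roundings of 0.3 recovers 0.3, reflecting the mean-zero, mutually independent rounding errors that make the conditioning in this footnote harmless.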
If n does not admit an exact floating point representation, then we could append an additional vertex for the artificial ‘addition’ \(n+0\), to induce the rounding of n.
The dots do not refer to differentiation!
An especially alert reviewer discovered that the typo had already been found in March 2007, as noted in the earliest errata for [22], dated January 2011.
Although the quantities depend on n and \(\eta \), we omit the subscripts, and simply write \(S_k\) instead of \(S_{k, n, \eta }\).
Our simulation of half-precision ignores the range restriction realmax = 65504.
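Lower-precision arithmetic can be simulated in software by rounding every partial sum to the target format. As a simplified stand-in (using binary32 rather than binary16, since the standard library can round-trip through single precision), the idea can be sketched as follows; the function names are illustrative and this is not the simulation of [26].

```python
import struct

def to_single(x):
    """Round a Python double to the nearest binary32 value by packing and
    unpacking it as an IEEE single; values beyond single-precision range
    are not handled, just as our half-precision simulation ignores realmax."""
    return struct.unpack('f', struct.pack('f', x))[0]

def single_precision_sum(xs):
    """Sequential summation with every partial sum rounded to binary32."""
    s = 0.0
    for x in xs:
        s = to_single(s + x)
    return s
```

Small integers sum exactly, while sums of non-representable values such as 0.1 incur the larger unit roundoff of the simulated precision.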
References
Abdelfattah, A., Anzt, H., Boman, E.G., Carson, E., Cojean, T., Dongarra, J., Fox, A., Gates, M., Higham, N.J., Li, X.S., et al.: A survey of numerical linear algebra methods utilizing mixed-precision arithmetic. Int. J. High Perform. Comput. Appl. 35(4), 344–369 (2021)
Blanchard, P., Higham, N.J., Mary, T.: A class of fast and accurate summation algorithms. SIAM J. Sci. Comput. 42(3), A1541–A1557 (2020)
Chung, F., Lu, L.: Concentration inequalities and martingale inequalities: a survey. Internet Math. 3(1), 79–127 (2006)
Connolly, M.P., Higham, N.J., Mary, T.: Stochastic rounding and its probabilistic backward error analysis. SIAM J. Sci. Comput. 43(1), A566–A585 (2021)
Constantinides, G., Dahlqvist, F., Rakamaric, Z., Salvia, R.: Rigorous roundoff error analysis of probabilistic floating-point computations (2021). arXiv:2105.13217
Dahlqvist, F., Salvia, R., Constantinides, G.A.: A probabilistic approach to floating-point arithmetic (2019). arXiv:1912.00867
Demmel, J., Hida, Y.: Accurate and efficient floating point summation. SIAM J. Sci. Comput. 25(4), 1214–1248 (2003/04)
El Arar, E.M., Sohier, D., de Oliveira Castro, P., Petit, E.: Bounds on non-linear errors for variance computation with stochastic rounding (2023). arXiv:2304.05177
Goldberg, D.: What every computer scientist should know about floating-point arithmetic. ACM Comput. Surv. 23(1), 5–48 (1991)
Hallman, E.: A refined probabilistic error bound for sums (2021). arXiv:2104.06531
Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. SIAM, Philadelphia (2002)
Higham, N.J., Mary, T.: A new approach to probabilistic rounding error analysis. SIAM J. Sci. Comput. 41(5), A2815–A2835 (2019)
Higham, N.J., Mary, T.: Sharper probabilistic backward error analysis for basic linear algebra kernels with random data. SIAM J. Sci. Comput. 42(5), A3427–A3446 (2020)
Higham, N.J., Mary, T.: Mixed precision algorithms in numerical linear algebra. Acta Numer. 31, 347–414 (2022)
Higham, N.J., Pranesh, S.: Simulating low precision floating-point arithmetic. SIAM J. Sci. Comput. 41(5), C585–C602 (2019)
IEEE Computer Society: IEEE Standard for Floating-Point Arithmetic, IEEE Standard 754-2008 (2008). http://ieeexplore.ieee.org/document/4610935
Ipsen, I.C.F., Zhou, H.: Probabilistic error analysis for inner products. SIAM J. Matrix Anal. Appl. 41(4), 1726–1741 (2020)
Jeannerod, C.P., Rump, S.M.: Improved error bounds for inner products in floating-point arithmetic. SIAM J. Matrix Anal. Appl. 34(2), 338–344 (2013)
Jeannerod, C.P., Rump, S.M.: On relative errors of floating-point operations: optimal bounds and applications. Math. Comput. 87(310), 803–819 (2018)
Kahan, W.: Further remarks on reducing truncation errors. Commun. ACM 8(1), 40 (1965)
Kahan, W.: Implementation of algorithms (lecture notes by W. S. Haugeland and D. Hough). Tech. Rep. 20, Department of Computer Science, University of California, Berkeley, CA 94720 (1973)
Knuth, D.: The Art of Computer Programming, 3rd edn. Addison-Wesley, Reading, MA (1998)
Lange, M., Rump, S.: Sharp estimates for perturbation errors in summations. Math. Comput. 88(315), 349–368 (2019)
Lohar, D., Prokop, M., Darulova, E.: Sound probabilistic numerical error analysis. In: Intern. Conf. Integrated Formal Methods, pp. 322–340. Springer, Cham (2019)
Mitzenmacher, M., Upfal, E.: Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis. Cambridge University Press, Cambridge (2005)
Roch, S.: Modern discrete probability: an essential toolkit. University Lecture (2015)
Rump, S.M.: Error estimation of floating-point summation and dot product. BIT Numer. Math. 52(1), 201–220 (2012)
Acknowledgements
We are greatly indebted to Claude-Pierre Jeannerod for his many helpful suggestions that improved the paper, and to the two reviewers for their unusually careful and constructive reading of the paper. We also thank Johnathan Rhyne for helpful discussions.
Author information
Authors and Affiliations
Corresponding author
This research was supported in part by grants DMS-1745654 and DMS-1760374 from the National Science Foundation, and grant DE-SC0022085 from the Department of Energy.
A Proof of Lemma 8
Define \(\beta \equiv u(1+u)^2\) and
By assumption, \(\beta < 1\). Lemma 7 implies
where \(Z_2\le u\omega _2\) and \(C_2\le u(1+u)\omega _2\). For \(3\le k \le n\), define the vectors
From (A.3) follows the componentwise inequality
where \({\textbf {U}}\) is an upper shift matrix. Solving for \({\textbf {c}}_k\) gives another componentwise inequality with a unit upper triangular matrix \({\textbf {I}}-\beta {\textbf {U}}\),
and a bound
The bound for \(\Vert {\textbf {z}}_k\Vert _2\) follows from (A.2) and the definition of \(\beta \),
Finally, from \(Y_k = (1+u)C_{k-1}\) follows the Frobenius norm bound
where the higher order terms in \(\alpha \) follow from the Taylor series expansion \((1-\beta )^{-2}=1 +2u +\mathcal {O}(u^2)\),
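A short check of the quoted expansion, using only the definition of \(\beta\) stated at the beginning of this appendix:

\[
(1-\beta)^{-2} \;=\; \sum_{j\ge 0} (j+1)\,\beta^{j}
\;=\; 1 + 2\beta + \mathcal {O}(\beta^2)
\;=\; 1 + 2u + \mathcal {O}(u^2),
\]

since \(\beta = u(1+u)^2 = u + 2u^2 + u^3 = u + \mathcal {O}(u^2)\).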
About this article
Cite this article
Hallman, E., Ipsen, I.C.F. Precision-aware deterministic and probabilistic error bounds for floating point summation. Numer. Math. 155, 83–119 (2023). https://doi.org/10.1007/s00211-023-01370-y