Correction to: Numer. Math. (2017) 135:1207–1220 https://doi.org/10.1007/s00211-016-0829-7

1 Introduction

In [1], the influence of a random ordering of the equations of a linear system \(By=b\) on upper bounds for the convergence speed of the classical successive over-relaxation (SOR) method

$$\begin{aligned} y^{(k+1)}=y^{(k)}+\omega (D+\omega L)^{-1}(b-By^{(k)}), \qquad k=0,1,\ldots , \end{aligned}$$
(1)

was studied. For simplicity, it is assumed that the system is consistent with solution y and that B is a complex \(n\times n\) Hermitian positive semi-definite matrix with positive diagonal part D and strictly lower triangular part L. Two strategies involving permutations of the system were considered. For the so-called shuffled SOR iteration, in each step the SOR update formula (1) is applied to an independently and uniformly at random chosen permutation of the linear system, while for the preshuffled SOR iteration the iteration (1) is performed for \(k=0,1,\ldots \) after a single permutation of the system at the beginning. To study the convergence properties of such iterations, 2-norm estimates involving the lower triangular part \(L_\sigma \) of the permuted matrix \(B_\sigma =P_\sigma B P_\sigma ^*\) played a crucial role, where \(P_\sigma \) denotes the permutation matrix associated with the permutation \(\sigma \), acting according to \((P_\sigma y)_i=y_{\sigma _i}\), \(i=1,\ldots ,n\).
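
For illustration, here is a minimal NumPy sketch of the shuffled SOR iteration as described above. The test matrix, right-hand side, relaxation parameter \(\omega \) and number of steps are arbitrary choices for demonstration, not taken from [1].

```python
import numpy as np

def shuffled_sor(B, b, omega=1.0, num_steps=100, seed=None):
    """Sketch of the shuffled SOR iteration: in each step the update (1) is
    applied to an independently, uniformly at random permuted copy of B y = b."""
    rng = np.random.default_rng(seed)
    n = B.shape[0]
    y = np.zeros(n, dtype=complex)
    for _ in range(num_steps):
        sigma = rng.permutation(n)           # random permutation sigma
        P = np.eye(n)[sigma, :]              # (P_sigma y)_i = y_{sigma_i}
        B_s = P @ B @ P.conj().T             # permuted matrix B_sigma
        D_s = np.diag(np.diag(B_s))          # diagonal part of B_sigma
        L_s = np.tril(B_s, -1)               # strictly lower triangular part L_sigma
        z = P @ y                            # permuted iterate
        z = z + omega * np.linalg.solve(D_s + omega * L_s, P @ b - B_s @ z)
        y = P.conj().T @ z                   # undo the permutation
    return y

# toy example (illustrative data): Hermitian positive definite system
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
B = A @ A.conj().T + np.eye(5)
b = rng.standard_normal(5)
y = shuffled_sor(B, b, omega=1.0, num_steps=200, seed=1)
print(np.linalg.norm(B @ y - b))             # residual; should be small
```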

The necessary properties were formulated as Theorems 2 and 3 in [1]. Unfortunately, the proof of Theorem 2 uses an incorrect formula for \(P_\sigma ^*L_\sigma L_\sigma ^*P_\sigma \). This was pointed out by T. Yilmaz in [2]. In the next section, a proof of Theorem 2 based on the correct formula is provided, including a slight improvement of the involved constants.

Even though the proof of Theorem 3 in [1] is correct, we take the opportunity to give new estimates for the absolute constant in the inequality

$$\begin{aligned} \inf _\sigma \Vert L_\sigma \Vert \le C \Vert B\Vert . \end{aligned}$$
(2)

For the class of all Hermitean matrices (not necessarily positive semi-definite) we show that taking \(C=245\) is feasible, which considerably improves the value \(C=C_2=2905\) stated in [1]. For positive semi-definite B, one can even take \(C=122.3\). These bounds are consequences of recent quantitative improvements of the Anderson paving conjecture and will be shown in Sect. 3. The derivation in [1] that (2) holds for positive semi-definite B with unit diagonal with the smaller value \(C=C_1=32.42\) is based on a flawed application of earlier results on the size of one-sided pavings and thus is not correct. It remains an interesting open question to find more precise bounds for the constant C in (2). For the class of positive semi-definite B with unit diagonal, I conjecture that \(C=2/\pi \) is the best possible choice.

2 Correct statement and proof of Theorem 2 from [1]

Theorem 2 in [1] concerns the 2-norm estimate of the matrix

$$\begin{aligned} E:= \frac{1}{n!} \sum _{\sigma } P_\sigma ^*L_\sigma L_\sigma ^*P_\sigma , \end{aligned}$$
(3)

which plays a crucial role in the estimates of the expected squared error of the shuffled SOR iteration, see Theorem 4 a) there. As was mentioned above, its proof uses an incorrect formula for the entries of \(P_\sigma ^*L_\sigma L_\sigma ^*P_\sigma \). The correct formula [2] is

$$\begin{aligned} (P_\sigma ^*L_\sigma L_\sigma ^*P_\sigma )_{s,t} =\sum _{k=1}^{\min (\sigma ^{-1}_s,\sigma ^{-1}_t)-1} H_{s,\sigma _k}H_{\sigma _k,t}, \end{aligned}$$
(4)

where \(H_{i,j}\) are the entries of the Hermitean matrix \(H=B-D\), the non-diagonal part of B, and \(\sigma ^{-1}\) is the permutation inverse to \(\sigma \).

To see (4), recall that

$$\begin{aligned} (L_\sigma )_{i,k}=\left\{ \begin{array}{ll} H_{\sigma _i,\sigma _k},&{}\;k<i,\\ 0,&{}\;k\ge i,\end{array}\right. \qquad (L^*_\sigma )_{k,j}=(L_\sigma )_{j,k}^*=\left\{ \begin{array}{ll} H_{\sigma _k,\sigma _j},&{}\;k<j,\\ 0,&{}\;k\ge j.\end{array}\right. \end{aligned}$$

Consequently,

$$\begin{aligned} (L_\sigma L^*_\sigma )_{i,j} = \sum _{k=1}^{\min (i,j)-1} H_{\sigma _i,\sigma _k}H_{\sigma _k,\sigma _j} \end{aligned}$$

and, by setting \(i=\sigma ^{-1}_s\), \(j=\sigma ^{-1}_t\), we arrive at (4).
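
Formula (4) is easy to check numerically. The following sketch compares both sides entrywise for a randomly generated Hermitian test matrix and a random permutation (all test data are arbitrary); indices are shifted to the 0-based convention of NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = (A + A.conj().T) / 2                     # Hermitian test matrix
H = B - np.diag(np.diag(B))                  # non-diagonal part H = B - D

sigma = rng.permutation(n)                   # random permutation (0-based)
inv = np.argsort(sigma)                      # inverse permutation sigma^{-1}
P = np.eye(n)[sigma, :]                      # (P_sigma y)_i = y_{sigma_i}

L_s = np.tril(P @ B @ P.conj().T, -1)        # strictly lower triangular part of B_sigma
lhs = P.conj().T @ L_s @ L_s.conj().T @ P    # left-hand side of (4)

# right-hand side of (4): sum over k < min(inv[s], inv[t]) in 0-based indexing
rhs = np.array([[sum(H[s, sigma[k]] * H[sigma[k], t]
                     for k in range(min(inv[s], inv[t])))
                 for t in range(n)] for s in range(n)])

print(np.max(np.abs(lhs - rhs)))             # should be of the order 1e-16
```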

Based on (4), we next derive a formula for E in terms of the Hermitean positive semi-definite matrix \(H^2\), namely,

$$\begin{aligned} E=\frac{1}{3} H^2 + \frac{1}{6} D_{H^2}, \end{aligned}$$
(5)

where \(D_{H^2}\) is the diagonal part of \(H^2\). Indeed, from (3) and (4) we have

$$\begin{aligned} n!\cdot E_{s,t}= & {} \sum _\sigma \sum _{k=1}^{\min (\sigma ^{-1}_s,\sigma ^{-1}_t)-1} H_{s,\sigma _k}H_{\sigma _k,t}\\= & {} \sum _{m=1}^n H_{s,m}H_{m,t}\cdot n_{m;s,t}, \end{aligned}$$

where \(n_{m;s,t}\) stands for the cardinality of the set of all permutations \(\sigma \) such that, for some \(k<\min (\sigma ^{-1}_s,\sigma ^{-1}_t)\), we have \(\sigma _k=m\). Equivalently, this is the cardinality of the set of all \(\sigma \) such that \(\sigma ^{-1}_m < \min (\sigma ^{-1}_s,\sigma ^{-1}_t)\). It is not hard to see that these cardinalities equal

$$\begin{aligned} n_{m;s,s}=\frac{1}{2} n!,\;\; m\ne s,\qquad n_{m;s,t}=\frac{1}{3} n!,\;\; m\ne s\ne t\ne m. \end{aligned}$$

Indeed, for the case \(m\ne t=s\), any \(\sigma \) in the associated set is obtained by first choosing two indices \(k,i\) with \(k<i\) from \(\{1,\ldots ,n\}\) and setting \(\sigma ^{-1}_m=k\), \(\sigma ^{-1}_s=i\) (this is possible in \(n(n-1)/2\) different ways) and then independently assigning the remaining \(n-2\) indices arbitrarily (this is possible in \((n-2)!\) different ways). A similar reasoning applies to the case \(m\ne s\ne t\ne m\), where one starts with a subset of 3 different indices \(k,i,j\) with \(k< i <j\), sets

$$\begin{aligned} \sigma ^{-1}_m=k,\quad \sigma ^{-1}_s=i,\quad \sigma ^{-1}_t=j, \end{aligned}$$

or alternatively

$$\begin{aligned} \sigma ^{-1}_m=k,\quad \sigma ^{-1}_t=i,\quad \sigma ^{-1}_s=j \end{aligned}$$

(altogether \(n(n-1)(n-2)/3\) different possibilities) and assigns the remaining \(n-3\) indices arbitrarily (\((n-3)!\) different possibilities). For index constellations where \(m=s\) or \(m=t\), one obviously has \(n_{m;s,t}=0\). With this, we arrive for \(s=t\) at

$$\begin{aligned} E_{s,s} =\frac{1}{2} \sum _{m\ne s} H_{s,m}H_{m,s}=\frac{1}{2} \sum _{m=1}^n H_{s,m}H_{m,s} =\frac{1}{2} (H^2)_{s,s}, \end{aligned}$$

since \(H_{m,m}=0\). Similarly, for \(s\ne t\) we have

$$\begin{aligned} E_{s,t} =\frac{1}{3} \sum _{m\ne s,m\ne t} H_{s,m}H_{m,t}=\frac{1}{3} (H^2)_{s,t}. \end{aligned}$$

This establishes the formula (5).
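
As a sanity check of (5), the following sketch computes E from its definition (3) by exact enumeration over all permutations of a small, randomly generated Hermitian test matrix (the size and test data are arbitrary) and compares the result with the right-hand side of (5).

```python
import numpy as np
from itertools import permutations
from math import factorial

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = (A + A.conj().T) / 2                     # Hermitian test matrix
H = B - np.diag(np.diag(B))                  # non-diagonal part of B

# E from its definition (3): average of P_sigma^* L_sigma L_sigma^* P_sigma
E = np.zeros((n, n), dtype=complex)
for sigma in permutations(range(n)):
    P = np.eye(n)[list(sigma), :]
    L_s = np.tril(P @ B @ P.conj().T, -1)
    E += P.conj().T @ L_s @ L_s.conj().T @ P
E /= factorial(n)

# E from formula (5)
H2 = H @ H
E_formula = H2 / 3 + np.diag(np.diag(H2)) / 6

print(np.max(np.abs(E - E_formula)))         # should be of the order 1e-16
```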

Note that, up to this point, the calculations hold for any Hermitean B. Since \(H^2\) and its diagonal part \(D_{H^2}\) are automatically positive semi-definite, we thus get

$$\begin{aligned} \Vert E\Vert \le \frac{1}{3} \Vert H^2\Vert +\frac{1}{6} \Vert D_{H^2}\Vert \le \frac{1}{2} \Vert H^2\Vert . \end{aligned}$$

Since \(\Vert D\Vert =\max _i |B_{i,i}|\le \Vert B\Vert \) and thus \(\Vert H\Vert \le \Vert B\Vert +\Vert D\Vert \le 2\Vert B\Vert \) for any Hermitean B, we also have

$$\begin{aligned} \Vert E\Vert \le \frac{1}{2} \Vert H\Vert ^2 \le 2\Vert B\Vert ^2 \end{aligned}$$

for all Hermitean B.

If the Hermitean B is positive semi-definite, then \(H=B-D\) has norm \(\Vert H\Vert \le \Vert B\Vert \) since

$$\begin{aligned} -\Vert B\Vert \Vert x\Vert ^2 \le -(Dx,x) \le (Hx,x) \le (Bx,x) \le \Vert B\Vert \Vert x\Vert ^2. \end{aligned}$$

Thus, in this case \(\Vert E\Vert \le \frac{1}{2} \Vert B\Vert ^2\). If, in addition, B has unit diagonal (i.e., \(D=I\)) then slightly more precise bounds are possible. Indeed, then

$$\begin{aligned} \Vert H^2\Vert =\lambda _{\max }(H^2)=\max ((\lambda _{\max }(B)-1)^2,(\lambda _{\min }(B)-1)^2)\le \max ((\Vert B\Vert -1)^2,1). \end{aligned}$$
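
These bounds are easily illustrated numerically. The sketch below generates a random positive semi-definite test matrix with unit diagonal (an arbitrary test case), builds E via (5), and checks the two estimates.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n))
B = A @ A.T                                   # real symmetric positive semi-definite
d = np.sqrt(np.diag(B))
B = B / np.outer(d, d)                        # rescale to unit diagonal, so D = I
H = B - np.eye(n)

H2 = H @ H
E = H2 / 3 + np.diag(np.diag(H2)) / 6         # E via formula (5)
normB = np.linalg.norm(B, 2)                  # spectral norm of B

print(np.linalg.norm(E, 2) <= 0.5 * normB**2 + 1e-12)            # expected: True
print(np.linalg.norm(H2, 2) <= max((normB - 1)**2, 1) + 1e-12)   # expected: True
```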

In summary, we have proved the following replacement of Theorem 2 from [1].

Theorem

Let B be an arbitrary Hermitean matrix, and \(H=B-D\) its non-diagonal part. Then the matrix E defined in (3) satisfies

$$\begin{aligned} \Vert E\Vert \le \frac{1}{2} \Vert H\Vert ^2 \le 2\Vert B\Vert ^2. \end{aligned}$$

If, in addition, B is positive semi-definite then

$$\begin{aligned} \Vert E\Vert \le \frac{1}{2} \Vert B\Vert ^2. \end{aligned}$$

Compared to the statement of Theorem 2 in [1], the constants in these estimates are reduced by a factor of two, which also leads to better constants in Theorem 4 a) in [1].

3 New constants in Theorem 3 from [1]

The proof of Theorem 3 in [1], i.e., the proof of (2) with a constant C independent of the size of B, is essentially based on the existence of so-called \((k,\epsilon )\)-pavings for Hermitean matrices with zero (or small) diagonal part such as \(H=B-D\). We use a consequence of a recent refinement [3] of the original proof of the Anderson paving conjecture. If one carefully follows the proof of Theorem 1.1 in [3, Section 5.2], specialized to the pair \([H,-H]\) (in particular, if one uses the more precise bound at the end of the proof of Theorem 5.6 there), then one sees that for any \(\epsilon \in (0,1)\) there exists a \((k,\epsilon )\)-paving of H if \(4k^{-1/2}+2k^{-1}\le \epsilon \). Equivalently, for any \(k\ge 20\) there exists a \((k,\epsilon _k)\)-paving for H with

$$\begin{aligned} \epsilon _k:=4k^{-1/2}+2k^{-1}<1. \end{aligned}$$

It was shown in the proof of Theorem 3 and in the remarks following it in [1] that this implies the estimate (2) with the constant

$$\begin{aligned} C=\min _{k\ge 20} \frac{k-1}{1-\epsilon _k} < 122.3 \end{aligned}$$

(the minimum is achieved for \(k=43\)). Since \(\Vert H\Vert \le 2\Vert B\Vert \) for general Hermitean B and \(\Vert H\Vert \le \Vert B\Vert \) for positive semi-definite B, this yields the respective statements about the constant C in (2) in Sect. 1.
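
The numerical values above are easy to reproduce. The following sketch evaluates \(\epsilon _k\) and the quotient \((k-1)/(1-\epsilon _k)\) over a range of k (the upper end of the range is an arbitrary choice; the quotient grows for large k).

```python
import numpy as np

k = np.arange(20, 201)                        # k >= 20 guarantees eps_k < 1
eps = 4 / np.sqrt(k) + 2 / k                  # eps_k = 4 k^{-1/2} + 2 k^{-1}
quotient = (k - 1) / (1 - eps)

i = np.argmin(quotient)
print(k[i], quotient[i])                      # k = 43, quotient approx. 122.27 < 122.3
print(2 * quotient[i])                        # approx. 244.5 < 245 (general Hermitean case)
```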