1 Introduction

In this paper, we consider the problem of testing goodness-of-fit for distributions defined on the surface of the sphere \(\mathcal{S}^{2}\) or, more generally, of the hypersphere \({{\mathcal {S}}}^{d-1}\), where \(d\ge 3\). There is a plethora of such tests for distributions defined on \({\mathbb {R}}^d\), even in the multivariate case \(d>1\). Likewise, goodness-of-fit testing on the circular domain \({{\mathcal {S}}}^{1}\) is a relatively well-explored area. For the latter case we refer, e.g., to García-Portugués and Verdebout (2018, Chaps. 6 and 7), Jammalamadaka et al. (2019), and Jammalamadaka et al. (2020).

On the other hand, the same problem for data taking values on \(\mathcal{S}^{d-1}\), where \(d\ge 3\), has been mostly confined to testing for uniformity. Nevertheless, and while the notion of a “non-preferred direction” and hence of testing for uniformity is certainly central to (hyper)spherical data analysis, there are several more flexible distributions which, in fact, often have the uniform as a special case. The reader is referred to the monographs (Ley and Verdebout 2017, Sect. 2.3; Mardia and Jupp 2000, Sect. 9.3) for such non-uniform models for (hyper)spherical data. At the same time, goodness-of-fit tests specifically tailored to (hyper)spherical laws are scarce, certainly in the case of a composite null hypothesis, where parameters need to be estimated from the data at hand, but also for a completely specified hypothesis with fixed (known) parameter values. For the latter case, the test based on nearest neighbors proposed in Ebner et al. (2018) seems to be one of the few tests available, while to the best of our knowledge, there is much need for research in the case of a composite hypothesis.

In view of the above, we suggest a procedure for testing the goodness-of-fit of distributions defined on \({{\mathcal {S}}}^{d-1}\), where \(d\ge 3\). The suggested test is innovative in that it is general-purpose, i.e., suitable for arbitrary (hyper)spherical distributions with either fixed or estimated parameters, and it is straightforwardly applicable, provided that we can easily draw Monte Carlo samples from the distribution under test.

To be specific, let \(\Vert \cdot \Vert \) be the Euclidean norm in \(\mathbb {R}^d\), \(d \ge 2\), and write \({{\mathcal {S}}}^{d-1}:= \{x \in \mathbb {R}^d: \Vert x\Vert =1\}\) for the surface of the unit sphere in \(\mathbb {R}^d\). Suppose X is a random (column) vector in \(\mathbb {R}^d\) taking on values in \(\mathcal{S}^{d-1}\) with a density f with respect to surface measure.

We start our exposition with the simple null hypothesis

$$\begin{aligned} H_0: f=f_0, \end{aligned}$$
(1)

where \(f_0\) is some given density on \({{\mathcal {S}}}^{d-1}\), which should be tested against the general alternative \(H_A\) that the distributions pertaining to f and \(f_0\) are different.

Most of the approaches for testing goodness-of-fit depend on some discrepancy measure between a given distributional quantity, such as the density, the distribution function or the characteristic function (CF), and a corresponding empirical counterpart, and thereby typically assume that the functional form of this distributional quantity is known under the null hypothesis. Here, we focus on using the CF rather than the density or distribution function in constructing our test. The functional form of the CF, however, is available only for distributions on the real line \({\mathbb {R}}^1\) and for a few selected cases of multivariate distributions, such as the multivariate normal and the multivariate stable distributions; see Ebner and Henze (2020) and Meintanis et al. (2015).

In order to circumvent this obstacle, which is even more challenging for distributions taking values on \({{\mathcal {S}}}^{d-1}\), we consider, in addition to the data at hand, a Monte Carlo sample from the distribution under test. We then formulate our test as a two-sample test that incorporates a pair of empirical quantities, one obtained from the data at hand and the other from the aforementioned Monte Carlo sample.

This idea also applies to the problem of testing the composite null hypothesis

$$\begin{aligned} H_{0,\vartheta }: f(\cdot ) =f_0(\cdot ,\vartheta ) \quad \text {for some} \ \vartheta \in \Theta , \end{aligned}$$
(2)

against general alternatives. Here, \(\{f_0(\cdot ,\vartheta ): \vartheta \in \Theta \}\) is a given family of densities on \({{\mathcal {S}}}^{d-1}\) that is parameterized in terms of \(\vartheta \in \Theta \), where \(\Theta \subset \mathbb {R}^s\) for some \(s \ge 1\). In this case, the Monte Carlo sample should be drawn from the specific member of the family under test that corresponds to an estimate of the parameter \(\vartheta \), obtained on the basis of the data at hand.

In this connection, we note that the idea of a goodness-of-fit method that makes use of an artificial sample from the distribution under test seems to date back to Friedman (2003), at least for independent data and simple hypotheses. Recently, Chen and Xia (2023) employed artificial samples for a test of multivariate normality in high dimensions using nearest neighbors, while Arboretti et al. (2021) applied a test procedure for mixed data by means of artificial samples.

The remainder of this work unfolds as follows: In Sect. 2, we formulate our test statistic, and in Sect. 3 we obtain its limit null distribution as well as the corresponding law under fixed alternatives to \(H_0\). In Sect. 4, the validity of a bootstrap resampling scheme necessary for actually carrying out the test for simple hypotheses with fixed parameters is established, while in Sect. 5 a corresponding bootstrap resampling scheme is suggested for the composite hypothesis test statistic. Section 6 contains an extensive Monte Carlo study of the finite-sample behavior of the new tests, including comparisons, while Sect. 7 illustrates real-data applications. The final Sect. 8 provides some discussion. All proofs of the theorems of Sects. 3 and 4 are provided in Appendix 1.

2 Test statistic

Let \(\varphi (t)=\mathbb {E}(\textrm{e}^{\mathrm{{i}}t^\top X})\), \(t \in \mathbb {R}^{d}\), denote the CF of X, where \(\top \) denotes transpose, and \(\mathrm{{i}}=\sqrt{-1}\) stands for the imaginary unit.

We start our exposition with the simple null hypothesis (1) and note that, if \(X_0\) has density \(f_0\) and CF \(\varphi _0(t) = \mathbb {E}(\textrm{e}^{\mathrm{{i}}t^\top X_0})\), \(t \in \mathbb {R}^{d}\), say, the standard CF-based statistic for testing \(H_0\) versus \(H_A\) is given by

$$\begin{aligned} D_{n,w}= \int |\varphi _n(t)-\varphi _0(t)|^2 w(t) \, \mathrm{{d}} t. \end{aligned}$$
(3)

Here,

$$\begin{aligned} \varphi _n(t)=\frac{1}{n}\sum _{j=1}^n\exp (\textrm{i} t^\top X_j), \ t \in \mathbb {R}^d, \end{aligned}$$
(4)

is the empirical CF of \(X_1,\ldots ,X_n\), and \(X_1,\ldots ,X_n\) are independent and identically distributed (i.i.d.) copies of X. In (3), the domain of integration as well as the nonnegative weight function \(w(\cdot )\) will be specified below in a way that \(D_{n,w}\) is amenable to computation and so that a test that rejects the null hypothesis \(H_0\) for large values of \(D_{n,w}\) is consistent against each alternative. Notice that \(D_{n,w}\) is an estimator of the population Fourier-type discrepancy measure

$$\begin{aligned} D_{w}(f,f_0)= \int |\varphi (t)-\varphi _0(t)|^2 w(t) \, \mathrm{{d}} t \end{aligned}$$
(5)

between f and \(f_0\).
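For concreteness, the empirical CF in (4) is straightforward to evaluate numerically. The following minimal sketch is ours (given in Python rather than the R used later in the paper), and the function name `ecf` is purely illustrative:

```python
import cmath

def ecf(ts, xs):
    """Empirical CF of Eq. (4): phi_n(t) = (1/n) sum_j exp(i t^T X_j).

    ts -- evaluation points t (tuples of length d)
    xs -- the sample X_1, ..., X_n (tuples of length d)
    Returns one complex value per evaluation point.
    """
    n = len(xs)
    vals = []
    for t in ts:
        s = sum(cmath.exp(1j * sum(ti * xi for ti, xi in zip(t, x))) for x in xs)
        vals.append(s / n)
    return vals
```

Note that \(\varphi _n(0)=1\) always, and \(|\varphi _n(t)|\le 1\) by the triangle inequality.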

As already mentioned in the Introduction, a test statistic of the form (3) is only feasible if the null CF \(\varphi _0(\cdot )\) is known and if the integration indicated in \(D_{n,w}\) can be performed in general dimension for some suitable weight function \(w(\cdot )\). These prerequisites, however, are rarely fulfilled, and in order to circumvent this obstacle we suggest replacing \(\varphi _0(t)\) in (3) by

$$\begin{aligned} \psi _{m}(t)=\frac{1}{m}\sum _{j=1}^m\exp (\textrm{i} t^\top Y_{j}), \quad t \in \mathbb {R}^d. \end{aligned}$$
(6)

Here, \(Y_{1}, \ldots , Y_{m}\) are i.i.d. random vectors, which are independent of \(X_1,\ldots ,X_n\) and have the same distribution as \(X_0\). Notice that \(\psi _m(t)\) is the empirical CF of \(Y_{1}, \ldots , Y_{m}\) and thus an estimator of \(\varphi _0(t)\). Of course, realizations of \(Y_{1}, \ldots , Y_{m}\) are generated via Monte Carlo. In the spirit of the by now time-honored weighted \(L^2\)-statistics (see, e.g., Ebner and Henze (2020) for an overview in the context of testing for multivariate normality), we therefore replace \(\varphi _0(t)\) in (3) by \(\psi _m(t)\) and thus arrive at the test statistic

$$\begin{aligned} T_{n,m,w} = \frac{mn}{m+n} \int |\varphi _n(t)-\psi _{m}(t)|^2 w(t)\, \mathrm{{d}} t. \end{aligned}$$
(7)

Notice that \(T_{n,m,w}\) is reminiscent of a CF-based test for a two-sample problem, one sample being the data \(X_1,\ldots ,X_n\) at hand, while the other consists of artificial data generated under the null hypothesis \(H_0\). For more details on weighted \(L^2\)-statistics for the two-sample problem which are based on the empirical CF, the reader is referred to Meintanis (2005) and Alba Fernández et al. (2008).

In the setting of the composite null hypothesis \(H_{0,\vartheta }\) figuring in (2), the test statistic in (7) should be modified accordingly in order to take into account the extra variability introduced by the presence of the unknown parameter \(\vartheta \) and its estimator. Specifically for the composite null hypothesis we suggest the test statistic

$$\begin{aligned} {\widehat{T}}_{n,m,w}= \frac{mn}{m+n} \int |\varphi _n(t)-\widehat{\psi }_{m}(t)|^2 w(t)\, \mathrm{{d}}t, \end{aligned}$$
(8)

with \(\varphi _n(t)\) defined in (4) and

$$\begin{aligned} {\widehat{\psi }}_{m}(t)=\frac{1}{m}\sum _{j=1}^m\exp (\textrm{i}t^\top \widehat{Y}_{j}),\quad t \in \mathbb {R}^d, \end{aligned}$$
(9)

where \(\widehat{Y}_{j}\), \(j \in \{1,\ldots ,m\}\), are i.i.d. copies of a random vector having density \(f_0(\cdot ;\widehat{\vartheta }_n)\). Here, \( \widehat{\vartheta }_n:=\widehat{\vartheta }_n(X_1,\ldots ,X_n)\) is some estimator of \(\vartheta \) computed from \(X_1,\ldots ,X_n\).

We close this section by noting that CF-based methods for goodness-of-fit using the notion of artificial samples have been recently proposed in Chen et al. (2022) and Karling et al. (2023), the former within the family of multivariate elliptical distributions and the latter for skewed families of distributions.

3 Asymptotics

In this section, we provide the limit distribution of \(T_{n,m,w}\) defined in (7). To be flexible with respect to both the region of integration and the weight function w, let M be some nonempty Borel set in \(\mathbb {R}^d\), and let \(\mu \) be some finite measure on (the Borel subsets of) M. Thus, M could be \(\mathbb {R}^d\) itself, with \(\mu \) absolutely continuous with respect to the Lebesgue measure in \(\mathbb {R}^d\), or M could be \({{\mathcal {S}}}^{d-1}\), with \(\mu \) absolutely continuous with respect to the spherical measure. Notably, M could also be some countable subset T of \(\mathbb {R}^d\), with \(\mu \) having a mass function with respect to the counting measure on T.

In this setting, let \(X,X_1,X_2, \ldots \) and \(Y,Y_1,Y_2, \ldots \) be independent M-valued random vectors that are defined on some common probability space \((\Omega ,{{\mathcal {A}}},\mathbb {P})\). Moreover, let \(X,X_1,X_2, \ldots \) be i.i.d. with density f with respect to \(\mu \) and CF \(\varphi (t) = \mathbb {E}[\exp (\textrm{i} t^\top X)]\), \(t \in \mathbb {R}^d\). Furthermore, let \(Y,Y_1,Y_2, \ldots \) be i.i.d. with density g with respect to \(\mu \) and CF \(\psi (t) = \mathbb {E}[\exp (\textrm{i} t^\top Y)]\), \(t \in \mathbb {R}^d\). Recall that

$$\begin{aligned} \varphi _n(t):= \frac{1}{n}\sum _{j=1}^n\exp (\textrm{i} t^\top X_j), \qquad \psi _m(t)=\frac{1}{m}\sum _{j=1}^m\exp (\textrm{i} t^\top Y_j), \quad t \in \mathbb {R}^d, \end{aligned}$$

are the empirical CF’s of \(X_1,\ldots ,X_n\) and \(Y_1,\ldots ,Y_m\), respectively. This section tackles the limit distribution of

$$\begin{aligned} T_{n,m}:= \frac{nm}{n+m} \int _M \big | \varphi _n(t) - \psi _m(t)\big |^2 \, \mu (\text {d}t) \end{aligned}$$

as \(m,n \rightarrow \infty \), under each of the conditions \(\varphi = \psi \) and \(\varphi \ne \psi \). Putting

$$\begin{aligned} \text {C}(x):= \int _M \cos \big (t^\top x\big ) \, \mu (\text {d}t), \quad \text {S}(x):= \int _M \sin \big (t^\top x\big ) \, \mu (\text {d}t), \quad x \in \mathbb {R}^d, \end{aligned}$$
(10)

and noting that \(\big | \varphi _n(t) - \psi _m(t)\big |^2 = \big (\varphi _n(t) - \psi _m(t)\big )\big (\overline{\varphi _n(t)} - \overline{\psi _m(t)}\big )\), where \({\overline{z}}\) stands for the complex conjugate of a complex number z, straightforward algebra yields

$$\begin{aligned} \frac{m+n}{mn} T_{n,m}&= \frac{1}{n^2}\sum _{j,k=1}^n \big ( \text {C}(X_k-X_j) + \textrm{i}\, \text {S}(X_k-X_j)\big )\\&\quad - \frac{1}{mn} \sum _{k=1}^m \sum _{j=1}^n \big ( \text {C}(X_j-Y_k) + \textrm{i}\, \text {S}(X_j-Y_k)\big )\\&\quad - \frac{1}{mn} \sum _{k=1}^m \sum _{j=1}^n \big ( \text {C}(Y_k-X_j) + \textrm{i}\, \text {S}(Y_k-X_j)\big )\\&\quad + \frac{1}{m^2} \sum _{j,k =1}^m \big (\text {C}(Y_k-Y_j) + \textrm{i}\, \text {S}(Y_k-Y_j)\big ). \end{aligned}$$

Since \(\text {S}(-x) = - \text {S}(x)\) and \(\text {S}(0_d) =0\), where \(0_d\) is the origin in \(\mathbb {R}^d\), the sum of the imaginary parts vanishes, and we obtain

$$\begin{aligned} \frac{m+n}{mn} T_{n,m} = \frac{1}{n^2} \sum _{j,k=1}^n \text {C}(X_j-X_k) - \frac{2}{mn} \sum _{j=1}^n \sum _{k=1}^m \text {C}(X_j-Y_k) + \frac{1}{m^2} \sum _{j,k=1}^m \text {C}(Y_j-Y_k). \end{aligned}$$
(11)

Since the computation of the test statistic \(T_{n,m}\) requires the evaluation of the integrals \(\text {C}(x)\) and \(\text {S}(x)\) occurring in (10), it seems to be indispensable to impose some restrictions on the set M and the measure \(\mu \) in order to render the test feasible. To this end, we assume that M – like \(\mathbb {R}^{d}\), \({{\mathcal {S}}}^{d-1}\) or the grid \(\mathbb {Z}^d\) – is symmetric with respect to the origin \(0_d\), i.e., we have \(-M =M\), where \(-M:= \{-x: x \in M\}\). Moreover, we suppose that \(\mu \) is invariant with respect to the reflection \(T(x):= -x\), \(x \in \mathbb {R}^d\), i.e., we have \(\mu = \mu ^T\), where \(\mu ^T\) is the image of \(\mu \) under T. The condition that M be symmetric with respect to the origin is directly related to the uniqueness of CFs, since this uniqueness only holds if two such functions coincide for each possible argument t. Thus we typically take \(M= {\mathbb {R}}^d\), although simplifications may occur: when testing for a distribution on the real line, we may take \(M=(0,\infty )\), and for a circular distribution, \(M=\mathbb {Z}^1\) suffices. Invariance of the measure \(\mu \) is another natural requirement, since the discrepancy measure \(\big | \varphi _n(t) - \psi _m(t)\big |^2\), which is integrated with respect to \(\mu \), also remains invariant under sign changes.

By transformation of integrals, we then obtain

$$\begin{aligned} \text {S}(x) = \int _M \sin (t^\top x) \, \mu (\text {d}t) = \int _M \sin (-t^\top x) \, \mu (\text {d}t) = - \text {S}(x), \quad x \in \mathbb {R}^d, \end{aligned}$$

which, together with \(\text {S}(-x) = - \text {S}(x)\), \(x \in \mathbb {R}^d\), implies

$$\begin{aligned} \text {S}(x) = 0, \qquad x \in \mathbb {R}^d. \end{aligned}$$
(12)
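The vanishing of \(\text {S}\) under the stated symmetry assumptions can be checked numerically. The toy design below is our own illustration: \(M = \{-2,-1,1,2\} \subset \mathbb {R}^1\) (so that \(-M=M\)) with \(\mu \) the counting measure on M, which is trivially reflection-invariant.

```python
import math

# Toy symmetric design in R^1: M = {-2, -1, 1, 2} (so -M = M), with mu the
# counting measure on M, which is invariant under the reflection t -> -t.
M = [-2.0, -1.0, 1.0, 2.0]

def C(x):
    # C(x) = integral of cos(t * x) with respect to mu, cf. (10)
    return sum(math.cos(t * x) for t in M)

def S(x):
    # S(x) = integral of sin(t * x) with respect to mu, cf. (10)
    return sum(math.sin(t * x) for t in M)
```

Here S(x) vanishes for every x, since the summands at t and -t cancel, while C is an even function.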

Putting

$$\begin{aligned} \text {CS}(\xi ):= \cos \xi + \sin \xi ,\quad \xi \in \mathbb {R}, \end{aligned}$$

we now use the addition theorems \(\cos (\alpha - \beta ) = \cos \alpha \cos \beta + \sin \alpha \sin \beta \) and \(\sin (\alpha + \beta ) = \sin \alpha \cos \beta + \cos \alpha \sin \beta \). In view of (12), we thus have

$$\begin{aligned} \int _M \text {CS}(t^\top X_j) \text {CS}(t^\top X_k) \, \mu (\text {d}t) = \int _M \cos (t^\top (X_j-X_k))\, \mu (\text {d}t) = \text {C}(X_k-X_j), \end{aligned}$$

(\(j,k \in \{1,\ldots ,n\}\)). In this way, some algebra yields

$$\begin{aligned} T_{n,m} = \frac{mn}{m+n} \int _M \left( \frac{1}{n}\sum _{j=1}^n \text {CS}(t^\top X_j) - \frac{1}{m} \sum _{k=1}^m \text {CS}(t^\top Y_k)\right) ^2 \mu (\text {d}t). \end{aligned}$$
(13)
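The identity behind (13), namely that \(\int _M \text {CS}(t^\top X_j) \text {CS}(t^\top X_k) \, \mu (\text {d}t)\) reduces to \(\text {C}(X_k-X_j)\), can likewise be verified numerically for a discrete reflection-invariant \(\mu \): expanding the product gives \(\cos (a-b) + \sin (a+b)\), and the sine part integrates to zero by (12). The toy design below is our own illustration.

```python
import math

# Toy symmetric design in R^1: M = {-2, -1, 1, 2}, mu = counting measure on M.
M = [-2.0, -1.0, 1.0, 2.0]

def C(x):
    # C(x) = integral of cos(t * x) over M, cf. (10)
    return sum(math.cos(t * x) for t in M)

def CS(v):
    # CS(xi) = cos(xi) + sin(xi)
    return math.cos(v) + math.sin(v)

def lhs(xj, xk):
    # integral of CS(t * xj) * CS(t * xk) over M
    return sum(CS(t * xj) * CS(t * xk) for t in M)
```

For any xj, xk, `lhs(xj, xk)` agrees with `C(xk - xj)`, since the \(\sin (t(x_j+x_k))\) contribution cancels over the symmetric set M.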

Now, writing \({{\mathcal {B}}}(M)\) for the \(\sigma \)-field of Borel sets on M, let \(\mathbb {H}:= \text {L}^2(M,{{\mathcal {B}}}(M),\mu )\) be the separable Hilbert space of (equivalence classes of) measurable functions \(u:M \rightarrow \mathbb {R}\) satisfying \(\int _M u^2 \, \text {d} \mu < \infty \), equipped with the inner product \(\langle u,v\rangle = \int _M u v \, \text {d} \mu \) and the norm \(\Vert u\Vert _\mathbb {H}= \langle u,u \rangle ^{1/2}\), \(u \in \mathbb {H}\).

Theorem 3.1

Suppose that \(\varphi = \psi \). If M is symmetric with respect to \(0_d\) and \(\mu \) is invariant with respect to reflections at \(0_d\), then there is a centered Gaussian random element W of \(\mathbb {H}\) having covariance kernel \(K(s,t) = \mathbb {E}\big [ W(s) W(t)\big ]\), where

$$\begin{aligned} K(s,t) = {\mathrm{{Cov}}}\big (\mathrm{{CS}}(s^\top X),\mathrm{{CS}}(t^\top X)\big ), \quad s,t \in M, \end{aligned}$$
(14)

such that \(T_{n,m} {\mathop {\longrightarrow }\limits ^{\mathcal{D}}}\Vert W\Vert ^2_\mathbb {H}\) as \(n,m \rightarrow \infty \).

The next result gives the almost sure limit of \((m+n)T_{n,m}/(mn)\) as \(m,n \rightarrow \infty \).

Theorem 3.2

Let

$$\begin{aligned} \varrho (t):= \mathbb {E}[\mathrm{{CS}}(t^\top X)], \quad b(t):= \mathbb {E}[\mathrm{{CS}}(t^\top Y)], \quad t \in M. \end{aligned}$$
(15)

Under the conditions on M and \(\mu \) stated in Theorem 3.1, we have

$$\begin{aligned} \lim _{n,m \rightarrow \infty } \frac{m+n}{mn} T_{n,m} = \int _M \big (\varrho (t)-b(t)\big )^2 \, \mu (\mathrm{{d}}t) \quad \mathbb {P}\text {-almost surely}. \end{aligned}$$

Let

$$\begin{aligned} \Delta := \int _M |\varphi (t) - \psi (t)|^2\, \mu (\text {d}t). \end{aligned}$$

It is readily seen that \(\Delta \) equals the almost sure limit figuring in Theorem 3.2. Thus, \(\Delta \) is a measure of deviation between the distributions of X and Y, expressed in the form of a weighted \(L^2\)-distance of the corresponding CFs, and this measure of deviation is estimated by \(T_{n,m}\). As the next result shows, the statistic \(T_{n,m}\), when suitably normalized, has a normal limit distribution as \(n,m \rightarrow \infty \) if \(\Delta >0\). To prove this result, we need a technical condition, sometimes called the usual limiting regime in two-sample problems (see, e.g., Henze and Penrose 1999), namely

$$\begin{aligned} \lim _{m,n \rightarrow \infty } \frac{m}{m+n} = \tau \end{aligned}$$
(16)

for some \(\tau \in [0,1]\).

In contrast to Theorems 3.1 and 3.2, this condition is now needed to assess the asymptotic proportions of the X-sample and the Y-sample. In what follows, put

$$\begin{aligned} K_1(s,t) = \text {Cov}\big (\text {CS}(s^\top X),\text {CS}(t^\top X)\big ), \qquad K_2(s,t) = \text {Cov}\big (\text {CS}(s^\top Y),\text {CS}(t^\top Y)\big ), \end{aligned}$$
(17)

and let

$$\begin{aligned} K^*(s,t) = \tau K_1(s,t) +(1-\tau ) K_2(s,t), \quad s,t \in M. \end{aligned}$$
(18)

Furthermore, let

$$\begin{aligned} z(t) = \varrho (t) - b(t), \end{aligned}$$
(19)

where \(\varrho (t)\) and b(t) are given in (15).

Theorem 3.3

Suppose the standing assumptions on M and \(\mu \) hold. If \(\Delta >0\), then

$$\begin{aligned} \sqrt{\frac{mn}{m+n}} \left( \frac{m+n}{mn}\, T_{n,m} - \Delta \right) {\mathop {\longrightarrow }\limits ^{\mathcal{D}}}\mathrm{{N}}(0,\sigma ^2) \end{aligned}$$

under the limiting regime (16), where

$$\begin{aligned} \sigma ^2 = 4 \int _M \int _M K^*(s,t) z(s)z(t) \, \mu (\textrm{d}s) \mu (\textrm{d}t). \end{aligned}$$

Remark 3.4

Compared to Chen et al. (2022), who address the problem of composite hypotheses, the limit results of this section are obtained for simple hypotheses without estimated parameters. However, the results obtained herein hold for an artificial sample size \(m \ne n\), which is more general and thus more flexible than the case \(m=n\) treated in Chen et al. (2022). Moreover, our setting is different from that of elliptical distributions on the classical Euclidean space \(\mathbb {R}^d\). In the following section, we suggest a resampling version of the test, and we prove its asymptotic validity.

4 Resampling under a simple hypothesis

Since, under \(H_0\), both the finite-sample and the limit distribution of \(T_{n,m}\) as \(n,m \rightarrow \infty \) depend on the unknown underlying distribution of \(X_1\), we use a bootstrap procedure in order to carry out a test that rejects \(H_0\) for large values of \(T_{n,m}\). The bootstrap distribution of \(T_{n,m}\) is the conditional distribution of \(T_{n,m}\) given the pooled sample \(X_1,\ldots ,X_n,Y_1,\ldots ,Y_m\), and a test of \(H_0\) at nominal level \(\alpha \) rejects \(H_0\) if \(T_{n,m}\) exceeds the \((1-\alpha )\)-quantile of this bootstrap distribution. Since the bootstrap distribution is difficult to compute, it is estimated by a Monte Carlo procedure that repeatedly samples from the empirical distribution of the pooled sample. To be specific, one first computes the observed value \(t_{n,m}\) of \(T_{n,m}\) based on realizations \(x_1,\ldots ,x_n,y_1,\ldots ,y_m\) of \(X_1,\ldots ,X_n,Y_1,\ldots ,Y_m\), respectively. In a second step, one generates b independent samples by Monte Carlo simulation. Here, for each \(j \in \{1,\ldots ,b\}\), the \(j^{\textrm{th}}\) sample consists of \(x_1(j),\ldots ,x_n(j),y_1(j),\ldots ,y_m(j)\), where these values have been chosen independently of each other with a uniform distribution over \(\{x_1,\ldots ,x_n,y_1,\ldots ,y_m\}\) (which means, in particular, that \(x_1(j)\) can also be taken from any of \(y_1,\ldots ,y_m\)). For each \(j \in \{1,\ldots ,b\}\), one then computes the value \(t_{n,m}(j) = T_{n,m}(x_1(j),\ldots ,x_n(j),y_1(j),\ldots ,y_m(j))\) of the test statistic \(T_{n,m}\). Letting \(c_{n,m;1-\alpha }\) denote the \((1-\alpha )\)-quantile of the b values \(t_{n,m}(1), \ldots , t_{n,m}(b)\), the hypothesis \(H_0\) is rejected if \(t_{n,m} > c_{n,m;1-\alpha }\).
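The pooled-sample bootstrap just described can be sketched in a few lines (a Python illustration of ours; `stat` is a placeholder for the chosen two-sample statistic, and returning a p-value of the form \((1+\text {count})/(1+b)\) is one common convention, equivalent to comparing \(t_{n,m}\) with the empirical quantile \(c_{n,m;1-\alpha }\)):

```python
import random

def bootstrap_pvalue(t_obs, xs, ys, stat, b=200, rng=None):
    """Pooled-sample bootstrap: resample n + m points with replacement from
    the pooled data, recompute the statistic, and compare with t_obs.
    stat(xs, ys) -> float is the two-sample statistic (e.g., T_{n,m})."""
    rng = rng or random.Random(0)
    pool = list(xs) + list(ys)
    n, m = len(xs), len(ys)
    count = 0
    for _ in range(b):
        # each resampled point may come from either original sample
        star = [rng.choice(pool) for _ in range(n + m)]
        if stat(star[:n], star[n:]) >= t_obs:
            count += 1
    # reject H0 at nominal level alpha when the p-value is <= alpha
    return (1 + count) / (1 + b)
```

A permutation version would instead shuffle the pool and split it without replacement.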

To prove that this bootstrap procedure yields a test of \(H_0\) of asymptotic level \(\alpha \), we use a Hilbert space central limit theorem for triangular arrays (see Kundu et al. 2000). This theorem reads as follows.

Theorem 4.1

Let \(\{e_j: j \ge 1\}\) be a complete orthonormal basis of the separable Hilbert space \(\mathbb {H}\) with inner product \(\langle \cdot , \cdot \rangle \) and norm \(\Vert \cdot \Vert _{\mathbb {H}}\). For each \(m \ge 1\), let \(X_{m1},X_{m2}, \ldots , X_{mm}\) be independent \(\mathbb {H}\)-valued random elements such that \(\mathbb {E}(\langle X_{mj}, e_\ell \rangle ) =0\) and \(\mathbb {E}\Vert X_{mj}\Vert _{\mathbb {H}}^2 < \infty \) for each \(j \in \{1,\ldots ,m\}\) and each \(\ell \ge 1\). Put \(S_m = \sum _{j=1}^m X_{mj}\), and let \(C_m\) be the covariance operator of \(S_m\). Assume that the following conditions hold:

  1. (i)

    \(\lim _{m\rightarrow \infty } \langle C_me_k,e_\ell \rangle = a_{k\ell }\) (say) exists for each \(k \ge 1\) and \(\ell \ge 1\),

  2. (ii)

    \(\lim _{m \rightarrow \infty } \sum _{k=1}^\infty \langle C_me_k,e_k\rangle = \sum _{k=1}^\infty a_{kk} < \infty \),

  3. (iii)

    \(\lim _{m\rightarrow \infty } L_m(\varepsilon ,e_k) = 0\) for each \(\varepsilon >0\) and each \(k \ge 1\), where

    $$\begin{aligned} L_m(\varepsilon ,h):= \sum _{j=1}^m \mathbb {E}\big (\langle X_{mj},h \rangle ^2 \textbf{1}\big \{ |\langle X_{mj}, h \rangle | > \varepsilon \big \} \big ), \quad h \in \mathbb {H}. \end{aligned}$$

Then \(S_m {\mathop {\longrightarrow }\limits ^{\mathcal{D}}}G\) as \(m \rightarrow \infty \), where G is a centered random element of \(\mathbb {H}\) with covariance operator C characterized by \(\langle C h,e_\ell \rangle = \sum _{j=1}^\infty \langle h,e_j\rangle a_{j\ell }\) for each \(h \in \mathbb {H}\) and each \(\ell \ge 1\).

To apply Theorem 4.1 in our situation of the Hilbert space \(\mathbb {H}:= \text {L}^2(M,{{\mathcal {B}}}(M),\mu )\) let, in greater generality than considered so far, \(X_1^{m,n}, \ldots , X_n^{m,n},Y_1^{m,n},\ldots , Y_m^{m,n}\) be i.i.d. M-valued random vectors with common distribution, and put

$$\begin{aligned} \varrho _{m,n}(t) = \mathbb {E}\big [ \text {CS}\big (t^\top X_1^{m,n}\big )\big ], \quad t \in M. \end{aligned}$$

Moreover, let

$$\begin{aligned} U_{m,n}(t):= & {} \frac{1}{\sqrt{n}} \sum _{j=1}^n \big \{ \text {CS}\big (t^\top X_j^{m,n}\big ) - \varrho _{m,n}(t)\big \}, \\ V_{m,n}(t):= & {} \frac{1}{\sqrt{m}} \sum _{k=1}^m \big \{ \text {CS}\big (t^\top Y_k^{m,n}\big ) - \varrho _{m,n}(t)\big \}, \quad t \in M \end{aligned}$$

(cf. (27)), and write \(U_{m,n} = U_{m,n}(\cdot )\), \(V_{m,n} = V_{m,n}(\cdot )\) as well as

$$\begin{aligned} {\widetilde{W}}_{n,m}:= a_{m,n}U_{m,n} - b_{m,n} V_{m,n}, \end{aligned}$$
(20)

where \(a_{m,n}\) and \(b_{m,n}\) are defined in (26).

Theorem 4.2

In the setting given above suppose that, under the limiting regime (16), we have \(X_1^{m,n} {\mathop {\longrightarrow }\limits ^{\mathcal{D}}}X_\infty \) for some M-valued random vector \(X_\infty \) with distribution \(H_\infty \). Put \(\varrho _\infty (t):= \mathbb {E}[\mathrm{{CS}}(t^\top X_\infty )]\), \(t \in M\), and \(W_\infty (t):= \mathrm{{CS}}(t^\top X_\infty ) - \varrho _\infty (t)\), \(t \in M\). Moreover, let \(W_\infty := W_\infty (\cdot )\) be the random element of \(\mathbb {H}= \text {L}^2(M,{{\mathcal {B}}}(M),\mu )\) with covariance operator \(C_\infty \) that is associated with the covariance function

$$\begin{aligned} c_\infty (s,t) = \mathbb {E}\big [\mathrm{{CS}}(s^\top X_\infty ) \mathrm{{CS}}(t^\top X_\infty ) \big ] - \varrho _\infty (s) \varrho _\infty (t), \quad s,t \in M, \end{aligned}$$

via

$$\begin{aligned} \langle C_\infty g,h\rangle = \int _M \int _M c_\infty (s,t) \, g(s)h(t)\, \mu (\text {d}s) \mu (\text {d}t). \end{aligned}$$

We then have \({\widetilde{W}}_{n,m} {\mathop {\longrightarrow }\limits ^{\mathcal{D}}}W_\infty \).

Notice that the test statistic \(T_{n,m}\) occurring in (13), computed on the random variables \(X_1^{m,n}, \ldots , X_n^{m,n},Y_1^{m,n},\ldots ,Y_m^{m,n}\), equals \(\Vert {\widetilde{W}}_{n,m}\Vert _{\mathbb {H}}^2 \), where \({\widetilde{W}}_{n,m}\) is given in (20). From Theorem 4.2, we thus have the following corollary.

Corollary 4.3

The limit distribution of the test statistic \(T_{n,m}\) under the limiting regime (16) is that of \(\Vert W_\infty \Vert _{\mathbb {H}}^2\).

Let \(F_n\) and \(G_m\) denote the empirical distributions of \(X_1,\ldots ,X_n\) and \(Y_1,\ldots ,Y_m\), respectively, and write \( H_{n,m}:= \frac{n}{m+n} F_n + \frac{m}{m+n} G_m \) for the empirical distribution of the pooled sample \(X_1,\ldots ,X_n,Y_1,\ldots ,Y_m\). By the Glivenko–Cantelli theorem, \(H_{n,m}\) converges weakly with probability one under the limiting regime (16) to \((1-\tau ) F + \tau G\), which equals \(H_\infty := F\) since \(G = F\) under \(H_0\). Thus the bootstrap distribution of \(T_{n,m}\) converges almost surely to the distribution of \(\Vert W_\infty \Vert _{\mathbb {H}}^2\). The latter distribution coincides with the distribution of \(\Vert W\Vert _{\mathbb {H}}^2\), where W is given in Theorem 3.1. This shows the asymptotic validity of the bootstrap.

Remark 4.4

The resampling bootstrap procedure applied herein may also be replaced by a permutation procedure. The validity of the exhaustive permutation test (that includes all possible permutations) may be obtained directly by observing that, under the null hypothesis \(H_0\), the observations \((x_1,\ldots ,x_n,y_1,\ldots ,y_m)\) are exchangeable. Another potential resampling scheme is the weighted bootstrap; see Alba-Fernández et al. (2017).

5 Resampling under a composite hypothesis

Analogously to the case of \(T_{n,m,w}\), the limit null distribution of \({\widehat{T}}_{n,m,w}={\widehat{T}}_{n,m,w}(x _1,\ldots ,x _n,\widehat{y }_{1},\ldots ,\widehat{y}_{m})\) depends (in a very complicated way) on unknown quantities, and hence it cannot be used to compute critical values and actually carry out the test. To this end, we consider a parametric bootstrap procedure involving the test statistic in (8), computed on the basis of bootstrap observations from \(f_0(\cdot ;\vartheta )\), where the parameter \(\vartheta \) is replaced by an estimator.

More precisely, let \({\widehat{T}}_{n,m,w, \mathrm{{obs}}}\) denote the observed value of the test statistic. For given \(\alpha \in (0,1)\), write \(t^*_{n,m,\alpha }\) for the upper \(\alpha \)-percentile of the bootstrap distribution of \({\widehat{T}}_{n,m,w}\). We then define the test function as

$$\begin{aligned} \Xi _{n,m}^*=\left\{ \begin{array}{ll} 1, &{} \text {if}\quad {\widehat{T}}_{n,m,w,\mathrm{{obs}}} \ge t^*_{n,m,\alpha }, \\ 0, &{} \text {otherwise}. \end{array} \right. \end{aligned}$$
(21)

In practice, the bootstrap distribution of \({\widehat{T}}_{n,m,w}\) is approximated as follows:

  1.

    Generate a bootstrap sample \(x _1^*, \ldots , x _n^*\) from \(f_0(\cdot ;\widehat{\vartheta }_n)\).

  2.

    Calculate the estimator \(\widehat{\vartheta }^*_n=\widehat{\vartheta }_n(x _1^*, \ldots , x _n^*)\).

  3.

    Generate a bootstrap sample \(y _{1}^*, \ldots , y _{m}^*\) from \(f_0(\cdot ;\widehat{\vartheta }^*_n)\).

  4.

    Compute \(T_{n,m,w}^*=T_{n,m,w}(x _1^*, \ldots , x _n^*, y _{1}^*, \ldots , y _{m}^*)\).

  5.

    Repeat steps 1–4 a number of times, say b, and thus obtain \(T^*_{n,m,w,1}, \ldots , T^*_{n,m,w,b}\).

Then we approximate the upper \(\alpha \)-percentile \(t^*_{n,m,\alpha }\) in (21) of the null distribution of \({\widehat{T}}_{n,m,w}\) by the upper \(\alpha \)-percentile of the empirical distribution of \(T^*_{n,m,w,1}, \ldots , T^*_{n,m,w,b}\).
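The procedure above admits a compact sketch (a Python illustration of ours; `estimate`, `sample_from` and `stat` are hypothetical placeholders for the estimator \(\widehat{\vartheta }_n\), a sampler from \(f_0(\cdot ;\vartheta )\), and the test statistic, respectively):

```python
import random

def parametric_bootstrap_quantile(x_obs, estimate, sample_from, stat,
                                  m, alpha=0.05, b=199, rng=None):
    """Parametric bootstrap critical value (placeholder helper names):
    estimate(xs) -> theta_hat, sample_from(theta, size, rng) -> list of draws,
    stat(xs, ys) -> float. Returns the upper-alpha bootstrap percentile."""
    rng = rng or random.Random(1)
    n = len(x_obs)
    theta_hat = estimate(x_obs)                    # fitted on the data at hand
    vals = []
    for _ in range(b):
        xs_star = sample_from(theta_hat, n, rng)   # X*-sample from f0(.; theta_hat)
        theta_star = estimate(xs_star)             # re-estimate on the X*-sample
        ys_star = sample_from(theta_star, m, rng)  # Y*-sample from f0(.; theta*)
        vals.append(stat(xs_star, ys_star))        # bootstrap statistic T*
    vals.sort()
    return vals[min(b - 1, int((1 - alpha) * b))]  # empirical upper-alpha point
```

The observed statistic is then compared with the returned percentile, as in (21).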

Although we provide no asymptotic theory for the resampling under a composite hypothesis, our simulations show that the above method works well. Nevertheless, it remains an open problem to formally prove that this bootstrap is asymptotically valid.

6 Simulations

In this section we provide results of comparative Monte Carlo simulations, for both a simple and a composite hypothesis. Throughout, we restrict the simulations to the spherical setting with dimension \(d=3\) and, for computational feasibility, to the sample size \(n=50\). All simulations are performed using the statistical programming language R; see R Core Team (2022). We implement the test statistic by choosing the measure \(\mu (\text {d}t)\) to have the density of a zero-mean spherical stable distribution. The test may then be computed as in Eq. (11) with \(\text {C}(x)=\textrm{e}^{-\gamma \Vert x\Vert ^\xi }\). Specifically, we have

$$\begin{aligned} T^{\{\xi \}}_{n,m,\gamma }= \frac{1}{m+n} \bigg (\frac{m}{n} \sum _{j,k=1}^n \mathrm{{e}}^{-\gamma \Vert X_j-X_k\Vert ^\xi } - 2 \sum _{j=1}^n \sum _{k=1}^m \mathrm{{e}}^{-\gamma \Vert X_j-Y_k\Vert ^\xi } + \frac{n}{m} \sum _{j,k=1}^m \mathrm{{e}}^{-\gamma \Vert Y_j-Y_k\Vert ^\xi }\bigg ), \end{aligned}$$
(22)

where the notation \(T^{\{\xi \}}_{n,m,\gamma }\) is used to emphasize the flexibility of the test with respect to the tuning parameters \((\xi ,\gamma ) \in (0,2] \times (0,\infty )\), which are at the disposal of the practitioner and may be employed in order to obtain better power against different alternatives.

Another option, although it does not correspond to a proper (finite) measure \(\mu \), is to apply the so-called energy statistic suggested by Székely and Rizzo (2013), which again results from (11), now with \(\text {C}(x)=-\Vert x\Vert ^\xi \), \(\xi \in (0,2)\). The explicit formula for this statistic is obtained from (22) by replacing \(\mathrm{{e}}^{-\gamma \Vert \cdot \Vert ^\xi }\) with \(-\Vert \cdot \Vert ^\xi \). This test statistic will be denoted by \(SR^{\{\xi \}}_{n,m}\).
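Both statistics admit the same double-sum form, and a direct implementation is short (a Python sketch of ours with an illustrative function name; the paper's own implementation uses R):

```python
import math

def T_xi(xs, ys, gamma=1.0, xi=1.0, energy=False):
    """Eq. (22) with C(x) = exp(-gamma * ||x||^xi); setting energy=True swaps
    in the Szekely-Rizzo kernel C(x) = -||x||^xi instead."""
    def C(a, b):
        r = math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
        return -r ** xi if energy else math.exp(-gamma * r ** xi)
    n, m = len(xs), len(ys)
    sxx = sum(C(a, b) for a in xs for b in xs)   # within-X double sum
    sxy = sum(C(a, b) for a in xs for b in ys)   # cross term
    syy = sum(C(a, b) for a in ys for b in ys)   # within-Y double sum
    return ((m / n) * sxx - 2.0 * sxy + (n / m) * syy) / (m + n)
```

When the two samples coincide, the statistic vanishes, and for the stable kernel it is nonnegative, consistent with its role as a weighted \(L^2\)-distance.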

The spherical distributions were generated using the package Directional, see Tsagris et al. (2021), and the uniformity tests were computed with the package sphunif, see García-Portugués and Verdebout (2020).

6.1 Testing the simple hypothesis of uniformity

We test the hypothesis

$$\begin{aligned} H_0: f(\cdot )\equiv 1/|\mathcal {S}^{d-1}|, \end{aligned}$$

where \(|\mathcal {S}^{d-1}|=2 \pi ^{d/2}/\Gamma (d/2)\) is the surface area of the \((d-1)\)-sphere. Hence we test whether f is the density of the uniform law \(\mathcal {U}(\mathcal {S}^{d-1})\), which is a classical testing problem in directional statistics. For an overview of existing procedures we refer to García-Portugués and Verdebout (2018). As competing tests we consider the following procedures:

  • The modified Rayleigh test \(R_n\), see Mardia and Jupp (2000, Sect. 10.4.1) based on the mean of the directions,

  • The Ebner et al. (2018) test \(NN^J_a\) based on volumes of the J nearest neighbor balls with power a,

  • The Bingham test \(B_n\), see Bingham (1974), based on the empirical scatter matrix of the sample,

  • The Sobolev test \(G_n\), see Giné (1975), based on Sobolev norms,

  • The test of Cuesta-Albertos et al. (2009) \(CA_n\), which is based on random projections that characterize the uniform law,

  • The test \(XM_n\) in Xu and Matsuda (2020), based on a U-statistic representation of a directional kernel Stein discrepancy with fixed tuning parameter \(\kappa =1\).

Empirical critical values for each testing procedure have been obtained by a Monte Carlo simulation study under \(H_0\) with 100,000 replications each.
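The Monte Carlo calibration of critical values can be sketched as follows. This is illustrative Python; `stat_under_h0` is a hypothetical stand-in that simulates one sample under \(H_0\) and returns the value of the test statistic.

```python
import numpy as np

def empirical_critical_value(stat_under_h0, n_rep, alpha, seed=0):
    """Empirical (1 - alpha)-quantile of the statistic simulated under H0."""
    rng = np.random.default_rng(seed)
    sims = np.array([stat_under_h0(rng) for _ in range(n_rep)])
    return np.quantile(sims, 1.0 - alpha)
```

A test rejects at level \(\alpha\) when its statistic exceeds this simulated quantile.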

We considered the following alternatives to the uniform distribution on \(\mathcal {S}^{d-1}\). These alternatives are chosen to simulate different uni-, bi- and trimodal models. For details on the hyperspherical von Mises–Fisher distribution, see Sect. 9.3 in Mardia and Jupp (2000).

  • The density of the von Mises–Fisher distribution with mean direction \(\theta \in \mathcal {S}^{d-1}\) and concentration parameter \(\kappa \ge 0\) is given by

    $$\begin{aligned} f(x)=\frac{\left( \kappa /2 \right) ^{d/2-1}}{\Gamma (d/2) I_{d/2-1}(\kappa )} \exp (\kappa x^\top \theta ),\quad x\in \mathcal {S}^{d-1}. \end{aligned}$$

    Here, \(I_{d/2-1}\) is the modified Bessel function of the first kind and order \(d/2-1\). This class is denoted by \(\text {vMF}(\theta , \kappa )\).

  • We simulate a mixture of two von Mises–Fisher distributions with different centers by the following procedure. Simulate \(U\sim \mathcal {U}(0,1)\) and, independently, \(Y_i \sim \text {vMF}(\theta _i, \kappa _i)\) with corresponding location and concentration parameters for \(i=1,2\), and choose \(p \in (0,1)\). Then we generate a member X of the random sample according to

    $$\begin{aligned} X = Y_1 {\textbf{1}}{\{ U < p \} } + Y_2 {\textbf{1}}{\{ U \ge p \} }. \end{aligned}$$

    We denote this alternative class by \(\text {MMF}((p,1-p), (\theta _1, \theta _2), (\kappa _1, \kappa _2))\).

  • In a manner similar to the mixture of two vMF distributions, we simulate a mixture of three von Mises–Fisher distributions with different centers by additionally simulating, independently, a third random vector \(Y_3\sim \text {vMF}(\theta _3, \kappa _3)\) and generating the member X by

    $$\begin{aligned} X = Y_1 {\textbf{1}}{\{ U< p \} } + Y_2 {\textbf{1}}{\{ p \le U < 2p \} } + Y_3 {\textbf{1}}{\{ U \ge 2p \} },\quad {p\in (0,1/2)}. \end{aligned}$$

    We denote this class by \(\text {MMF}((p,p,1-2p), (\theta _1, \theta _2, \theta _3), (\kappa _1, \kappa _2, \kappa _3))\).
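The sampling schemes above can be sketched as follows for \(d=3\), where the cosine of the angle between a vMF draw and its mean direction admits a closed-form inverse CDF. This is an illustrative NumPy-only sketch (the study itself uses the R package Directional), and the function names are ours.

```python
import numpy as np

def rvmf_s2(n, theta, kappa, rng):
    """Draw n points from vMF(theta, kappa) on S^2 (d = 3, moderate kappa > 0)."""
    theta = np.asarray(theta, float) / np.linalg.norm(theta)
    u = rng.random(n)
    # closed-form inverse CDF of W = cos(angle to the mean) when d = 3
    w = 1.0 + np.log(u + (1.0 - u) * np.exp(-2.0 * kappa)) / kappa
    phi = rng.uniform(0.0, 2.0 * np.pi, n)
    r = np.sqrt(np.clip(1.0 - w ** 2, 0.0, None))
    z = np.column_stack([r * np.cos(phi), r * np.sin(phi), w])  # pole at e3
    v = np.array([0.0, 0.0, 1.0]) + theta                       # map e3 -> theta
    if v @ v < 1e-12:                                           # theta = -e3: flip
        return -z
    H = 2.0 * np.outer(v, v) / (v @ v) - np.eye(3)              # orthogonal, H e3 = theta
    return z @ H.T

def rmix_vmf(n, probs, thetas, kappas, rng):
    """Mixture sample via the indicator construction in the text."""
    comp = rng.choice(len(probs), size=n, p=probs)
    X = np.empty((n, 3))
    for i in range(len(probs)):
        idx = comp == i
        X[idx] = rvmf_s2(int(idx.sum()), thetas[i], kappas[i], rng)
    return X
```

The matrix \(H\) is a sign-flipped Householder reflection, so it is orthogonal and the generated points stay on the sphere.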

In each of the alternatives, we put \({\textbf{1}}=(1,\ldots ,1)/\sqrt{d}\), \(\mu _1=(1,0,\ldots ,0)\), and \(\mu _2=(-1,1,\ldots ,1)/\sqrt{d}\). The results of the simulation study for the new test applied with \(\xi =2\), as well as for the stated competitors, are displayed in Tables 1 and 2. In this section, for the sake of simplicity, we suppress the sample size n and the Monte Carlo size m in the notation of the new test and write \(T_{\gamma }^{\{\xi \}}\). As can be seen, the suggested tests perform well in comparison, although they are never the best performing procedures. This behavior might be explained by the approximation of the true CF under the null hypothesis.

To investigate the impact of the size m of the simulated data set \(Y_1,\ldots ,Y_m\), we simulated the empirical power of the test for four vMF distributions and for different values of m, see Fig. 1. Clearly, the choice of m has an impact on the estimation; larger values of m are desirable, but increasing m leads to longer computation times.

Table 3 exhibits the impact of the weighting measure \(\mu \) and hence of the choice of the function \(C(\cdot )\). In this regard, we compare our test with the energy test of Székely and Rizzo (2013), denoted by \(SR^{\{\xi \}}\), for different combinations of the tuning parameters \((\gamma ,\xi )\). In terms of power for the uni- and bimodal alternatives considered, the choice of \(C(\cdot )\) has nearly no influence on the empirical power, with the exception of the MMF\(((0.5,0.5),(-\mu _1,\mu _1),(2,2))\) alternative, where \(T_{\gamma }^{\{1\}}\), \(T_{\gamma }^{\{1.5\}}\) and \(T_{\gamma }^{\{2\}}\) outperform the \(SR^{\{\xi \}}\)-procedures for some values of the tuning parameter \(\gamma \).

Table 1 Empirical rejection rates for testing uniformity for the test \(T^{\{2\}}_{\gamma }\) and competitors (\(n=50\), \(m=500\), \(\alpha =0.05\), 10,000 replications)
Table 2 Empirical rejection rates for testing uniformity for the test \(T^{\{2\}}_{\gamma }\) and competitors (\(n=50\), \(m=500\), \(\alpha =0.05\), 10,000 replications)
Fig. 1 Simulated empirical rejection rates for \(T^{\{2\}}_{1}\) and different values of \(10\le m\le 500\) and concentration parameters \(\kappa \in \{0.25,0.5,0.75,1\}\) (\(n=50\), \(\alpha =0.05\), 10,000 replications)

6.2 Testing the fit to the von Mises–Fisher distribution

For the case of a composite hypothesis we consider the hypothesis that the underlying density belongs to the family of von Mises–Fisher distributions \(\text {vMF}(\theta ,\kappa )\), i.e., we test the hypothesis

$$\begin{aligned} H_0: f(\cdot )=\frac{\left( \kappa /2 \right) ^{d/2-1}}{\Gamma (d/2) I_{d/2-1}(\kappa )} \exp (\kappa \cdot ^\top \theta ),\quad \text{ for } \text{ some } \kappa \ge 0 \text{ and } \theta \in \mathcal {S}^{d-1}, \end{aligned}$$
(23)

against general alternatives. The main difference from Sect. 6.1 is that we now test the fit to a family of distributions, whose parameters are unknown and hence have to be estimated. To test the hypothesis we chose \(T_{\gamma }^{\{2\}}\) in Eq. (22), for different values of the tuning parameter \(\gamma \), and we implemented the parametric bootstrap procedure from Sect. 5. To approximate the unknown parameters, we calculated the maximum likelihood estimates of \(\kappa \) and \(\theta \), as proposed in Sect. 10.3.1 of Mardia and Jupp (2000). As far as we know, testing composite hypotheses for spherical or hyperspherical distributions with estimated parameters has not been considered before in the literature. As alternative models we chose the same distributions as described in Sect. 6.1.
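The bootstrap procedure just described can be summarized by the following generic skeleton. This is an illustrative Python sketch; `fit`, `sample`, and `stat` are hypothetical stand-ins for the maximum likelihood estimator, a sampler from the hypothesized family, and the statistic (22), respectively.

```python
import numpy as np

def bootstrap_pvalue(X, fit, sample, stat, m=200, B=500, seed=0):
    """Parametric-bootstrap p-value for a composite hypothesis: estimate the
    parameters, compute the statistic against a Monte Carlo sample from the
    fitted null model, then recompute it on B resamples drawn from that model
    (re-estimating the parameters on each resample)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    t_obs = stat(X, sample(fit(X), m, rng))
    t_boot = np.empty(B)
    for b in range(B):
        Xb = sample(fit(X), n, rng)                # resample under fitted null
        t_boot[b] = stat(Xb, sample(fit(Xb), m, rng))
    return (1 + np.sum(t_boot >= t_obs)) / (B + 1)
```

The hypothesis is rejected when the returned p-value falls below the chosen significance level.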

In view of the extensive computation time due to the parametric bootstrap procedure, we considered the simulation setting \(n=50\), \(m=200\), a sample size of 500 in the bootstrap algorithm, and 5000 Monte Carlo replications. Throughout the study, we fixed the significance level to 0.05. The results are reported in Table 4. Notably, the novel test maintains the nominal significance level very closely, and its power with respect to bimodal alternatives increases the more these two modes are pronounced.

6.3 Testing the fit to the angular central Gaussian distribution

In this subsection, we consider testing the fit to an angular central Gaussian model, i.e., we test the hypothesis

$$\begin{aligned} H_0: f(\cdot )=|\Sigma |^{-1/2}(\cdot ^\top \Sigma ^{-1} \cdot )^{-d/2},\quad \text{ for } \text{ some } \Sigma , \end{aligned}$$
(24)

where \(\Sigma \) is a symmetric positive definite (\(d \times d\))-parameter matrix, which is identifiable up to multiplication by a positive scalar. For information regarding this model, see Mardia and Jupp (2000, Sect. 9.4.4), and for a numerical procedure to approximate the maximum likelihood estimator of the unknown parameter matrix \(\Sigma \), see Tyler (1987). To the best of our knowledge, testing the fit to the angular central Gaussian family has not been considered in the literature.

The simulation parameters match the ones of Sect. 6.2. In complete analogy with previous simulations, we considered \(T_{\gamma }^{\{2\}}\) in Eq. (22), for different values of the tuning parameter \(\gamma \), and we implemented the parametric bootstrap procedure from Sect. 5. To simulate different models under the null hypothesis, we generated a realization of a random matrix

$$\begin{aligned} A=\left( \begin{array}{rrr} -0.846 &amp; -0.531 &amp; -0.779 \\ 0.609 &amp; -0.096 &amp; 0.761 \\ 0.851 &amp; 0.133 &amp; -0.666 \end{array}\right) \end{aligned}$$

and computed the parameter matrices \(\Sigma _\ell =(A^\ell )^\top A^\ell \), \(\ell \in \{1,2,3,4\}\), where the power matrix \(A^\ell =(a_{ij}^\ell )_{i,j=1,2,3}\) is defined componentwise, as well as \(\Sigma _0 = \text {diag}(1,2,3)\). The corresponding models are denoted by ACG\(_\ell \), \(\ell =0,\ldots ,4\). Results are presented in Table 5. In this case, the bootstrap testing procedure controls the type I error, while performing well for most of the alternatives considered.
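For concreteness, the matrices \(\Sigma _\ell \) can be reproduced as follows (illustrative Python, where `**` is the stated componentwise power):

```python
import numpy as np

A = np.array([[-0.846, -0.531, -0.779],
              [ 0.609, -0.096,  0.761],
              [ 0.851,  0.133, -0.666]])

# Sigma_l = (A^l)^T A^l with the componentwise power A^l = (a_ij^l)
Sigma = {l: (A ** l).T @ (A ** l) for l in (1, 2, 3, 4)}
Sigma[0] = np.diag([1.0, 2.0, 3.0])
```

Since each \(\Sigma _\ell \) has the form \(B^\top B\), it is automatically symmetric and positive semidefinite, as required by the ACG model.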

Table 3 Empirical rejection rates for testing uniformity for the test \(T^{\{\xi \}}_{\gamma }\) for \(\xi =1,1.5\) as well as \(SR^{\{\xi \}}\) (\(n=50\), \(m=500\), \(\alpha =0.05\), 10,000 replications)
Table 4 Empirical rejection rates for testing the fit to a von Mises–Fisher distribution (\(n=50\), \(m=200\), \(\alpha =0.05\), 500 bootstrap sample size, 5000 replications)
Table 5 Empirical rejection rates for testing the fit to an angular central Gaussian distribution (\(n=50\), \(m=200\), \(\alpha =0.05\), 500 bootstrap sample size, 5000 replications)

7 Real data

We revisit the paleomagnetic data in Scealy and Wood (2019), an example of spherical data. Paleomagnetic data consist of observations on the direction of magnetism in rocks, sediments, or archeological specimens. These data are measured at various geological points in time and spatial locations. The directions are usually measured as declination and inclination angles based on strike and dip coordinates; see Scealy and Wood (2019) and the references therein for more information. The data considered are taken from the GEOMAGIA50.v3 database, see Brown et al. (2015). For simplicity, we analyze the data provided in the supplementary material of Scealy and Wood (2019). The full data set consists of \(n=1137\) entries (variables are age, dec, inc, lat, and lon) collected at a single spatial location, the Eifel maar lakes (EIF) in Germany, together with relocated nearby data; for details see Scealy and Wood (2019). The analyzed directions are given by the variables declination D (dec, defined on \([0^\circ ,360^\circ ]\)) and inclination I (inc, defined on \([-90^\circ ,90^\circ ]\)). They are converted to Cartesian coordinates by \(x_1=\sin (I)\), \(x_2=\cos (I)\cos (D)\), and \(x_3=\cos (I)\sin (D)\), ensuring \(x=(x_1,x_2,x_3)\in \mathcal {S}^2\). For a plot of the data, see Fig. 2 (left).
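The angle-to-Cartesian conversion just described can be sketched as follows (illustrative Python; the function name is ours):

```python
import numpy as np

def di_to_cartesian(dec_deg, inc_deg):
    """Map declination D and inclination I (in degrees) to the unit vector
    x = (sin I, cos I cos D, cos I sin D) on the sphere S^2."""
    D = np.deg2rad(np.atleast_1d(np.asarray(dec_deg, float)))
    I = np.deg2rad(np.atleast_1d(np.asarray(inc_deg, float)))
    return np.column_stack([np.sin(I), np.cos(I) * np.cos(D), np.cos(I) * np.sin(D)])
```

Since \(\sin ^2(I)+\cos ^2(I)(\cos ^2(D)+\sin ^2(D))=1\), the resulting vectors lie on the unit sphere by construction.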

We test the composite hypothesis (23) of a von Mises–Fisher distribution by \(T_{n,m,\gamma }^{\{2\}}\) of Eq. (22), for different values of the tuning parameter \(\gamma \), and we fix \(m=500\) with a bootstrap sample size \(b=1000\). The bootstrap p-values are reported in Table 6. With the exception of the tuning parameter \(\gamma =0.5\), the p-values indicate that the hypothesis of an underlying von Mises–Fisher distribution cannot be rejected at any of the usual significance levels. For their analysis, the authors in Scealy and Wood (2019) consider rocks of age 1250 and hence extract a subset of the data of sample size \(n=50\); for a plot see Fig. 2 (right). They propose a new spherical model, namely a distribution of Kent type, obtained by applying a transformation to the von Mises–Fisher density. The results of our test of fit to the von Mises–Fisher law for the subset are displayed in the second row of Table 6. For each choice of the tuning parameter, the tests reject the null hypothesis at all usual significance levels, indicating a poor fit of the von Mises–Fisher family for this subset of the data.

As a second parametric family of distributions, we consider the Kent distribution, defined by the density

$$\begin{aligned} f(x)=\frac{1}{c(A,\kappa )} \exp (\kappa \, x^\top \theta +x^\top Ax),\quad x\in \mathcal {S}^{d-1}. \end{aligned}$$
(25)

Here, \(\kappa >0\) is a concentration parameter and \(\theta \in \mathcal {S}^{d-1}\) is the mean direction. Moreover, A is a symmetric (\(d\times d\))-matrix with \(\text {tr}(A)=0\) and \(A\theta =0\) that depends on an ‘ovality’ parameter \(\beta \), see Mardia and Jupp (2000). Hence we test the hypothesis that the data stem from a density of type (25), where the parameters \(\kappa ,\theta \), and \(\beta \) are unknown. These parameters have been estimated by maximum likelihood, and the same bootstrap parameters are applied as above. The bootstrap p-values are reported in Table 6. Interestingly, for the full data set the hypothesis is rejected by each of the tests at every usual significance level. We thus conclude that the Kent distribution is not a suitable model for the full data set. However, we obtain a different impression for the subset of the data with age fixed to 1250. For this data set, none of the tests can reject the hypothesis of an underlying Kent distribution.

Fig. 2 Full data set (\(n=1137\)) of archeomagnetic directions (left) and subsample (\(n=50\)) consisting of directions of the same age 1250 (right)

Table 6 MLE estimates (\(\widehat{\kappa }_n\), \(\widehat{\theta }_n\)) of the parameters of the von Mises–Fisher distribution and bootstrap p-values of the test \(T_{n,m,\gamma }^{\{2\}}\) applied to the paleomagnetic data set

The results obtained in this section confirm the statements about the data sets made in Scealy and Wood (2019).

8 Discussion

We have studied goodness-of-fit tests for spherical data. Our tests apply to both simple hypotheses with all parameters assumed known and to composite hypotheses, with parameters estimated from the data at hand. Limit theory is developed under the null hypothesis as well as under alternatives, while the asymptotic validity of a resampling version of the tests is established. The new procedures perform well in finite samples, and they are competitive against other methods, whenever such methods are available. An application illustrates the usefulness of the new tests for data-modelling on the sphere. In closing we note that, despite the fact that our study focuses on spherical distributions, the method suggested herein also applies to higher dimensions, provided that one can draw samples from the distribution under test and, in the case of composite hypotheses, one is able to estimate the distributional parameters. This remark notwithstanding, we acknowledge that computational problems become progressively more involved with increasing dimension, and thus the feasibility of our method should be assessed on a case-by-case basis.