Abstract
This paper investigates the asymptotic power of a method for testing the hypothesis that two distributions are equal; the method can be regarded as a generalization of the Wilcoxon–Mann–Whitney test. We consider the class of distributions for which the expectation of the square of a certain auxiliary function is finite. For the case where the alternative distribution differs from the null distribution only by a shift, the asymptotic distribution of the test statistic and the asymptotic power of the test are found explicitly. Until now, the power of this test has been studied only by statistical simulation.
Notes
The examples were constructed by Anna Belkova, Master of the Faculty of Mathematics and Mechanics, St. Petersburg State University.
REFERENCES
G. Zech and B. Aslan, “New test for the multivariate two-sample problem based on the concept of minimum energy,” J. Stat. Comput. Simul. 75, 109–119 (2005).
V. Melas and D. Salnikov, “On asymptotic power of the new test for equality of two distributions,” in Recent Developments in Stochastic Methods and Applications, Ed. by A. N. Shiryaev, K. E. Samouylov, and D. V. Kozyrev (Springer-Verlag, Cham, 2021), in Ser.: Springer Proceedings in Mathematics and Statistics, Vol. 371, pp. 204–214.
E. L. Lehmann, Testing Statistical Hypotheses (Wiley, New York, 1959; Nauka, Moscow, 1979).
H. Buening, “Kolmogorov–Smirnov and Cramer–von Mises type two-sample tests with various weight functions,” Commun. Stat. - Simul. Comput. 30, 847–865 (2001).
T. W. Anderson and D. A. Darling, “A test of goodness-of-fit,” J. Am. Stat. Assoc. 49, 765–769 (1954).
W. Hoeffding, “A class of statistics with asymptotically normal distribution,” Ann. Math. Stat. 19, 293–325 (1948).
Funding
The work was carried out with financial support of the Russian Foundation for Basic Research (grant no. 20-01-00096-a).
Additional information
Translated by L. Kartvelishvili
ANNEX
Proof of Lemma 3.1. We introduce the notation
The proof follows from the well-known formula (see, e.g., [6, p. 296])
and the obvious identity
by direct, but nontrivial calculations.
In fact, we use the standard notation
\(S_{y}^{2}\) and \(S_{z}^{2}\) are defined analogously. We denote
Using formulas (10), we obtain
From (10) and (11) we get
Hence,
and we get
Assume that hypothesis \(H_{0}\) holds. By the classical central limit theorem, the distribution of \(\sqrt {{{\Phi }_{{nn}}}} \) converges as \(n \to \infty\) to the normal distribution with zero mean and variance \(J_{1}\). The last assertion of the lemma is verified by direct computation. Thus, Lemma 3.1 is proven. It follows from this lemma that, in this case, the test \({{\Phi }_{{nn}}}\) is equivalent to the test based on \((\bar {x} - \bar {y})^{2}\).
\(\square \)
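The equivalence with \((\bar {x} - \bar {y})^{2}\) can be checked numerically. The sketch below is only an illustration: the definition of the statistic is not reproduced in this annex, so the function `energy_type_statistic`, its sign convention, and its normalization (cross-sample mean of g minus within-sample sums over pairs i < j divided by the squared sample size) are assumptions in the spirit of the energy test of Zech and Aslan. Under these assumptions and with g(x) = x², the statistic coincides exactly with the squared difference of the sample means.

```python
import random


def g_square(d):
    """Auxiliary function g*(x) = x^2."""
    return d * d


def energy_type_statistic(x, y, g):
    """Hypothetical energy-type two-sample statistic (normalization assumed).

    Cross-sample average of g(x_i - y_j) minus the within-sample sums
    over pairs i < j, each divided by the squared sample size.
    """
    n, m = len(x), len(y)
    cross = sum(g(xi - yj) for xi in x for yj in y) / (n * m)
    within_x = sum(g(x[i] - x[j]) for i in range(n) for j in range(i + 1, n)) / n**2
    within_y = sum(g(y[i] - y[j]) for i in range(m) for j in range(i + 1, m)) / m**2
    return cross - within_x - within_y


rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(40)]
y = [rng.gauss(0.3, 1.0) for _ in range(40)]

stat = energy_type_statistic(x, y, g_square)
mean_diff_sq = (sum(x) / len(x) - sum(y) / len(y)) ** 2

# The two quantities agree up to floating-point rounding.
print(abs(stat - mean_diff_sq) < 1e-9)
```

The identity holds because the within-sample term for g(x) = x² equals the mean of the squares minus the squared mean, which cancels the second-moment terms of the cross-sample average.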
Proof of Lemma 3.2. We introduce the notation
By definition, the function \(U_{n}(u, g)\) is a U-statistic (see [6]). Recall that we set m = n and
Consequently,
where \(Z = (Z_{1}, \ldots, Z_{2n}) = (X_{1}, \ldots, X_{n}, Y_{1}, \ldots, Y_{n})\). We apply the limit theorem (see [6], Theorem 7.1) to each of the expressions \(\Phi_{A}(X, g)\), \(\Phi_{B}(Y, g)\), and \(\Phi_{AB}(X, Y, g)\). A direct calculation shows that the nonsingularity condition holds if condition (4) is met. First, suppose that condition (4) is met for \(g(x) = g^{*}(x) = x^{2}\).
According to the limit theorem [6], \(n\Phi_{A}(X, g)\) and \(n\Phi_{A}(X, g^{*})\) are asymptotically normal. Since a normal distribution is completely determined by its location and scale parameters, we have the equality
where \(a\) and \(\tilde {c}\) are constants and \({{\tilde {\eta }}_{n}}\) is a random variable converging in distribution to zero. Since \(X_{i} - X_{j}\) and \(Y_{i} - Y_{j}\), \(1 \leqslant i < j \leqslant n\), have the same distribution, we get
with the same constants a and \(\tilde {c}\) as in equality (17).
For the same reason and taking into account equality (15), for ΦAB we obtain the formula
where the constant a is the same as in (17); however, \(\bar {c} \ne \tilde {c}\) and \({{\bar {\eta }}_{n}} \ne {{\tilde {\eta }}_{n}}\). We obtain
where \(\eta_{n}\) converges in probability to zero, \(a\) and \(c\) are constants, \(a\) is the same as in formula (17), and \(g^{*}(x) = x^{2}\). By Lemma 3.1, \(\frac{1}{{{{J}_{1}}}}n{{T}_{n}}(X, Y, g^{*})\) converges in distribution to \(L^{2}\). Thus, the limit distribution of \(n{{T}_{n}}(X, Y, g)\) has the form \(a^{2}L^{2} + c\).
We now consider the case where condition (4) does not hold for \(g(x) = g^{*}(x)\).
Let K be an arbitrary positive number and
where \({{\tilde {X}}_{i}} = X_{i}\) if \(\left| {{{X}_{i}}} \right| \leqslant K\), \({{\tilde {X}}_{i}} = K\) if \(X_{i} > K\), and \({{\tilde {X}}_{i}} = -K\) if \(X_{i} < -K\). The variables \({{\tilde {Y}}_{i}}\) are defined similarly. Now condition (4) holds both for the given function \(g(x)\) and for \(g(x) = x^{2}\) (since the truncated variables have finite variance).
Consider the quantity
By the above arguments, the limit distribution of this quantity has the form R(L, a, c). As K → ∞, the limit distribution exists (by [6], Theorem 7.1) and has the same form. Thus, Lemma 3.2 is proven.
\(\square \)
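The truncation step in the second part of the proof of Lemma 3.2 can be sketched as follows. The helper name `truncate` and the Cauchy example are illustrative assumptions, not taken from the paper; the point is only that clipping each observation to [−K, K] bounds its square by K², so the truncated variables have a finite second moment even when the original distribution does not.

```python
import math
import random


def truncate(values, k):
    """Clip each value to the interval [-k, k] (hypothetical helper).

    Values with |v| <= k are unchanged; larger values become k and
    smaller values become -k, as in the definition of the modified variables.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return [max(-k, min(k, v)) for v in values]


rng = random.Random(7)
# Standard Cauchy draws via the inverse CDF: a distribution with infinite variance.
sample = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(10_000)]

K = 5.0
clipped = truncate(sample, K)

# The sample second moment of the truncated values can never exceed K^2.
second_moment = sum(v * v for v in clipped) / len(clipped)
print(second_moment <= K * K)
```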
4. CONCLUSIONS
In this paper, we obtain the asymptotic distribution of the test under consideration and derive a formula for its asymptotic power. Simulation shows that the theoretical power values given by this formula do not differ statistically significantly from the empirical powers. The results can be used to determine an appropriate sample size, i.e., to design an experiment for testing hypotheses. The formulas are also useful for further study of the test, for example, for the optimal choice of the auxiliary function.
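As an illustration of how a power formula supports sample-size planning, the following sketch treats only the special case g(x) = x², where the test reduces to one based on \((\bar {x} - \bar {y})^{2}\). It is not the general formula of the paper: it assumes normal samples of equal size n with known standard deviation `sigma`, and the function names `theoretical_power`, `empirical_power`, and `smallest_n` are hypothetical. Under these assumptions the statistic n(x̄ − ȳ)²/(2σ²) is the square of a unit-variance normal variable, which gives the power in closed form.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def theoretical_power(n, shift, sigma=1.0, alpha=0.05):
    """Power of the test rejecting for large n * (xbar - ybar)^2 / (2 sigma^2)."""
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    mu = shift * math.sqrt(n / 2) / sigma  # mean of the standardized difference
    return STD_NORMAL.cdf(-z + mu) + STD_NORMAL.cdf(-z - mu)


def empirical_power(n, shift, sigma=1.0, alpha=0.05, reps=4000, seed=0):
    """Monte Carlo estimate of the same power from simulated normal samples."""
    rng = random.Random(seed)
    critical = STD_NORMAL.inv_cdf(1 - alpha / 2) ** 2
    rejections = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(0.0, sigma) for _ in range(n)) / n
        ybar = sum(rng.gauss(shift, sigma) for _ in range(n)) / n
        stat = n * (xbar - ybar) ** 2 / (2 * sigma**2)
        rejections += stat > critical
    return rejections / reps


def smallest_n(shift, target=0.8, sigma=1.0, alpha=0.05):
    """Smallest per-sample size whose theoretical power reaches the target."""
    n = 2
    while theoretical_power(n, shift, sigma, alpha) < target:
        n += 1
    return n


print(theoretical_power(30, 0.5), empirical_power(30, 0.5))
print(smallest_n(0.5))
```

Comparing the two printed power values mirrors, in this simplified setting, the comparison of theoretical and empirical powers described above; `smallest_n` shows how such a formula can be inverted to choose a sample size.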
About this article
Cite this article
Melas, V.B. On the Asymptotic Power of a Method for Testing Hypotheses on the Equality of Distributions. Vestnik St.Petersb. Univ.Math. 56, 182–189 (2023). https://doi.org/10.1134/S1063454123020115
DOI: https://doi.org/10.1134/S1063454123020115