On the Asymptotic Power of a Method for Testing Hypotheses on the Equality of Distributions

Melas, V. B.

doi:10.1134/S1063454123020115


MATHEMATICS
Published: 08 June 2023

Volume 56, pages 182–189, (2023)

Vestnik St. Petersburg University, Mathematics


Abstract

This paper investigates the asymptotic power of a method for testing the hypothesis that two distributions are equal; the method can be regarded as a generalization of the Wilcoxon–Mann–Whitney test. We consider a class of distributions for which the expectation of the square of a certain auxiliary function is finite. For the case where the alternative distribution differs from the null distribution only by a shift, the asymptotic distribution of the test statistic and the asymptotic power of the test are found explicitly. Until now, the power of this test has been studied only by statistical simulation.


Notes

The examples were constructed by Anna Belkova, Master of the Faculty of Mathematics and Mechanics, St. Petersburg State University.

REFERENCES

1. G. Zech and B. Aslan, “New test for the multivariate two-sample problem based on the concept of minimum energy,” J. Stat. Comput. Simul. 75, 109–119 (2005).
2. V. Melas and D. Salnikov, “On asymptotic power of the new test for equality of two distributions,” in Recent Developments in Stochastic Methods and Applications, Ed. by A. N. Shiryaev, K. E. Samouylov, and D. V. Kozyrev (Springer-Verlag, Cham, 2021), in Ser.: Springer Proceedings in Mathematics and Statistics, Vol. 371, pp. 204–214.
3. E. L. Lehmann, Testing Statistical Hypotheses (Wiley, New York, 1959; Nauka, Moscow, 1979).
4. H. Buening, “Kolmogorov–Smirnov and Cramer–von Mises type two-sample tests with various weight functions,” Commun. Stat. - Simul. Comput. 30, 847–865 (2001).
5. T. W. Anderson and D. A. Darling, “A test of goodness-of-fit,” J. Am. Stat. Assoc. 49, 765–769 (1954).
6. W. Hoeffding, “A class of statistics with asymptotically normal distribution,” Ann. Math. Stat. 19, 293–325 (1948).

Funding

The work was carried out with financial support of the Russian Foundation for Basic Research (grant no. 20-01-00096-a).

Author information

Authors and Affiliations

St. Petersburg State University, 199034, St. Petersburg, Russia
V. B. Melas


Corresponding author

Correspondence to V. B. Melas.

Additional information

Translated by L. Kartvelishvili

ANNEX

Proof of Lemma 3.1. We introduce the notation

$$Z = (X,Y) = ({{X}_{1}}, \ldots ,{{X}_{n}},{{Y}_{1}}, \ldots ,{{Y}_{n}}),\quad V(Z) = \frac{1}{2}\sum\limits_{i = 1}^{2n} {\sum\limits_{j = 1}^{2n} {{{{({{Z}_{i}} - {{Z}_{j}})}}^{2}}.} } $$

The proof follows from the well-known formula (see, e.g., [6, p. 296])

$$\frac{1}{{n(n - 1)}}\sum\limits_{1 \leqslant i < j \leqslant n} {{{{({{X}_{i}} - {{X}_{j}})}}^{2}}} = \frac{1}{{(n - 1)}}\sum\limits_{i = 1}^n {{{{({{X}_{i}} - \bar {x})}}^{2}}} $$

(10)

and the obvious identity

$$\sum\limits_{i = 1}^{2n} {\sum\limits_{j = 1}^{2n} {{{{({{Z}_{i}} - {{Z}_{j}})}}^{2}}} } = \sum\limits_{i,j = 1}^n {{{{({{X}_{i}} - {{X}_{j}})}}^{2}}} + \sum\limits_{i,j = 1}^n {{{{({{Y}_{i}} - {{Y}_{j}})}}^{2}}} + 2\sum\limits_{i = 1}^n {\sum\limits_{j = 1}^n {{{{({{X}_{i}} - {{Y}_{j}})}}^{2}}} } $$

(11)

by direct, but nontrivial calculations.

Indeed, we use the standard notation

$$S_{x}^{2} = \frac{1}{{(n - 1)}}\sum\limits_{i = 1}^n {{{{({{X}_{i}} - \bar {x})}}^{2}};} $$

$S_{y}^{2}$ and $S_{z}^{2}$ are defined analogously. We denote

$${{S}_{{xy}}} = \frac{1}{{{{n}^{2}}}}\sum\limits_{i = 1}^n {\sum\limits_{j = 1}^n {{{{({{X}_{i}} - {{Y}_{j}})}}^{2}}.} } $$

Using formula (10), we obtain

$$V(Z) = 2n\left[ {\sum\limits_{i = 1}^n {{{{({{X}_{i}} - (\bar {x} + \bar {y}){\text{/}}2)}}^{2}}} + \sum\limits_{j = 1}^n {{{{({{Y}_{j}} - (\bar {x} + \bar {y}){\text{/}}2)}}^{2}}} } \right] = 2n(n - 1)(S_{x}^{2} + S_{y}^{2}) + {{n}^{2}}{{(\bar {x} - \bar {y})}^{2}}.$$

(12)

From (10) and (11) we get

$${{n}^{2}}{{S}_{{xy}}} = V(Z) - n(n - 1)(S_{x}^{2} + S_{y}^{2}).$$

(13)

Hence,

$${{S}_{{xy}}} = \frac{1}{n}(n - 1)(S_{x}^{2} + S_{y}^{2}) + {{(\bar {x} - \bar {y})}^{2}}$$

and we get

$${{\Phi }_{{nn}}} = {{S}_{{xy}}} - \frac{1}{n}(n - 1)(S_{x}^{2} + S_{y}^{2}) = {{(\bar {x} - \bar {y})}^{2}}.$$

Suppose that hypothesis H₀ holds. By the classical central limit theorem, the distribution of $\sqrt {{{\Phi }_{{nn}}}} $ converges as n → ∞ to the normal distribution with zero expectation and variance J₁. The last assertion of the lemma is verified by direct computation. Thus, Lemma 3.1 is proven. It follows from this lemma that, in this case, the test based on Φ_nn is equivalent to the test based on $(\bar {x} - \bar {y})^{2}$.

$\square $
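The identity at the heart of Lemma 3.1 can be checked numerically. The following is a minimal sketch, assuming NumPy and equal sample sizes; the function name `phi_nn_quadratic` and the simulated normal samples are illustrative choices, not part of the paper.

```python
import numpy as np

def phi_nn_quadratic(x, y):
    """Phi_nn for g(t) = t**2 and equal sample sizes n (hypothetical helper)."""
    n = len(x)
    # S_xy: mean of (X_i - Y_j)^2 over all n^2 pairs (i, j)
    s_xy = np.mean((x[:, None] - y[None, :]) ** 2)
    # Unbiased sample variances S_x^2 and S_y^2
    s2_x = np.var(x, ddof=1)
    s2_y = np.var(y, ddof=1)
    return s_xy - (n - 1) / n * (s2_x + s2_y)

rng = np.random.default_rng(0)
x = rng.normal(size=50)            # assumed sample from the first distribution
y = rng.normal(loc=0.3, size=50)   # assumed shifted sample

lhs = phi_nn_quadratic(x, y)
rhs = (x.mean() - y.mean()) ** 2   # squared difference of sample means
print(abs(lhs - rhs))              # zero up to floating-point rounding
```

The two quantities agree for any pair of samples of equal size, which is the algebraic content of the lemma; the distributional statement is not tested here.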

Proof of Lemma 3.2. We introduce the notation

$${{U}_{n}}(u,g) = {{\left( \begin{gathered} n \\ 2 \\ \end{gathered} \right)}^{{ - 1}}}\sum\limits_{1 \leqslant i < j \leqslant n} {g({{u}_{i}} - {{u}_{j}})} ,\quad u = ({{u}_{1}}, \ldots ,{{u}_{n}}).$$

(14)

By definition, U_n(u, g) is a U-statistic (see [6]). Recall that we set m = n and

$${{\Phi }_{{AB}}} = {{\Phi }_{{AB}}}(X,Y,g) = - \frac{1}{{{{n}^{2}}}}\sum\limits_{i,j = 1}^n {g({{X}_{i}} - {{Y}_{j}}),} $$

$${{\Phi }_{A}}(X,g) + {{\Phi }_{B}}(Y,g) = - \frac{1}{{{{n}^{2}}}}\sum\limits_{1 \leqslant i < j \leqslant n} {g({{X}_{i}} - {{X}_{j}})} - \frac{1}{{{{n}^{2}}}}\sum\limits_{1 \leqslant i < j \leqslant n} {g({{Y}_{i}} - {{Y}_{j}}).} $$

Consequently,

$${{\Phi }_{A}}(X,g) = - \frac{1}{2}\frac{{n - 1}}{n}{{U}_{n}}(X,g),\quad - {\kern 1pt} {{\Phi }_{B}}(Y,g) = \frac{1}{2}\frac{{n - 1}}{n}{{U}_{n}}(Y,g),$$

(15)

$${{\Phi }_{{AB}}}(X,Y,g) = \frac{1}{{{{n}^{2}}}}\left( \begin{gathered} 2n \\ 2 \\ \end{gathered} \right){{U}_{{2n}}}(Z,g) - \frac{1}{2}\frac{{n - 1}}{n}{{U}_{n}}(X,g) - \frac{1}{2}\frac{{n - 1}}{n}{{U}_{n}}(Y,g),$$

(16)

where Z = (Z₁, …, Z_2n) = (X₁, …, X_n, Y₁, …, Y_n). We apply the limit theorem (see Theorem 7.1 [6]) to each of the expressions Φ_A(X, g), Φ_B(Y, g), and Φ_AB(X, Y, g). A direct calculation shows that the nonsingularity condition holds if condition (4) is met. First, suppose that condition (4) is met for g(x) = g*(x) = x².
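The counting argument behind the right-hand side of (16) can be illustrated numerically: the pairs of the pooled sample Z split into pairs within X, pairs within Y, and cross pairs (X_i, Y_j), so the right-hand side of (16) coincides with the normalized cross-sample sum $\frac{1}{n^2}\sum_{i,j} g(X_i - Y_j)$. The sketch below assumes NumPy; the helper names and the choice g(t) = ln(1 + t²) are hypothetical and serve only as an example.

```python
import numpy as np
from math import comb

def u_stat(u, g):
    """U_n(u, g): average of g(u_i - u_j) over index pairs i < j."""
    i, j = np.triu_indices(len(u), k=1)
    return g(u[i] - u[j]).mean()

def cross_mean(x, y, g):
    """(1/n^2) * sum over all i, j of g(X_i - Y_j)."""
    n = len(x)
    return g(x[:, None] - y[None, :]).sum() / n**2

def cross_via_u(x, y, g):
    """Right-hand side of (16), built from three U-statistics."""
    n = len(x)
    z = np.concatenate([x, y])  # pooled sample Z = (X, Y)
    return (comb(2 * n, 2) / n**2 * u_stat(z, g)
            - 0.5 * (n - 1) / n * u_stat(x, g)
            - 0.5 * (n - 1) / n * u_stat(y, g))

g = lambda t: np.log1p(t**2)    # assumed example of an auxiliary function
rng = np.random.default_rng(1)
x = rng.normal(size=20)
y = rng.normal(loc=0.5, size=20)
print(cross_mean(x, y, g), cross_via_u(x, y, g))  # the two values coincide
```

Because the X block precedes the Y block in Z, every cross pair appears in the pooled sum as g(X_i − Y_j), so the check does not rely on g being even.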

According to the limit theorem in [6], nΦ_A(X, g) and nΦ_A(X, g*) are asymptotically normal. Since a normal distribution is completely determined by its location and scale parameters, we have the equality

$$\frac{1}{n}\sum\limits_{1 \leqslant i < j \leqslant n} {g({{X}_{i}} - {{X}_{j}})} = {{a}^{2}}\frac{1}{n}\sum\limits_{1 \leqslant i < j \leqslant n} {g{\kern 1pt} ^*{\kern 1pt} ({{X}_{i}} - {{X}_{j}})} + \tilde {c} + {{\tilde {\eta }}_{n}},$$

(17)

where a and $\tilde {c}$ are constants and ${{\tilde {\eta }}_{n}}$ is a random variable converging in distribution to zero. Since X_i – X_j and Y_i – Y_j, 1 ≤ i < j ≤ n, have the same distribution, we get

$$\frac{1}{n}\sum\limits_{1 \leqslant i < j \leqslant n} {g({{Y}_{i}} - {{Y}_{j}})} = {{a}^{2}}\frac{1}{n}\sum\limits_{1 \leqslant i < j \leqslant n} {g{\kern 1pt} ^*{\kern 1pt} ({{Y}_{i}} - {{Y}_{j}})} + \tilde {c} + {{\tilde {\eta }}_{n}}$$

(18)

with the same constants a and $\tilde {c}$ as in equality (17).

For the same reason and taking into account equality (15), for Φ_AB we obtain the formula

$$\frac{1}{{{{n}^{2}}}}\sum\limits_{i,j = 1}^n {g({{X}_{i}} - {{Y}_{j}})} = {{a}^{2}}\frac{1}{{{{n}^{2}}}}\sum\limits_{i,j = 1}^n {g{\kern 1pt} ^*{\kern 1pt} ({{X}_{i}} - {{Y}_{j}})} + {{\bar {\eta }}_{n}} + \bar {c},$$

where the constant a is the same as in (17); however, $\bar {c} \ne \tilde {c}$ and ${{\bar {\eta }}_{n}} \ne {{\tilde {\eta }}_{n}}$. We obtain

$$n{{T}_{n}}(X,Y,g) = {{a}^{2}}n{{T}_{n}}(X,Y,g^*) + c + {{\eta }_{n}},$$

where η_n converges in probability to zero, a and c are constants, a is the same as in formula (17), and g*(x) = x². By Lemma 3.1, $\frac{1}{{{{J}_{1}}}}n{{T}_{n}}(X,Y,g^*)$ converges in distribution to L². Thus, the limit distribution of nT_n(X, Y, g) has the form a²L² + c.

We now consider the case where condition (4) does not hold for g(x) = g*(x).

Let K be an arbitrary positive number and let

$$\tilde {X} = ({{\tilde {X}}_{1}}, \ldots ,{{\tilde {X}}_{n}}),\quad \tilde {Y} = ({{\tilde {Y}}_{1}}, \ldots ,{{\tilde {Y}}_{n}}),$$

where ${{\tilde {X}}_{i}}$ = X_i if $\left| {{{X}_{i}}} \right| \leqslant K$; otherwise ${{\tilde {X}}_{i}}$ = K if X_i > 0 and ${{\tilde {X}}_{i}}$ = –K if X_i < 0. The variables ${{\tilde {Y}}_{i}}$ are defined similarly. Now condition (4) holds both for the given function g(x) and for g(x) = x² (due to the finite variance of the modified variables).

Consider the quantity
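The truncation used in this step of the proof is ordinary clipping to the interval [–K, K]. A minimal sketch, assuming NumPy; the function name `truncate` and the heavy-tailed example sample are illustrative assumptions.

```python
import numpy as np

def truncate(sample, K):
    """Return the truncated sample: values outside [-K, K] are set to -K or K."""
    return np.clip(sample, -K, K)

rng = np.random.default_rng(2)
x = rng.standard_cauchy(size=1000)   # assumed heavy-tailed sample, infinite variance
x_tilde = truncate(x, K=10.0)

# The truncated variables are bounded, so their variance is finite.
print(np.abs(x_tilde).max() <= 10.0)
```

Values with absolute value at most K are left unchanged, so the truncated sample approaches the original one as K grows.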

$$n\left\{ {\frac{1}{{{{n}^{2}}}}\sum\limits_{i,j = 1}^n {g({{{\tilde {X}}}_{i}} - {{{\tilde {Y}}}_{j}})} - \frac{1}{{{{n}^{2}}}}\sum\limits_{i < j} {g({{{\tilde {X}}}_{i}} - {{{\tilde {X}}}_{j}})} - \frac{1}{{{{n}^{2}}}}\sum\limits_{i < j} {g({{{\tilde {Y}}}_{i}} - {{{\tilde {Y}}}_{j}})} } \right\}.$$

By the above arguments, the limit distribution of this quantity has the form R(L, a, c). As K → ∞, the limit distribution exists (by Theorem 7.1 of [6]) and has the same form. Thus, Lemma 3.2 is proven.

$\square $

4. CONCLUSIONS

In this paper, we obtain the asymptotic distribution of the test under consideration and derive a formula for its asymptotic power. Statistical simulation shows that the theoretical power values given by this formula do not differ statistically significantly from the empirical powers obtained by simulation. The results can be used to determine a rational sample size, i.e., to design an experiment for hypothesis testing. The formulas are also useful for further study of the test, for example, for the optimal choice of the auxiliary function.
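An empirical power of the kind referred to above can be estimated by simulation. The following is a rough sketch only, under stated assumptions: normal samples, a shift alternative, the hypothetical auxiliary function g(t) = ln(1 + t²), and a critical value taken as a simulated quantile of the statistic under H₀. None of these settings, nor the function names, are taken from the paper; the statistic follows the form of T_n used in the Annex.

```python
import numpy as np

def t_stat(x, y, g):
    """T_n: cross-sample sum minus within-sample sums over i < j, divided by n^2."""
    n = len(x)
    i, j = np.triu_indices(n, k=1)
    cross = g(x[:, None] - y[None, :]).sum()
    within = g(x[i] - x[j]).sum() + g(y[i] - y[j]).sum()
    return (cross - within) / n**2

def empirical_power(shift, n=30, reps=400, alpha=0.05, seed=0):
    """Monte Carlo estimate of the rejection rate for a given shift (assumed setup)."""
    rng = np.random.default_rng(seed)
    g = lambda t: np.log1p(t**2)   # assumed auxiliary function
    # Simulate the statistic under H0 to obtain a critical value.
    null = np.array([t_stat(rng.normal(size=n), rng.normal(size=n), g)
                     for _ in range(reps)])
    crit = np.quantile(null, 1 - alpha)
    # Simulate under the shift alternative and count rejections.
    alt = np.array([t_stat(rng.normal(size=n), rng.normal(loc=shift, size=n), g)
                    for _ in range(reps)])
    return float(np.mean(alt > crit))

print(empirical_power(0.0), empirical_power(1.0))
```

Repeating such a simulation over a grid of sample sizes n gives empirical power curves against which an asymptotic power formula can be compared when choosing a sample size.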

About this article

Cite this article

Melas, V.B. On the Asymptotic Power of a Method for Testing Hypotheses on the Equality of Distributions. Vestnik St.Petersb. Univ.Math. 56, 182–189 (2023). https://doi.org/10.1134/S1063454123020115


Received: 12 March 2022
Revised: 12 March 2022
Accepted: 17 November 2022
Published: 08 June 2023
Issue Date: June 2023
DOI: https://doi.org/10.1134/S1063454123020115
