Application of the empirical Bayes approach to nonparametric testing for high-dimensional data

Abstract. In [5] a simple, data-driven and computationally efficient procedure of nonparametric testing for high-dimensional data has been introduced. The procedure is based on randomization and resampling, a special sequential data partition procedure, and χ²-type test statistics. However, the χ² test has small power when the deviations from the null hypothesis are small or sparse. In this note, test statistics based on the nonparametric maximum likelihood and the empirical Bayes estimators in an auxiliary nonparametric mixture model are proposed.


Introduction
Let X := (X(1), . . . , X(N)) be a sample of size N of iid observations of a random vector X having a distribution P on R^d. We are interested in testing (nonparametric) properties of P when the dimension d of the observations is large.
Thus far, there is no generally accepted methodology for multivariate nonparametric hypothesis testing. Traditional approaches are based on the empirical characteristic function [1], nonparametric density estimators and smoothing [3,4], multivariate nonparametric Monte Carlo tests [12], and classical univariate nonparametric statistics computed for data projected onto directions found via projection pursuit [11,7].
A more advanced technique is based on the Vapnik-Chervonenkis theory, the uniform functional central limit theorem, and inequalities for large deviation probabilities [9,2]. Recently, especially in applications, the Bayes approach and Markov chain Monte Carlo methods have been widely used (see, e.g., [10] and references therein).
In [5] a simple, data-driven and computationally efficient procedure of nonparametric testing for high-dimensional data has been introduced. The procedure is based on randomization and resampling (bootstrap), a special sequential data partition procedure, and χ²-type statistics.
The goal of this note is to propose test statistics more efficient than the χ² statistic, based on the nonparametric maximum likelihood (NML) and the empirical Bayes (EB) estimators in an auxiliary nonparametric mixture model.

Simple testing procedure
Let P₀ and P₁ be two disjoint classes of d-dimensional distributions, P := P₀ ∪ P₁. Consider the nonparametric hypothesis testing problem

H₀: P ∈ P₀ versus H₁: P ∈ P₁. (1)

Suppose that there exists a continuous (in some topology) mapping Ψ: P → P₀ such that P₀ = {P ∈ P: Ψ(P) = P}. One can take, for example, Ψ(P) = argmin_{Q ∈ P₀} ϱ(Q, P), where ϱ is a distance on P.
Let P̂ denote the empirical distribution based on the sample X and define P̂₀ := Ψ(P̂). Under the null hypothesis, the empirical distributions P̂ and P̂₀ should be close for large N, since both are approximations of the same distribution P₀. Thus, any measure of discrepancy between P̂ and P̂₀ can be taken as a test statistic for (1). In [5] the following discrepancy measure T₀ has been constructed.
Generate two independent random samples X_P and X₀ of size N from the distributions P̂ and P̂₀, respectively. Let X* denote the joint sample of X_P and X₀. Further, let S := {S_k, k = 1, . . . , K} be a sequence of partitions of X* with |S_k| = k elements, produced by some binary partition algorithm: initially S₁ := {X*}, and for k = 2, . . . , K the partition S_k is obtained from the previous one, S_{k−1}, by splitting some set of S_{k−1} into two disjoint subsets.
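The binary partition algorithm itself is left unspecified above; as one concrete possibility, the following sketch builds such a nested sequence of partitions by repeatedly splitting the largest cell at the median of its coordinate with the largest range. This splitting rule is our illustrative assumption, not necessarily the one used in [5].

```python
import numpy as np

def binary_partitions(X, K):
    """Return a list parts[0..K-1], where parts[k] is a partition of the
    row indices of X into k+1 cells, and parts[k] refines parts[k-1] by
    one binary split.  Splitting rule (an illustrative choice): take the
    largest cell and split it at the median of its widest coordinate."""
    cells = [np.arange(len(X))]          # S_1 = {X*}
    partitions = [list(cells)]
    for _ in range(2, K + 1):
        i = max(range(len(cells)), key=lambda j: len(cells[j]))
        idx = cells.pop(i)
        # coordinate with the largest range within the chosen cell
        coord = np.argmax(X[idx].max(axis=0) - X[idx].min(axis=0))
        med = np.median(X[idx, coord])
        left = idx[X[idx, coord] <= med]
        right = idx[X[idx, coord] > med]
        cells.extend([left, right])
        partitions.append(list(cells))
    return partitions
```

In the testing procedure, X would be the pooled resample X* and the cell memberships would then be tallied separately for X_P and X₀.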
For a fixed partition S_k and Q ∈ {P, 0}, let Y_Q := (Y_Q(1), . . . , Y_Q(k)). Thus, Y_Q is a k-dimensional vector with jth component equal to the number of elements of X_Q in the set S_k^j (j = 1, . . . , k). Denote

η₀ := (Y_P − Y₀)/√(Y_P + Y₀),

where the operations are performed coordinatewise. When the number of observations Y_P(j) + Y₀(j) in each set S_k^j, j = 1, . . . , k, is large and the null hypothesis H₀ holds, the distribution of the vector η₀ can be approximated by the (k − 1)-dimensional standard normal distribution. Therefore it is natural to take the χ² statistic |η₀|² as the discrepancy measure between P̂ and P̂₀ and to use it as a test statistic for (1). Actually, with the statistic T₀ := |η₀|², the null hypothesis H₀^η: Eη₀ = 0_k is tested instead (here 0_k stands for the null vector in R^k). The approximate covariance matrix of the statistic T₀, however, depends on the alternative H_P. Therefore a variance-stabilizing transformation is applied, yielding a new discrepancy vector η.

Moreover, the χ² test has small power when the dimension n of η is large and either each component of the mean θ := Eη differs only slightly from 0_n or only a few components of θ are nonzero. In the next section we apply the nonparametric maximum likelihood estimator and the nonparametric empirical Bayes method to construct a more powerful criterion for testing H₀^η and hence H₀.
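Given the two count vectors over the k cells, the χ²-type discrepancy can be computed directly; a minimal sketch, assuming the coordinatewise standardization η₀(j) = (Y_P(j) − Y₀(j))/√(Y_P(j) + Y₀(j)) (our reading of the statistic; [5] may use a different normalization, and the variance-stabilizing transformation is omitted here):

```python
import numpy as np

def chi2_discrepancy(Y_P, Y_0):
    """Chi-square-type discrepancy |eta_0|^2 between two count vectors
    over the same k-cell partition, with
    eta_0(j) = (Y_P(j) - Y_0(j)) / sqrt(Y_P(j) + Y_0(j)).
    Every cell must be nonempty in the pooled sample."""
    Y_P = np.asarray(Y_P, dtype=float)
    Y_0 = np.asarray(Y_0, dtype=float)
    eta0 = (Y_P - Y_0) / np.sqrt(Y_P + Y_0)
    return float(np.sum(eta0 ** 2))
```

For example, identical count vectors give a discrepancy of 0, and the statistic grows as the two resamples disagree across cells.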

Auxiliary testing problem and empirical Bayes
Let us consider an auxiliary testing problem where η ∼ Normal_n(θ, I_n) and θ ∈ R^n is an unknown mean vector, and the hypothesis to be tested is

H₀^η: θ = 0_n. (5)

In the (empirical) Bayes approach, the unknown parameter θ is treated as random. Thus, we consider a nonparametric Gaussian mixture model with a mixing distribution G:

η = θ + z, θ and z are independent, θ_i ∼ G iid, z ∼ Normal_n(0_n, I_n). (6)

For ν > 0, by μ_ν(y | G) we denote the posterior ν-th moment of θ₁ given η₁ = y,

μ_ν(y | G) := ∫ θ^ν φ(y − θ) dG(θ) / ∫ φ(y − θ) dG(θ).

Here φ denotes the standard normal density. The homogeneity hypothesis (5) states that in fact there is no mixture: G is the distribution degenerate at 0. Since E|η|² = nEθ₁² + n, a criterion for testing the null hypothesis H₀^η can be based on an estimator of the functional μ₂ := Eθ₁². Alternatives to the direct estimator (μ̂₂)_χ² := n⁻¹|η|² − 1 are the nonparametric maximum likelihood estimator (NMLE)

(μ̂₂)_ML := n⁻¹ Σᵢ μ₂(ηᵢ | Ĝ)

and the nonparametric empirical Bayes (NEB) estimator. Here Ĝ = Ĝ_ML is the NMLE of the mixing distribution G. For Gaussian mixtures, it does exist and is strongly consistent (see, e.g., [8]). We also consider the NEB statistic

(μ̂₁²)_EB := n⁻¹ Σᵢ μ₁(ηᵢ | Ĝ)² = n⁻¹ |θ̂|², θ̂ᵢ := μ₁(ηᵢ | Ĝ), (14)

which is a biased toward 0 estimator of μ₂.
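For a discrete mixing distribution G (e.g., a finite mixture over a fixed grid of centers, as in the restricted NMLE used in the simulations), the posterior moments μ_ν(y | G) reduce to finite Gaussian-weighted averages. The following sketch implements this ratio and the statistic n⁻¹ Σᵢ μ₁(ηᵢ | G)²; the discrete-G representation is standard, while the exact form of the NEB statistic follows our reading of the text.

```python
import numpy as np

def posterior_moment(y, atoms, weights, nu=1):
    """mu_nu(y | G) for discrete G = sum_j weights[j] * delta_{atoms[j]}:
    sum_j atoms[j]**nu * w_j * phi(y - atoms[j])
    divided by sum_j w_j * phi(y - atoms[j])."""
    atoms = np.asarray(atoms, dtype=float)
    weights = np.asarray(weights, dtype=float)
    phi = np.exp(-0.5 * (y - atoms) ** 2)   # normal density up to a constant, which cancels
    return np.sum(atoms ** nu * weights * phi) / np.sum(weights * phi)

def neb_statistic(eta, atoms, weights):
    """NEB statistic n^{-1} sum_i mu_1(eta_i | G)^2, an estimator of
    E theta_1^2 that is biased toward 0."""
    return float(np.mean([posterior_moment(y, atoms, weights, 1) ** 2
                          for y in eta]))
```

Under G = δ₀ the posterior mean is identically 0, so the statistic vanishes, consistent with the homogeneity hypothesis (5).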
It has also been shown via simulations that in some cases θ̂ significantly outperforms other known counterparts, including the James-Stein estimator [8]. Since θ̂ is location invariant, this suggests that the criterion for testing (5) based on the statistic (μ̂₁²)_EB might be more powerful, especially for close alternatives.
The asymptotic properties of (μ̂₁²)_EB can be derived from those of |θ̂ − θ|². In this paper we are interested in the finite-sample properties of (μ̂₁²)_EB and present simulation results for some natural alternatives.

Simulation experiment
The following three alternatives for the distribution of θᵢ are considered. For various combinations of the parameters a, n and m, simulations with 1000 replications have been performed. The parameter a > 0 represents the difficulty of the testing problem. The simulations show some improvement in the power of the NEB test in comparison with the χ² test. Figs. 1-3 illustrate typical results: power plots of the test statistic (μ̂₁²)_EB and of the χ² test versus a are given for n = 50 and m = 8. The NMLE Ĝ_ML is computed using the EM algorithm for a finite Gaussian mixture with prespecified, fixed centers of the mixture components (see, e.g., [6]); the number of mixture components is set to 15. This means that actually the restricted NMLE is substituted for Ĝ_ML.
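The restricted NMLE over a fixed grid of component centers can be computed by the standard EM updates for the mixing weights; a minimal sketch, in which the grid, the uniform initialization, and the iteration count are illustrative choices, not necessarily those used in the paper:

```python
import numpy as np

def restricted_nmle(eta, centers, n_iter=300):
    """EM for the mixing weights of a Gaussian (unit variance) mixture
    with fixed component centers: maximizes
    sum_i log sum_j w_j * phi(eta_i - centers[j])
    over the probability simplex.  Returns the weight vector w."""
    eta = np.asarray(eta, dtype=float)
    centers = np.asarray(centers, dtype=float)
    w = np.full(len(centers), 1.0 / len(centers))     # uniform start
    # phi(eta_i - c_j) up to a constant factor, which cancels in EM
    lik = np.exp(-0.5 * (eta[:, None] - centers[None, :]) ** 2)
    for _ in range(n_iter):
        resp = lik * w                                # E-step: responsibilities
        resp /= resp.sum(axis=1, keepdims=True)
        w = resp.mean(axis=0)                         # M-step: new weights
    return w
```

As the concluding remarks note, this computation is iterative, so the resulting Ĝ_ML (and hence the NEB statistic) can depend on the grid and the number of iterations.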

Concluding remarks
The initial nonparametric testing problem (1) for high-dimensional data is reduced to the auxiliary testing problem (5) by the method proposed in [5]. In the empirical Bayes setting, the null hypothesis H₀^η can be restated as G = δ₀, where G is the prior distribution of the unknown parameters θᵢ, i = 1, . . . , n, and δ₀ is the distribution degenerate at 0. Thus, any discrepancy measure between δ₀ and the NMLE Ĝ_ML of G can be used for testing (5), in particular, the χ² test or a nonparametric likelihood ratio criterion. In this note, the finite-sample properties of the test based on the NEB statistic (μ̂₁²)_EB (see (14)) are investigated by means of simulations.
Preliminary simulation results show some improvement of the NEB test over the χ² test. Since the computation of the NMLE Ĝ_ML is an iterative and time-consuming procedure, the results may depend on the computation method and on the number of iterations.