Publicly Available Published by De Gruyter July 19, 2018

A New Class of Robust Two-Sample Wald-Type Tests

  • Abhik Ghosh, Nirian Martin, Ayanendranath Basu and Leandro Pardo

Abstract

Parametric hypothesis testing associated with two independent samples arises frequently in several applications in biology, medical sciences, epidemiology, reliability and many more. In this paper, we propose robust Wald-type tests for testing such two sample problems using the minimum density power divergence estimators of the underlying parameters. In particular, we consider the simple two-sample hypothesis concerning the full parametric homogeneity as well as the general two-sample (composite) hypotheses involving some nuisance parameters. The asymptotic and theoretical robustness properties of the proposed Wald-type tests have been developed for both the simple and general composite hypotheses. Some particular cases of testing against one-sided alternatives are discussed with specific attention to testing the effectiveness of a treatment in clinical trials. Performances of the proposed tests have also been illustrated numerically through appropriate real data examples.

1 Introduction

Testing of parametric hypotheses is an important paradigm of statistical inference. In many real-life applications in medical sciences, biology, epidemiology, sociology, reliability, etc., we need to compare data from two independent samples through appropriate two-sample tests of hypotheses. Examples include, but are not limited to, comparing the means of a biomarker or the success of a treatment between control and treatment groups, and comparing the lifetimes of two populations in reliability.

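Mathematically, let $(\mathcal{X}, \beta_{\mathcal{X}}, \{P_\theta\}_{\theta\in\Theta})$ be the statistical space associated with the random variable $X$, where $\beta_{\mathcal{X}}$ is the σ-field of Borel subsets $A\subset\mathcal{X}$ and $\{P_\theta\}_{\theta\in\Theta}$ is a family of probability distributions defined on the measurable space $(\mathcal{X}, \beta_{\mathcal{X}})$, where $\Theta$ is an open subset of $\mathbb{R}^p$ with $p\geq 1$. The probability measures $P_\theta$ are assumed to possess densities $f_\theta(x)=\frac{dP_\theta}{d\mu}(x)$, where $\mu$ is a σ-finite measure on $(\mathcal{X}, \beta_{\mathcal{X}})$. We shall denote by $\mathcal{F}=\{f_\theta:\theta\in\Theta\subset\mathbb{R}^p\}$ a set of parametric model densities.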

On the basis of two independent random samples $X_1,\ldots,X_n$ and $Y_1,\ldots,Y_m$ of sizes $n$ and $m$, respectively, from two densities $f_{\theta_1}(x)$ and $f_{\theta_2}(x)$ belonging to $\mathcal{F}$, we can solve the problem of complete homogeneity by testing

$$H_0:\theta_1=\theta_2 \ \text{ versus } \ H_1:\theta_1\neq\theta_2. \tag{1}$$

The classical test statistics for testing eq. (1) are the likelihood ratio test, Wald test and Rao test, where the unknown parameters are estimated by the maximum likelihood estimators (MLEs). Some alternative test statistics based on divergence measures have also been presented in the literature; see, for instance, Basu et al. [1] and Pardo [2]. It is well known that the MLE is a BAN estimator, i.e., asymptotically efficient, but at the same time it suffers from a serious lack of robustness against data contamination and model misspecification. To avoid this robustness problem, appropriate testing procedures based on suitable robust estimators have been developed in the statistical literature. For example, Basu et al. [3] have introduced a family of test statistics for testing eq. (1) based on the density power divergence (DPD) measure between $f_{\theta_1}$ and $f_{\theta_2}$ when the parameters are estimated by the minimum density power divergence estimator (MDPDE) of Basu et al. [4]; see Section 1.1 for more details about the MDPDE.

Note that, once the problem considered in eq. (1) has been solved, we will be able to apply the solution to the particular (and most common) case of normal populations with $f_{\theta_1}\equiv N(\mu_1,\sigma_1)$ and $f_{\theta_2}\equiv N(\mu_2,\sigma_2)$ to test the following problem of complete homogeneity

$$H_0:(\mu_1,\sigma_1)=(\mu_2,\sigma_2) \ \text{ versus } \ H_1:(\mu_1,\sigma_1)\neq(\mu_2,\sigma_2).$$

But there are other interesting problems of testing for partial homogeneity, for instance, to test

$$H_0:\mu_1=\mu_2 \ \text{ versus } \ H_1:\mu_1\neq\mu_2$$

when the variances are the same but unknown; this is a particular case of the general composite hypotheses involving two samples. This particular problem with normal populations has been considered in [5] on the basis of a family of test statistics based on the DPD measure and by estimating the unknown parameters using the MDPDE. The results presented in their paper have been excellent in relation to the robustness and efficiency trade-off; for some suggested members of their proposed test family the loss in efficiency based on the size and the power under pure data was not really significant but the improvement in terms of robustness under contaminated data was highly significant. Although their approach can theoretically be extended beyond the simple case of normal populations, from a practical point of view, it is often not very easy to compute the density power divergence measure between $f_{\theta_1}$ and $f_{\theta_2}$. In this paper we present a new family of test statistics which are easy to calculate based only on the MDPDEs, for any general two-sample problem (involving nuisance parameters as well) and with any parametric distribution. These test statistics are Wald-type test statistics and their usefulness has been illustrated in the literature on one-sample testing problems by Basu et al. [6] and Ghosh et al. [7]. In the present paper, we not only present the asymptotic distribution of the proposed Wald-type test statistics for the two-sample problems but also provide a theoretical study of their robustness properties along with suitable examples and numerical illustrations.

The rest of the paper is organized as follows: In Section 1.1 we present some important background results and definitions in relation to the MDPDE that will be necessary for the rest of the paper. Section 2 is devoted to developing the family of Wald-type tests for solving the problem of complete homogeneity given by eq. (1). We study its asymptotic distribution as well as the theoretical robustness properties with examples in the same section. In Section 3, we present a family of Wald-type tests for the more general composite hypotheses in the two sample context. We again derive their asymptotic distributions and robustness properties. Illustrations are provided for the special case of testing partial homogeneity in presence of nuisance parameters like, for example, testing equality of two normal means with unknown (nuisance) variances. In Section 4, we briefly describe the extensions for testing the two-sample hypotheses against one-sided alternatives. Section 5 presents several real life applications of our proposal with interesting data from applied sciences like medical science, biology, reliability etc. Appropriate simulation studies with some comments on the choice of the robustness tuning parameters are presented in Section 6. The paper ends with a short concluding remark in Section 7. For brevity in presentation, the proofs of all the results have been moved to Appendix 8.

1.1 The minimum density power divergence estimator: Asymptotic properties and robustness

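Given any two densities $f_{\theta_1}$ and $f_{\theta_2}$ from $\mathcal{F}$, the density power divergence with a nonnegative tuning parameter β is defined as [4]

$$d_\beta(f_{\theta_1},f_{\theta_2})=\begin{cases}\displaystyle\int\left\{f_{\theta_2}^{1+\beta}(x)-\left(1+\frac{1}{\beta}\right)f_{\theta_2}^{\beta}(x)f_{\theta_1}(x)+\frac{1}{\beta}f_{\theta_1}^{1+\beta}(x)\right\}dx, & \text{for } \beta>0,\\[2ex] \displaystyle\int f_{\theta_1}(x)\ln\frac{f_{\theta_1}(x)}{f_{\theta_2}(x)}\,dx, & \text{for } \beta=0.\end{cases}\tag{2}$$

The divergence corresponding to $\beta=0$ may be derived from the general case by taking the continuous limit as $\beta\to0^+$, and the resulting $d_0(f_{\theta_1},f_{\theta_2})$ turns out to be the Kullback-Leibler divergence.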
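As a numerical illustration of eq. (2), the following minimal sketch approximates the DPD between two normal densities by the trapezoidal rule and checks that a small β nearly reproduces the Kullback-Leibler value. The function names, the integration range and the grid size are assumptions made for this illustration only; they are not part of the paper.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def dpd(f1, f2, beta, lo=-30.0, hi=30.0, steps=20000):
    """Trapezoidal approximation of d_beta(f1, f2) as defined in eq. (2)."""
    def integrand(x):
        a, b = f1(x), f2(x)
        if beta > 0:
            return (b ** (1 + beta)
                    - (1 + 1 / beta) * b ** beta * a
                    + a ** (1 + beta) / beta)
        # beta = 0: Kullback-Leibler integrand, reading 0 * log(0) as 0
        return a * math.log(a / b) if a > 0 and b > 0 else 0.0
    h = (hi - lo) / steps
    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, steps):
        total += integrand(lo + i * h)
    return total * h

f1 = lambda x: normal_pdf(x, 0.0, 1.0)
f2 = lambda x: normal_pdf(x, 1.0, 1.0)

kl = dpd(f1, f2, beta=0.0)        # equal variances: (mu1 - mu2)^2 / (2 sigma^2)
near_kl = dpd(f1, f2, beta=1e-4)  # a small beta should be close to the KL value
print(kl, near_kl)
```

The quadrature is deliberately naive; for heavy-tailed or multivariate models a dedicated integration routine would be a more sensible choice.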

Let $G$ represent the distribution function corresponding to the underlying true density $g$ that generates the data, which we want to model by the parametric model density $f_\theta\in\mathcal{F}$. The corresponding minimum DPD functional at $G$ with tuning parameter β, denoted by $U_\beta(G)$, is defined through the relation $d_\beta(g,f_{U_\beta(G)})=\min_{\theta\in\Theta}d_\beta(g,f_\theta)$. Therefore the MDPDE of θ with tuning parameter β is given by $\hat\theta_\beta=U_\beta(G_n)$, where $G_n$ is the empirical distribution function associated with the observed random sample $X_1,\ldots,X_n$ from the population having density $g$. As the last term of eq. (2) does not depend on θ, $\hat\theta_\beta$ is indeed given by

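$$\hat\theta_\beta=\arg\min_{\theta\in\Theta}\left\{\int f_\theta^{1+\beta}(x)\,dx-\left(1+\frac{1}{\beta}\right)\frac{1}{n}\sum_{i=1}^{n}f_\theta^{\beta}(X_i)\right\},\quad \text{if } \beta>0,\tag{3}$$

$$\text{and}\quad \hat\theta_\beta=\arg\min_{\theta\in\Theta}\left\{-\frac{1}{n}\sum_{i=1}^{n}\ln f_\theta(X_i)\right\},\quad \text{if } \beta=0.\tag{4}$$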

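Notice that $\hat\theta_\beta$ for $\beta=0$ coincides with the maximum likelihood estimator (MLE). Denoting

$$V_\theta(x)=\int f_\theta^{1+\beta}(y)\,dy-\left(1+\frac{1}{\beta}\right)f_\theta^{\beta}(x),$$

the expression in eq. (3) can be written as $\hat\theta_\beta=\arg\min_{\theta\in\Theta}\frac{1}{n}\sum_{i=1}^{n}V_\theta(X_i)$. This shows that the MDPDE is an M-estimator.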
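To make the M-estimator form concrete, here is a minimal sketch for the normal location model with known σ. In that model the integral term of eq. (3) does not depend on θ, so the estimating equation reduces to a weighted-mean condition that can be iterated as a fixed point. The function name, the median starting value and the toy data are assumptions of this illustration, not the authors' implementation.

```python
import math
import statistics

def mdpde_normal_mean(xs, beta, sigma=1.0, tol=1e-12, max_iter=500):
    """MDPDE of a normal mean with known sigma via a fixed-point iteration.

    The estimating equation is sum_i w_i (x_i - theta) = 0 with weights
    w_i = exp(-beta (x_i - theta)^2 / (2 sigma^2)).
    """
    theta = statistics.median(xs)  # robust start (an assumption of this sketch)
    for _ in range(max_iter):
        w = [math.exp(-beta * (x - theta) ** 2 / (2 * sigma ** 2)) for x in xs]
        new_theta = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta

data = [-1.0, -0.5, 0.0, 0.5, 1.0, 50.0]    # five clean points, one gross outlier
mle = mdpde_normal_mean(data, beta=0.0)      # beta = 0: all weights are 1 (sample mean)
robust = mdpde_normal_mean(data, beta=0.5)   # the outlier gets a negligible weight
print(mle, robust)
```

With β = 0 the weights are all equal and the iteration returns the sample mean; with β > 0 observations far from the current estimate are downweighted exponentially, which is the mechanism behind the robustness discussed later in this section.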

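The functional $U_\beta(G)$ is Fisher consistent; it takes the value $\theta_0$, the true value of the parameter, when the true density is a member of the model with $g=f_{\theta_0}$. Let us assume $g=f_{\theta_0}$ and define the quantities

$$J_\beta(\theta)=\int u_\theta(x)u_\theta^T(x)f_\theta^{1+\beta}(x)\,dx,\qquad K_\beta(\theta)=\int u_\theta(x)u_\theta^T(x)f_\theta^{1+2\beta}(x)\,dx-\xi_\beta(\theta)\xi_\beta^T(\theta),\tag{5}$$

where $\xi_\beta(\theta)=\int u_\theta(x)f_\theta^{1+\beta}(x)\,dx$ and $u_\theta(x)=\frac{\partial}{\partial\theta}\ln f_\theta(x)$. Then, following [1, 4], it can be shown that, under Assumptions (D1)–(D5) of Basu et al. [1, p. 304], referred to as the "Basu et al. conditions" in the rest of the paper,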

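$$n^{1/2}(\hat\theta_\beta-\theta_0)\xrightarrow[n\to\infty]{\mathcal{L}}N\left(0_p,\Sigma_\beta(\theta_0)\right),\tag{6}$$

where $\Sigma_\beta(\theta)=J_\beta^{-1}(\theta)K_\beta(\theta)J_\beta^{-1}(\theta)$. It is a simple exercise to see that, for $\beta=0$, $J_{\beta=0}(\theta)=K_{\beta=0}(\theta)=I_F(\theta)$, the Fisher information matrix associated with the model under consideration. Therefore we obtain the classical well-known result,

$$n^{1/2}(\hat\theta_{\beta=0}-\theta_0)\xrightarrow[n\to\infty]{\mathcal{L}}N\left(0_p,I_F^{-1}(\theta_0)\right).$$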

Next, the influence function (IF) can be used to study the robustness of the MDPDE. Note that, if the influence function is bounded, the corresponding estimator or test statistic is said to have local robustness against infinitesimal contamination. More simply, the influence function $IF(x,U_\beta,F_{\theta_0})$ is the first derivative of an estimator or statistic viewed as a functional $U_\beta$, which describes the normalized influence on the estimate or statistic of an infinitesimal contamination at a distant point $x$ in the sample space. In [4] it was established that the influence function of the minimum DPD functional is given by

$$IF(x,U_\beta,F_{\theta_0})=\lim_{\varepsilon\to0}\frac{U_\beta(F_\varepsilon)-U_\beta(F_{\theta_0})}{\varepsilon}=J_\beta^{-1}(\theta_0)\left\{u_{\theta_0}(x)f_{\theta_0}^{\beta}(x)-\xi_\beta(\theta_0)\right\},\tag{7}$$

where $F_\varepsilon=(1-\varepsilon)F_{\theta_0}+\varepsilon\wedge_x$ is the ε-contaminated distribution of $F_{\theta_0}$, the distribution function corresponding to $f_{\theta_0}$, with respect to the point mass distribution $\wedge_x$ at $x$. If we assume that $J_\beta(\theta_0)$ and $\xi_\beta(\theta_0)$ are finite, the IF is a bounded function of $x$ whenever $u_{\theta_0}(x)f_{\theta_0}^{\beta}(x)$ is bounded. This is the case for most common parametric models at $\beta>0$, implying the robustness of MDPDEs with $\beta>0$.
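The boundedness claim can be checked directly in the normal location model with known σ. There the score is (x − θ)/σ², the vector ξ vanishes by symmetry, and evaluating J in eq. (5) and substituting into eq. (7) gives the closed form used in the sketch below. The function names are hypothetical, and the closed form is a short calculation made for this illustration rather than a formula quoted from the paper.

```python
import math

def mdpde_influence_normal_mean(x, beta, theta0=0.0, sigma=1.0):
    """IF of the minimum DPD functional for N(theta, sigma^2), sigma known.

    Closed form (1 + beta)^{3/2} (x - theta0) exp(-beta (x - theta0)^2 / (2 sigma^2)),
    obtained from eq. (5) and eq. (7) for this particular model.
    """
    d = x - theta0
    return (1 + beta) ** 1.5 * d * math.exp(-beta * d * d / (2 * sigma ** 2))

def sup_abs_influence(beta, sigma=1.0):
    """Supremum of |IF| over x for beta > 0, attained at |x - theta0| = sigma / sqrt(beta)."""
    return (1 + beta) ** 1.5 * sigma / math.sqrt(beta * math.e)

for b in (0.0, 0.1, 0.5):
    # At beta = 0 the IF is linear in x (unbounded); for beta > 0 it redescends to zero.
    print(b, [round(mdpde_influence_normal_mean(x, b), 4) for x in (1.0, 5.0, 20.0)])
```

At β = 0 the function returns x − θ0, the unbounded influence function of the sample mean, while any β > 0 yields a redescending curve with a finite supremum.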

2 A simple two-sample problem

Let X1,...,Xn and Y1,...,Ym be two samples of sizes n and m respectively from two populations having densities belonging to F with parameters θ1 and θ2. The most common problem under this setup is to test the complete homogeneity of the two populations given by eq. (1). But some component of the parameters can also be just nuisance in many applications, for example, as in eq. (2).

In general notation, let us assume that

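$$\theta_1=(\theta_{1,1},\ldots,\theta_{1,r},\theta_{1,r+1},\ldots,\theta_{1,p})^T=(\theta_1^T,{}_0\theta_1^T)^T\quad\text{and}\quad\theta_2=(\theta_{2,1},\ldots,\theta_{2,r},\theta_{2,r+1},\ldots,\theta_{2,p})^T=(\theta_2^T,{}_0\theta_2^T)^T$$

with ${}_0\theta_1$ and ${}_0\theta_2$ known $(p-r)$-vectors; on the right-hand sides and in what follows, $\theta_1$ and $\theta_2$ denote the $r$-dimensional sub-vectors of unknown components. Based on $X_1,\ldots,X_n$ we can get the MLE, $\hat\theta_1$, of $\theta_1$ and based on $Y_1,\ldots,Y_m$ the MLE, $\hat\theta_2$, of $\theta_2$. Assuming $\theta_1=\theta_2$, we can obtain an estimator, ${}_{o}\hat\theta_1$, of the common value $\theta_1$ by using the two random samples $X_1,\ldots,X_n$ and $Y_1,\ldots,Y_m$ together. It is well known that, under $\theta_1=\theta_2$,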

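$$\sqrt{\frac{mn}{m+n}}\left(\hat\theta_1-\hat\theta_2\right)\xrightarrow[m,n\to\infty]{\mathcal{L}}N\left(0_r,\ \omega I_F^{-1}(\theta_1,{}_0\theta_1)+(1-\omega)I_F^{-1}(\theta_1,{}_0\theta_2)\right)\tag{8}$$

with

$$\omega=\lim_{m,n\to\infty}\frac{m}{m+n}.$$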

Based on eq. (8), the classical Wald test statistic for testing

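$$H_0:\theta_1=\theta_2 \ \text{ versus } \ H_1:\theta_1\neq\theta_2,\tag{9}$$

is given by

$$W_{m,n}=\frac{mn}{m+n}(\hat\theta_1-\hat\theta_2)^T\left[\frac{m\,I_F^{-1}(\hat\theta_1,{}_0\theta_1)}{m+n}+\frac{n\,I_F^{-1}(\hat\theta_2,{}_0\theta_2)}{m+n}\right]^{-1}(\hat\theta_1-\hat\theta_2)=mn\,(\hat\theta_1-\hat\theta_2)^T\left[m\,I_F^{-1}(\hat\theta_1,{}_0\theta_1)+n\,I_F^{-1}(\hat\theta_2,{}_0\theta_2)\right]^{-1}(\hat\theta_1-\hat\theta_2).$$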

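We can observe that, when $r=p$, we have $I_F^{-1}(\theta_1,{}_0\theta_1)=I_F^{-1}(\theta_1,{}_0\theta_2)=I_F^{-1}(\theta_0)$, with $\theta_1=\theta_2=\theta_0$, and the Wald test statistic becomes

$$W_{m,n}=\frac{mn}{m+n}(\hat\theta_1-\hat\theta_2)^T I_F({}^{(0)}\hat\theta)(\hat\theta_1-\hat\theta_2),\tag{10}$$

where ${}^{(0)}\hat\theta$ denotes the MLE of $\theta_0$ based on the pooled sample.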

As an example, in the case of two normal populations with known variances $\sigma_1^2$ and $\sigma_2^2$, we can test $H_0:\mu_1=\mu_2$ by the Wald test statistic

$$W_{m,n}=\frac{mn(\hat\mu_1-\hat\mu_2)^2}{m\sigma_1^2+n\sigma_2^2}=\frac{(\hat\mu_1-\hat\mu_2)^2}{\frac{\sigma_1^2}{n}+\frac{\sigma_2^2}{m}}.$$

Although the classical Wald test has several nice optimality properties, it is highly non-robust in the presence of outliers, even in only one of the samples. Here, we will generalize this classical Wald test to make it robust by replacing the non-robust MLEs by the corresponding robust MDPDEs. In the following we will present the results for $r=p$, i.e., for testing the hypothesis in eq. (1). The case $r<p$ can be studied in a similar way.

Let ${}^{(1)}\hat\theta_\beta$ and ${}^{(2)}\hat\theta_\beta$ denote the MDPDEs of $\theta_1$ and $\theta_2$ respectively, obtained by minimizing the DPD with tuning parameter β for each of the two samples separately. Further, under the null hypothesis $H_0:\theta_1=\theta_2=\theta_0$ in eq. (1), we can consider the two samples pooled together as one i.i.d. sample of size $m+n$ from a population having density function $f_{\theta_0}$; let ${}^{(0)}\hat\theta_\beta$ denote the corresponding MDPDE of $\theta_0$ with tuning parameter β based on the pooled sample. Note that all three estimators ${}^{(1)}\hat\theta_\beta$, ${}^{(2)}\hat\theta_\beta$ and ${}^{(0)}\hat\theta_\beta$ should coincide with $\theta_0$ asymptotically under $H_0$ with probability tending to one. Assuming identifiability of the model family, the difference between the two estimators ${}^{(1)}\hat\theta_\beta$ and ${}^{(2)}\hat\theta_\beta$ gives us an idea of the distinction between the two samples and hence indicates any departure from the null hypothesis. So, we define a generalized Wald-type test statistic by

$$T_{m,n}^{(\beta)}=\frac{nm}{n+m}\left({}^{(1)}\hat\theta_\beta-{}^{(2)}\hat\theta_\beta\right)^T\Sigma_\beta\left({}^{(0)}\hat\theta_\beta\right)^{-1}\left({}^{(1)}\hat\theta_\beta-{}^{(2)}\hat\theta_\beta\right).\tag{11}$$

Note that, at $\beta=0$, all the MDPDEs used coincide with the corresponding MLEs and hence the generalized Wald-type test statistic $T_{m,n}^{(\beta)}$ coincides with the classical Wald test statistic $W_{m,n}$ given in eq. (10).

2.1 Asymptotic properties

In order to perform any statistical test, we first need to derive the asymptotic distribution of the test statistic under $H_0$. Using the asymptotic properties of the MDPDEs presented in Section 1.1, we can easily obtain the asymptotic null distribution of the proposed test statistic $T_{m,n}^{(\beta)}$, which is presented in the following theorem. Throughout the rest of the paper, we will assume Conditions (A)–(D) of Lehmann [8, p. 429] about the assumed model family, which we will refer to as the "Lehmann conditions". We also consider the following assumption.

Assumption (A):

  1. $\frac{m}{m+n}\to\omega\in(0,1)$ as $m,n\to\infty$.

  2. The asymptotic variance-covariance matrix Σβ(θ) of the MDPDE with tuning parameter β is continuous in θ.

Theorem 2.1

Suppose the model density satisfies the Lehmann and Basu et al. conditions, and Assumption (A) holds. Then the asymptotic distribution of $T_{m,n}^{(\beta)}$ under the null hypothesis in eq. (9) is $\chi_p^2$, the chi-square distribution with $p$ degrees of freedom.

The asymptotic null distribution of the test in [3] is a linear combination of chi-square distributions and hence it is somewhat difficult to obtain the critical values of their test in practice. In contrast, our proposed tests have a simple chi-square limit under the null hypothesis and hence are much easier to perform. Our proposal provides, in this sense, an advantageous procedure for testing.

However, when the null hypothesis is not correct, i.e., $\theta_1\neq\theta_2$, the pooled estimator ${}^{(0)}\hat\theta_\beta$ no longer converges to $\theta_1$ or $\theta_2$; rather it converges in probability to a new value $\theta_3$, say, which is a function of $\theta_1$, $\theta_2$ and ω. For example, if the estimators are additive in the sample data, e.g. the sample mean, then $\theta_3=(1-\omega)\theta_1+\omega\theta_2$. Define $l_{\theta_3,\beta}(\theta_1,\theta_2)=(\theta_1-\theta_2)^T\Sigma_\beta(\theta_3)^{-1}(\theta_1-\theta_2)$. Then we have the following result.

Theorem 2.2

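Suppose the model density satisfies the Lehmann and Basu et al. conditions, and Assumption (A) holds. Then, as $m,n\to\infty$, we have for any $\theta_1\neq\theta_2$

$$\sqrt{\frac{mn}{m+n}}\left[l_{{}^{(0)}\hat\theta_\beta,\beta}\left({}^{(1)}\hat\theta_\beta,{}^{(2)}\hat\theta_\beta\right)-l_{\theta_3,\beta}(\theta_1,\theta_2)\right]\xrightarrow[m,n\to\infty]{\mathcal{L}}N\left(0,4\sigma^2_{\theta_3,\beta}(\theta_1,\theta_2)\right),$$

where $\sigma^2_{\theta_3,\beta}(\theta_1,\theta_2)=(\theta_1-\theta_2)^T\Sigma_\beta(\theta_3)^{-1}\left[\omega\Sigma_\beta(\theta_1)+(1-\omega)\Sigma_\beta(\theta_2)\right]\Sigma_\beta(\theta_3)^{-1}(\theta_1-\theta_2)$.

This theorem leads to an approximation to the power function $\pi^{(\beta)}_{m,n,\alpha}(\theta_1,\theta_2)=P\left(T_{m,n}^{(\beta)}>\chi^2_{p,\alpha}\right)$ of the proposed Wald-type tests for testing eq. (9) at the significance level α, where $\chi^2_{p,\alpha}$ denotes the $(1-\alpha)$-th quantile of the $\chi^2_p$ distribution.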

Corollary 2.3

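Under the assumptions of Theorem 2.2, we have

$$\pi^{(\beta)}_{m,n,\alpha}(\theta_1,\theta_2)=1-\Phi_n\left(\frac{\sqrt{\frac{n+m}{nm}}}{2\sigma_{\theta_3,\beta}(\theta_1,\theta_2)}\left(\chi^2_{p,\alpha}-\frac{nm}{n+m}\,l_{\theta_3,\beta}(\theta_1,\theta_2)\right)\right),\qquad\theta_1\neq\theta_2,$$

for a sequence of distributions $\Phi_n(\cdot)$ tending uniformly to the standard normal distribution $\Phi(\cdot)$.

The corollary also helps us to determine the sample size required for our proposed test to achieve any pre-specified power level. Further, we have $\pi^{(\beta)}_{m,n,\alpha}(\theta_1,\theta_2)\to1$ for any $\theta_1\neq\theta_2$ as $m,n\to\infty$. Hence the proposed test with rejection rule $T_{m,n}^{(\beta)}>\chi^2_{p,\alpha}$ is consistent.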
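The sample-size use of this power approximation can be sketched for the normal-mean problem with known common variance (treated as Example 2.1 below), where the asymptotic variance of the MDPDE does not depend on θ. The sketch replaces the sequence of distributions in the corollary by its standard normal limit; the function names and the brute-force search are assumptions of this illustration.

```python
import math
from statistics import NormalDist

def sigma_beta(beta, sigma=1.0):
    """Asymptotic variance of the MDPDE of a normal mean with known sigma (Example 2.1)."""
    return (1 + beta ** 2 / (1 + 2 * beta)) ** 1.5 * sigma ** 2

def approx_power(theta1, theta2, n, m, beta, alpha=0.05, sigma=1.0):
    """Approximate power against a fixed alternative theta1 != theta2 (p = 1)."""
    if theta1 == theta2:
        raise ValueError("the approximation is stated for theta1 != theta2 only")
    nd = NormalDist()
    crit = nd.inv_cdf(1 - alpha / 2) ** 2        # upper-alpha point of chi-square(1)
    var = sigma_beta(beta, sigma)                # free of theta in this model
    l = (theta1 - theta2) ** 2 / var             # l_{theta3,beta}(theta1, theta2)
    s = abs(theta1 - theta2) / math.sqrt(var)    # sigma_{theta3,beta}(theta1, theta2)
    scale = n * m / (n + m)
    arg = (crit - scale * l) / (2 * s * math.sqrt(scale))
    return 1 - nd.cdf(arg)                       # Phi_n replaced by its limit Phi

def min_equal_sample_size(theta1, theta2, beta, target=0.8, alpha=0.05, sigma=1.0):
    """Smallest n = m whose approximate power reaches the target (simple linear search)."""
    n = 2
    while approx_power(theta1, theta2, n, n, beta, alpha, sigma) < target:
        n += 1
    return n

for b in (0.0, 0.3, 0.5):
    print(b, min_equal_sample_size(0.0, 0.5, beta=b))
```

Because the approximation rests on a first-order expansion around a fixed alternative, the resulting sample sizes should be read as rough guidance rather than exact requirements.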

Corollary 2.4

Under the assumptions of Theorem 2.2, the proposed Wald-type test is consistent in Fraser's sense.

Next, we look at the performance of the proposed test under contiguous alternatives. In the two-sample problem, we can have different types of contiguous alternatives. For example, we can assume $\theta_2$ to be fixed and $\theta_1$ converging to $\theta_2$, so that $H'_{1,n}:\theta_1=\theta_{1,n}=\theta_2+n^{-1/2}\Delta_1$ for some $p$-vector $\Delta_1$ of non-zero reals such that $\theta_2+n^{-1/2}\Delta_1\in\Theta$. Conversely, we can take $\theta_1$ to be fixed and $H''_{1,m}:\theta_2=\theta_{2,m}=\theta_1+m^{-1/2}\Delta_2$ for some $\Delta_2\in\mathbb{R}^p-\{0\}$ with $\theta_1+m^{-1/2}\Delta_2\in\Theta$. Here, we consider a general form of the contiguous alternative given by

$$H_{1,n,m}:\theta_1=\theta_{1,n}=\theta_0+n^{-1/2}\Delta_1,\quad\theta_2=\theta_{2,m}=\theta_0+m^{-1/2}\Delta_2,\quad(\Delta_1,\Delta_2)\in\mathbb{R}^p\times\mathbb{R}^p-\{(0_p,0_p)\},\tag{12}$$

for some fixed $\theta_0\in\Theta$. Note that putting $\Delta_2=0$ in eq. (12) gives back $H'_{1,n}$ from $H_{1,n,m}$, whereas $\Delta_1=0$ yields $H''_{1,m}$. The following theorem gives the asymptotic distribution of the proposed test statistic $T_{m,n}^{(\beta)}$ under these general contiguous alternatives $H_{1,n,m}$.

Theorem 2.5

Suppose the model density satisfies the Lehmann and Basu et al. conditions and Assumption (A) holds. Then the asymptotic distribution of $T_{m,n}^{(\beta)}$ under the contiguous alternative $H_{1,n,m}$ given by eq. (12) is $\chi_p^2(\delta_\beta)$, the non-central chi-square distribution with $p$ degrees of freedom and non-centrality parameter $\delta_\beta=W(\Delta_1,\Delta_2)^T\Sigma_\beta(\theta_0)^{-1}W(\Delta_1,\Delta_2)$ with $W(\Delta_1,\Delta_2)=\sqrt{\omega}\,\Delta_1-\sqrt{1-\omega}\,\Delta_2$.

We can easily obtain the asymptotic power $\pi_\beta(\Delta_1,\Delta_2)$ under the contiguous alternatives $H_{1,n,m}$ from the above theorem. In particular, denoting the distribution function of a random variable $Z$ by $F_Z$, we have

$$\pi_\beta(\Delta_1,\Delta_2)=1-F_{\chi_p^2(\delta_\beta)}\left(\chi^2_{p,\alpha}\right).\tag{13}$$

Example 2.1 (Testing equality of two Normal means with known equal variances)

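We first present the simplest possible case of testing two normal means with known equal variance $\sigma^2$. Here the model family is $\mathcal{F}=\{N(\theta,\sigma^2):\theta\in\mathbb{R}\}$ with σ being known. In this case, the asymptotic variance $\Sigma_\beta(\theta)$ of the MDPDE with tuning parameter β is given by $\Sigma_\beta(\theta)=\left(1+\frac{\beta^2}{1+2\beta}\right)^{3/2}\sigma^2$. Hence, our generalized Wald-type test statistic has a much simpler form in this case, given by

$$T_{m,n}^{(\beta)}=\frac{mn}{m+n}\left(1+\frac{\beta^2}{1+2\beta}\right)^{-3/2}\frac{\left({}^{(1)}\hat\theta_\beta-{}^{(2)}\hat\theta_\beta\right)^2}{\sigma^2},$$

and it has a $\chi_1^2$ asymptotic distribution under $H_0$. Note that, at $\beta=0$, this test statistic coincides with the classical Wald test statistic $W_{m,n}=\frac{mn}{m+n}\frac{\left({}^{(1)}\hat\theta_0-{}^{(2)}\hat\theta_0\right)^2}{\sigma^2}=\frac{mn}{m+n}\frac{(\bar X-\bar Y)^2}{\sigma^2}$, where $\bar X$ and $\bar Y$ are the sample means of $X_1,\ldots,X_n$ and $Y_1,\ldots,Y_m$ respectively.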
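The statistic of this example is simple enough to compute by hand, and the following sketch does so for two small artificial samples, one of which contains a single outlier. The data are invented for illustration, the function names are hypothetical, and the MDPDE is obtained by a plain fixed-point iteration on the weighted-mean estimating equation (one possible numerical scheme, not necessarily the one used by the authors).

```python
import math
import statistics
from statistics import NormalDist

def mdpde_mean(xs, beta, sigma=1.0, tol=1e-12, max_iter=500):
    """MDPDE of a normal mean (sigma known) by fixed-point iteration from the median."""
    theta = statistics.median(xs)
    for _ in range(max_iter):
        w = [math.exp(-beta * (x - theta) ** 2 / (2 * sigma ** 2)) for x in xs]
        new_theta = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta

def wald_type_statistic(xs, ys, beta, sigma=1.0):
    """Generalized Wald-type statistic for equality of two normal means, sigma known."""
    n, m = len(xs), len(ys)
    diff = mdpde_mean(xs, beta, sigma) - mdpde_mean(ys, beta, sigma)
    var = (1 + beta ** 2 / (1 + 2 * beta)) ** 1.5 * sigma ** 2
    return (m * n / (m + n)) * diff ** 2 / var

# Artificial data: both samples are centred near zero; ys has one gross outlier.
xs = [-1.2, -0.7, -0.3, 0.0, 0.2, 0.4, 0.9, 1.1]
ys = [-1.0, -0.6, -0.1, 0.1, 0.3, 0.5, 0.8, 12.0]

crit = NormalDist().inv_cdf(0.975) ** 2   # chi-square(1) critical value at alpha = 0.05
for b in (0.0, 0.5):
    t = wald_type_statistic(xs, ys, b)
    print(b, round(t, 4), "reject" if t > crit else "do not reject")
```

In this toy data set the classical statistic (β = 0) is driven past the critical value by the single outlier, while the statistic with β = 0.5 stays far below it.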

Clearly, these tests are consistent for any $\beta\geq0$ by Corollary 2.4. Further, the asymptotic power of the proposed test under the contiguous alternatives $H_{1,n,m}$ can be easily obtained as

$$\pi_\beta(\Delta_1,\Delta_2)=1-F_{\chi_1^2(\delta_\beta)}\left(\chi^2_{1,\alpha}\right),$$

with $\delta_\beta=\left(1+\frac{\beta^2}{1+2\beta}\right)^{-3/2}\frac{W(\Delta_1,\Delta_2)^2}{\sigma^2}$. Table 1 presents the values of $\pi_\beta(\Delta_1,\Delta_2)$ over $\beta\in[0,1]$ for different values of $W(\Delta_1,\Delta_2)$. Note that, whenever $W(\Delta_1,\Delta_2)=0$, the alternative coincides with the null and hence we get back the level of the test; as $W(\Delta_1,\Delta_2)$ increases, the power also increases, as expected. Clearly, this asymptotic power decreases as β increases, but the loss is small at small positive values of β. This is quite intuitive, as the classical Wald test at $\beta=0$ is asymptotically most powerful under the pure model. But, as we will see in the next two subsections, we can gain much higher robustness with respect to outliers at the cost of this small loss in asymptotic power.

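Table 1: Asymptotic contiguous power of the proposed Wald-type test at the 5% significance level for testing equality of two normal means as in Example 2.1 with known common $\sigma^2=1$.

| W(Δ1,Δ2) | β = 0 | β = 0.1 | β = 0.3 | β = 0.5 | β = 0.7 | β = 0.9 | β = 1 |
|----------|-------|---------|---------|---------|---------|---------|-------|
| 0        | 0.050 | 0.050   | 0.050   | 0.050   | 0.050   | 0.050   | 0.050 |
| 1        | 0.170 | 0.169   | 0.160   | 0.150   | 0.140   | 0.131   | 0.127 |
| 2        | 0.516 | 0.511   | 0.484   | 0.449   | 0.413   | 0.380   | 0.364 |
| 3        | 0.851 | 0.847   | 0.821   | 0.784   | 0.742   | 0.698   | 0.677 |
| 5        | 0.999 | 0.999   | 0.998   | 0.996   | 0.992   | 0.985   | 0.981 |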
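The entries of Table 1 can be recomputed with a few lines of code. For one degree of freedom, a non-central chi-square variable with non-centrality δ is the square of a normal variable with mean √δ and unit variance, so the power in eq. (13) reduces to two normal tail probabilities. The function name below is an assumption of this sketch.

```python
import math
from statistics import NormalDist

def contiguous_power(w, beta, alpha=0.05, sigma=1.0):
    """Asymptotic contiguous power for p = 1, via chi2_1(delta) = (Z + sqrt(delta))^2."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)            # square root of the chi-square(1) critical value
    var = (1 + beta ** 2 / (1 + 2 * beta)) ** 1.5 * sigma ** 2
    root_delta = abs(w) / math.sqrt(var)     # square root of the non-centrality parameter
    return (1 - nd.cdf(z - root_delta)) + nd.cdf(-z - root_delta)

betas = (0, 0.1, 0.3, 0.5, 0.7, 0.9, 1)
for w in (0, 1, 2, 3, 5):
    # One row of the table per value of W(Delta_1, Delta_2)
    print(w, [round(contiguous_power(w, b), 3) for b in betas])
```

Small differences in the third decimal place relative to the printed table are possible and would only reflect rounding.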

2.2 Influence function of the Wald-type test statistics

The robustness of any two-sample test is relatively complicated compared to the one-sample case because, here, one may have contamination in either of the two samples or even in both samples. Let us first derive Hampel's influence function (IF) of the two-sample Wald-type test statistics to study the robustness of the proposed test. Consider the set-up of the previous subsection and denote $G_1=F_{\theta_1}$ and $G_2=F_{\theta_2}$. Then, ignoring the multiplier $\frac{nm}{n+m}$, we can define the statistical functional corresponding to the proposed Wald-type test statistic $T_{m,n}^{(\beta)}$ as

$$T_\beta(G_1,G_2)=\left(U_\beta(G_1)-U_\beta(G_2)\right)^T\Sigma_\beta^{-1}(\theta_0)\left(U_\beta(G_1)-U_\beta(G_2)\right),$$

where Uβ is the MDPDE functional defined in Section 1.1.

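Now consider the contaminated distributions $G_{1,\varepsilon}=(1-\varepsilon)G_1+\varepsilon\wedge_x$ and $G_{2,\varepsilon}=(1-\varepsilon)G_2+\varepsilon\wedge_y$, where ε is the contamination proportion and $x$, $y$ are the points of contamination in the two samples respectively. Then Hampel's first-order influence function of our test functional, when the contamination is only in the first sample, is given by

$$IF^{(1)}(x;T_\beta,G_1,G_2)=\frac{\partial}{\partial\varepsilon}T_\beta(G_{1,\varepsilon},G_2)\Big|_{\varepsilon=0}=2\left(U_\beta(G_1)-U_\beta(G_2)\right)^T\Sigma_\beta^{-1}(\theta_0)\,IF(x;U_\beta,G_1).$$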

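Similarly, if there is contamination only in the second sample, then the corresponding IF is given by

$$IF^{(2)}(y;T_\beta,G_1,G_2)=\frac{\partial}{\partial\varepsilon}T_\beta(G_1,G_{2,\varepsilon})\Big|_{\varepsilon=0}=-2\left(U_\beta(G_1)-U_\beta(G_2)\right)^T\Sigma_\beta^{-1}(\theta_0)\,IF(y;U_\beta,G_2).$$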

Finally, if we assume that the contamination is in both samples, Hampel's IF turns out to be

$$IF(x,y;T_\beta,G_1,G_2)=\frac{\partial}{\partial\varepsilon}T_\beta(G_{1,\varepsilon},G_{2,\varepsilon})\Big|_{\varepsilon=0}=2\left(U_\beta(G_1)-U_\beta(G_2)\right)^T\Sigma_\beta^{-1}(\theta_0)\,D_\beta(x,y),$$

where $D_\beta(x,y)=IF(x;U_\beta,G_1)-IF(y;U_\beta,G_2)$. Now, in particular, if we assume the null hypothesis to be true with $G_1=G_2=F_{\theta_1}$, then $U_\beta(G_1)=U_\beta(G_2)=\theta_1$. Therefore, all three types of influence function above are zero at the null hypothesis in eq. (9), which would suggest that the Wald-type tests are robust for all $\beta\geq0$. This is clearly not informative about the robustness of the tests, as the non-robust nature of $T_{m,n}^{(0)}$ (which is the classical Wald test statistic $W_{m,n}$) is well known.

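Therefore, we need to consider the second-order influence function for this two-sample problem. When there is contamination only in the first sample, the corresponding second-order IF is given by

$$IF_2^{(1)}(x;T_\beta,G_1,G_2)=\frac{\partial^2}{\partial\varepsilon^2}T_\beta(G_{1,\varepsilon},G_2)\Big|_{\varepsilon=0}=2\left(U_\beta(G_1)-U_\beta(G_2)\right)^T\Sigma_\beta^{-1}(\theta_0)\,IF_2(x;U_\beta,G_1)+2\,IF(x;U_\beta,G_1)^T\Sigma_\beta^{-1}(\theta_0)\,IF(x;U_\beta,G_1).$$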

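For the particular case of the null distribution $\theta_1=\theta_2$, it simplifies to

$$IF_2^{(1)}(x;T_\beta,F_{\theta_1},F_{\theta_1})=2\,IF(x;U_\beta,F_{\theta_1})^T\Sigma_\beta^{-1}(\theta_0)\,IF(x;U_\beta,F_{\theta_1}).$$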

Similarly, if the contamination is in the second sample only, then the second-order IF simplifies to

$$IF_2^{(2)}(y;T_\beta,F_{\theta_1},F_{\theta_1})=2\,IF(y;U_\beta,F_{\theta_1})^T\Sigma_\beta^{-1}(\theta_0)\,IF(y;U_\beta,F_{\theta_1}).$$

Note that these two IFs are bounded with respect to the contamination points $x$ or $y$ if and only if the IF of the corresponding MDPDE is bounded, which is the case for all $\beta>0$ under most common parametric models. Hence, for any $\beta>0$, the proposed test gives robust inference with respect to contamination in any one of the samples. However, at $\beta=0$ the MDPDE becomes the non-robust MLE, which has an unbounded influence function, and so the classical Wald test statistic is highly non-robust as well.

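Finally, for the case of contamination in both samples, the corresponding second-order IF is given by

$$IF_2(x,y;T_\beta,G_1,G_2)=\frac{\partial^2}{\partial\varepsilon^2}T_\beta(G_{1,\varepsilon},G_{2,\varepsilon})\Big|_{\varepsilon=0}=2\left(U_\beta(G_1)-U_\beta(G_2)\right)^T\Sigma_\beta^{-1}(\theta_0)\left[IF_2(x;U_\beta,G_1)-IF_2(y;U_\beta,G_2)\right]+2\,D_\beta(x,y)^T\Sigma_\beta^{-1}(\theta_0)\,D_\beta(x,y).$$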

In particular, at the null hypothesis $\theta_1=\theta_2$, we have

$$IF_2(x,y;T_\beta,F_{\theta_1},F_{\theta_1})=2\,D_\beta(x,y)^T\Sigma_\beta^{-1}(\theta_0)\,D_\beta(x,y).$$

Note that if $x=y$ then $D_\beta(x,y)=0$ and hence this second-order influence function is zero, implying the robustness of the proposed test for any value of the parameter; this is expected intuitively, as the same contamination in both samples cancels out when testing the equivalence of the two samples as in eq. (9). However, if $x\neq y$, then the influence function of our test is bounded if and only if the difference $D_\beta(x,y)$ between the influence functions of the MDPDEs used is bounded. This happens whenever the IF of the MDPDE is bounded, i.e., at $\beta>0$.

Figure 1: Second order influence function of the proposed Wald-type test statistics and corresponding gross error sensitivity $\gamma_{\beta,1}$ under contamination only in the first sample, for testing equality of two normal means as in Example 2.1 with known common $\sigma^2=1$.

Example 2.2 (Continuation of Example 2.1)

Let us again consider the problem of testing two normal means as in Example 2.1. We have seen that the proposed Wald-type tests are consistent for all $\beta\geq0$ but their power against contiguous alternatives decreases slightly as β increases. Now let us verify the claimed robustness of these tests.

Clearly, the first-order IFs of the test statistics will always be zero. For contamination only in the first sample, the second-order IF of the test statistic $T_\beta$ at the null hypothesis in eq. (9) has a simpler form given by

$$IF_2^{(1)}(x;T_\beta,F_{\theta_1},F_{\theta_1})=\frac{2}{\sigma^2}(1+2\beta)^{3/2}(x-\theta_1)^2e^{-\frac{\beta(x-\theta_1)^2}{\sigma^2}}.$$

Figure 1a presents the plot of this second-order IF for different values of $\beta\in[0,1]$. It is evident from the figure that the second-order IF is unbounded at $\beta=0$, implying the non-robustness of the classical Wald test statistic; but it is bounded for all $\beta>0$, implying the robustness of our proposals. Further, Figure 1b presents the plot of the maximum possible influence of infinitesimal contamination on the test statistics, known as the "gross error sensitivity", computed as
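
$$\gamma_{\beta,1}=\sup_x\left|IF_2^{(1)}(x;T_\beta,F_{\theta_1},F_{\theta_1})\right|=\frac{2(1+2\beta)^{3/2}}{e\,\beta},\qquad\beta>0,$$

the supremum of the preceding expression being attained at $(x-\theta_1)^2=\sigma^2/\beta$; at $\beta=0$ the supremum is infinite.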

It clearly shows that the robustness of our proposed test statistics increases as β increases (since γβ,1 decreases). Thus, just like the trade-off between efficiency and robustness of MDPDE, the parameter β again controls the trade-off between asymptotic contiguous power and robustness for the proposed MDPDE based test statistics.
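The behaviour of the gross error sensitivity can be checked numerically. The sketch below evaluates the second-order IF of this example on a grid and compares the grid maximum with the value obtained by maximizing that expression analytically over x (the maximizer satisfies (x − θ1)² = σ²/β). The function names, the grid and the analytic maximization are assumptions of this illustration.

```python
import math

def if2_first_sample(x, beta, theta=0.0, sigma=1.0):
    """Second-order IF of the test functional at the null, contamination in sample 1 only."""
    d2 = (x - theta) ** 2
    return (2 / sigma ** 2) * (1 + 2 * beta) ** 1.5 * d2 * math.exp(-beta * d2 / sigma ** 2)

def gross_error_sensitivity_grid(beta, half_width=20.0, step=0.001):
    """Grid approximation of sup_x |IF_2| around theta = 0 (meaningful for beta > 0)."""
    n = int(round(2 * half_width / step))
    return max(abs(if2_first_sample(-half_width + i * step, beta)) for i in range(n + 1))

def gross_error_sensitivity_closed(beta):
    """Analytic maximum of the expression above for beta > 0 (derived for this sketch)."""
    return 2 * (1 + 2 * beta) ** 1.5 / (math.e * beta)

for b in (0.1, 0.3, 0.5, 0.7, 1.0):
    print(b, round(gross_error_sensitivity_grid(b), 4),
          round(gross_error_sensitivity_closed(b), 4))
```

On the range 0 < β ≤ 1 the computed sensitivity decreases with β, which agrees with the trade-off described in the text.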

Similar inferences can also be drawn for contamination only in the second sample.

Next consider the case when there is contamination in both samples. In this case, the second-order IF is given by

$$IF_2(x,y;T_\beta,F_{\theta_1},F_{\theta_1})=\frac{2}{\sigma^2}(1+2\beta)^{3/2}\left[(x-\theta_1)e^{-\frac{\beta(x-\theta_1)^2}{2\sigma^2}}-(y-\theta_1)e^{-\frac{\beta(y-\theta_1)^2}{2\sigma^2}}\right]^2.$$

Plots of $IF_2(x,y;T_\beta,F_{\theta_1},F_{\theta_1})$ are presented in Figure 2; they clearly show the robust nature of our proposals at $\beta>0$ and the non-robust nature of the classical Wald test (at $\beta=0$) unless $x=y$. By looking at the maximum possible influence in this case, we can again see that, even under contamination in both samples, the robustness of our proposed Wald-type test statistics increases as β increases.

Figure 2: Second order influence function of the proposed Wald-type test statistics under contamination in both the samples for testing equality of two normal means as in Example 2.1 with known common $\sigma^2=1$.

2.3 Power and level influence functions

The robustness of a test statistic, although necessary, may not be sufficient in all cases, since the performance of any test is finally measured through its level and power. In this section, we consider the effect of contamination on the asymptotic power and level of the proposed Wald-type tests. Due to consistency, the asymptotic power against any fixed alternative will be one. So, we again consider the contiguous alternatives $H_{1,n,m}$ given by eq. (12) along with contamination over these alternatives. Following Hampel et al. [9], the effect of contamination should tend to zero as the alternatives tend to the null (i.e., $\theta_{1,n}\to\theta_0$ and $\theta_{2,m}\to\theta_0$ as $m,n\to\infty$) at the same rate, to avoid confusion between the neighborhoods of the two hypotheses (also see [7, 10, 11, 12, 13] for some one-sample applications). Further, in the present two-sample problem, the contamination can be in any one sample or in both samples. When the contamination is only in the first sample, we consider the corresponding contaminated distributions for the first population as

$$F^L_{1,n,\varepsilon,x}=\left(1-\frac{\varepsilon}{\sqrt n}\right)F_{\theta_0}+\frac{\varepsilon}{\sqrt n}\wedge_x\quad\text{and}\quad F^P_{1,n,\varepsilon,x}=\left(1-\frac{\varepsilon}{\sqrt n}\right)F_{\theta_{1,n}}+\frac{\varepsilon}{\sqrt n}\wedge_x,$$

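for the level and power calculations respectively, along with the usual uncontaminated distributions for the second population. Then the corresponding level influence function (LIF) and power influence function (PIF) at the null $\theta_1=\theta_2=\theta_0$ are given by

$$LIF^{(1)}(x;T_\beta,F_{\theta_0})=\lim_{m,n\to\infty}\frac{\partial}{\partial\varepsilon}P_{(F^L_{1,n,\varepsilon,x},F_{\theta_0})}\left(T_{m,n}^{(\beta)}>\chi^2_{p,\alpha}\right)\Big|_{\varepsilon=0},$$

$$PIF^{(1)}(x;T_\beta,F_{\theta_0})=\lim_{m,n\to\infty}\frac{\partial}{\partial\varepsilon}P_{(F^P_{1,n,\varepsilon,x},F_{\theta_{2,m}})}\left(T_{m,n}^{(\beta)}>\chi^2_{p,\alpha}\right)\Big|_{\varepsilon=0}.$$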

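Similarly, when contamination is assumed to be only in the second sample, we take the uncontaminated distributions for the first population and the contaminated distributions for the second population as

$$F^L_{2,m,\varepsilon,y}=\left(1-\frac{\varepsilon}{\sqrt m}\right)F_{\theta_0}+\frac{\varepsilon}{\sqrt m}\wedge_y\quad\text{and}\quad F^P_{2,m,\varepsilon,y}=\left(1-\frac{\varepsilon}{\sqrt m}\right)F_{\theta_{2,m}}+\frac{\varepsilon}{\sqrt m}\wedge_y,$$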

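for the level and power calculations respectively. The corresponding LIF and PIF at the null $\theta_1=\theta_2=\theta_0$ are given by

$$LIF^{(2)}(y;T_\beta,F_{\theta_0})=\lim_{m,n\to\infty}\frac{\partial}{\partial\varepsilon}P_{(F_{\theta_0},F^L_{2,m,\varepsilon,y})}\left(T_{m,n}^{(\beta)}>\chi^2_{p,\alpha}\right)\Big|_{\varepsilon=0},$$

$$PIF^{(2)}(y;T_\beta,F_{\theta_0})=\lim_{m,n\to\infty}\frac{\partial}{\partial\varepsilon}P_{(F_{\theta_{1,n}},F^P_{2,m,\varepsilon,y})}\left(T_{m,n}^{(\beta)}>\chi^2_{p,\alpha}\right)\Big|_{\varepsilon=0}.$$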

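Finally, considering contamination in both samples with the above contaminated distributions, we define the corresponding LIF and PIF as

$$LIF(x,y;T_\beta,F_{\theta_0})=\lim_{m,n\to\infty}\frac{\partial}{\partial\varepsilon}P_{(F^L_{1,n,\varepsilon,x},F^L_{2,m,\varepsilon,y})}\left(T_{m,n}^{(\beta)}>\chi^2_{p,\alpha}\right)\Big|_{\varepsilon=0},$$

$$PIF(x,y;T_\beta,F_{\theta_0})=\lim_{m,n\to\infty}\frac{\partial}{\partial\varepsilon}P_{(F^P_{1,n,\varepsilon,x},F^P_{2,m,\varepsilon,y})}\left(T_{m,n}^{(\beta)}>\chi^2_{p,\alpha}\right)\Big|_{\varepsilon=0}.$$

First let us derive the asymptotic distribution of the proposed Wald-type test statistic $T_{m,n}^{(\beta)}$ under the contaminated distributions. Let us define $\widetilde\Delta_i=\Delta_i+\varepsilon\,IF(x_i;U_\beta,F_{\theta_0})$ for $i=1,2$, with $x_1=x$ and $x_2=y$. Then we have the following theorem.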

Theorem 2.6

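Suppose the model density satisfies the Lehmann and Basu et al. conditions and Assumption (A) holds. Then the asymptotic distribution of $T_{m,n}^{(\beta)}$ under any contaminated contiguous alternative distributions $(D_1,D_2)$ is $\chi_p^2(\lambda)$, where λ is the non-centrality parameter given by $\lambda=\widetilde W_\varepsilon^T\Sigma_\beta(\theta_0)^{-1}\widetilde W_\varepsilon$, where

$$\widetilde W_\varepsilon=\begin{cases}W(\widetilde\Delta_1,\Delta_2), & \text{if }(D_1,D_2)=(F^P_{1,n,\varepsilon,x},F_{\theta_{2,m}}),\\ W(\Delta_1,\widetilde\Delta_2), & \text{if }(D_1,D_2)=(F_{\theta_{1,n}},F^P_{2,m,\varepsilon,y}),\\ W(\widetilde\Delta_1,\widetilde\Delta_2), & \text{if }(D_1,D_2)=(F^P_{1,n,\varepsilon,x},F^P_{2,m,\varepsilon,y}).\end{cases}\tag{14}$$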

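From the above theorem, we get the asymptotic power of the proposed Wald-type tests under the contaminated contiguous alternatives as

$$\pi_\beta(\Delta_1,\Delta_2;\varepsilon)=P_{(D_1,D_2)}\left(T_{m,n}^{(\beta)}>\chi^2_{p,\alpha}\right)=1-F_{\chi_p^2\left(\widetilde W_\varepsilon^T\Sigma_\beta(\theta_0)^{-1}\widetilde W_\varepsilon\right)}\left(\chi^2_{p,\alpha}\right).$$

Using the infinite series expansion of a non-central chi-square distribution function [14], we get

$$\pi_\beta(\Delta_1,\Delta_2;\varepsilon)=\sum_{v=0}^{\infty}C_v\left(\widetilde W_\varepsilon,\Sigma_\beta(\theta_0)^{-1}\right)P\left(\chi^2_{p+2v}>\chi^2_{p,\alpha}\right),\quad\text{where }C_v(t,A)=\frac{(t^TAt)^v}{v!\,2^v}e^{-\frac12t^TAt}.$$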

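In particular, substituting $\varepsilon=0$ in the above theorem, we get back Theorem 2.5 on the asymptotic contiguous power of our tests, and hence expression eq. (13) can be written as

$$\pi_\beta(\Delta_1,\Delta_2)=\pi_\beta(\Delta_1,\Delta_2;0)=\sum_{v=0}^{\infty}C_v\left(W(\Delta_1,\Delta_2),\Sigma_\beta(\theta_0)^{-1}\right)P\left(\chi^2_{p+2v}>\chi^2_{p,\alpha}\right).$$

Further, substituting $\Delta_1=\Delta_2=0_p$, we get the asymptotic level of our Wald-type tests under contamination as $\alpha_\varepsilon=\pi_\beta(0,0;\varepsilon)$.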
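The series representation lends itself to direct computation. The following sketch sums the Poisson-weighted series for a given non-centrality parameter and then evaluates the asymptotic level under contamination in the first sample for the normal-mean problem of Example 2.1. The central chi-square tail probabilities use the standard finite-sum formulas for integer degrees of freedom; the truncation at 200 terms, the bisection for the critical value, the influence-function closed form and all function names are assumptions of this illustration.

```python
import math

def chi2_sf(c, k):
    """P(chi-square_k > c) for an integer k >= 1, via finite sums."""
    if c <= 0:
        return 1.0
    if k % 2 == 0:
        term, total = 1.0, 1.0
        for j in range(1, k // 2):
            term *= (c / 2) / j
            total += term
        return math.exp(-c / 2) * total
    root = math.sqrt(c)
    total, term = 0.0, root
    for r in range(1, (k - 1) // 2 + 1):
        if r > 1:
            term *= c / (2 * r - 1)
        total += term
    pdf = math.exp(-c / 2) / math.sqrt(2 * math.pi)
    return math.erfc(root / math.sqrt(2)) + 2 * pdf * total

def chi2_crit(p, alpha, lo=0.0, hi=200.0):
    """Upper-alpha critical value of chi-square_p by bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_sf(mid, p) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def series_power(lam, p, alpha=0.05, terms=200):
    """Truncated series sum_v C_v * P(chi2_{p+2v} > crit) with non-centrality lam."""
    crit = chi2_crit(p, alpha)
    weight = math.exp(-lam / 2)          # C_0; C_v are Poisson(lam / 2) probabilities
    total = 0.0
    for v in range(terms):
        total += weight * chi2_sf(crit, p + 2 * v)
        weight *= (lam / 2) / (v + 1)
    return total

def contaminated_noncentrality(delta1, delta2, eps, x, beta,
                               omega=0.5, sigma=1.0, theta0=0.0):
    """Non-centrality for the normal-mean model, contamination in the first sample only."""
    d = x - theta0
    influence = (1 + beta) ** 1.5 * d * math.exp(-beta * d * d / (2 * sigma ** 2))
    w_tilde = math.sqrt(omega) * (delta1 + eps * influence) - math.sqrt(1 - omega) * delta2
    var = (1 + beta ** 2 / (1 + 2 * beta)) ** 1.5 * sigma ** 2
    return w_tilde ** 2 / var

# Asymptotic level under contamination (Delta_1 = Delta_2 = 0), eps = 1, outlier at x = 3
for b in (0.0, 0.5):
    lam = contaminated_noncentrality(0.0, 0.0, 1.0, 3.0, b)
    print(b, round(series_power(lam, p=1), 4))
```

In this toy setting the level of the classical test (β = 0) is inflated far more than that of the test with β = 0.5, in line with the boundedness of the MDPDE influence function for β > 0.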

Now we can define the power influence functions of our proposed tests, which are nothing but $\frac{\partial}{\partial\varepsilon}\pi_\beta(\Delta_1,\Delta_2;\varepsilon)\big|_{\varepsilon=0}$ under standard regularity conditions. Using the infinite series expression of a non-central chi-square distribution function, we can derive an explicit form of the PIFs, as presented in the following theorem.

Theorem 2.7

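Suppose the model density satisfies the Lehmann and Basu et al. conditions, and Assumption (A) holds. Then the power influence functions of our proposed Wald-type tests are given by

$$PIF^{(1)}(x;T_\beta,F_{\theta_0})=\sqrt{\omega}\,K_p(\delta_\beta)\,W(\Delta_1,\Delta_2)^T\Sigma_\beta(\theta_0)^{-1}IF(x;U_\beta,F_{\theta_0}),$$

$$PIF^{(2)}(y;T_\beta,F_{\theta_0})=-\sqrt{1-\omega}\,K_p(\delta_\beta)\,W(\Delta_1,\Delta_2)^T\Sigma_\beta(\theta_0)^{-1}IF(y;U_\beta,F_{\theta_0}),$$

$$PIF(x,y;T_\beta,F_{\theta_0})=K_p(\delta_\beta)\,W(\Delta_1,\Delta_2)^T\Sigma_\beta(\theta_0)^{-1}W\left(IF(x;U_\beta,F_{\theta_0}),IF(y;U_\beta,F_{\theta_0})\right),$$

where $\delta_\beta$ and $W(\Delta_1,\Delta_2)$ are as defined in Theorem 2.5 and

$$K_p(s)=e^{-\frac{s}{2}}\sum_{v=0}^{\infty}\frac{s^{v-1}}{v!\,2^v}(2v-s)\,P\left(\chi^2_{p+2v}>\chi^2_{p,\alpha}\right).$$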

Note that the PIFs are also a function of the influence function of the MDPDE used and hence they are bounded whenever β>0. Thus the proposed tests will be robust for all β>0. However, at β=0, these PIFs will be unbounded (unless there is contamination at the same points x = y in both the samples) which proves the non-robust nature of the classical Wald test.

Note that, although there is no direct relationship between the IF of test statistics with the corresponding PIF in general, in this present case they are seen to be related indirectly via the IF of the MDPDE. So, using a robust MDPDE with β>0 in the proposed Wald-type tests will make both the test statistics and its asymptotic power robust under infinitesimal contamination.
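As a numerical companion to Theorem 2.7, the sketch below evaluates the series defining K_p for p = 1 and the resulting power influence function under contamination in the first sample for the normal-mean problem of Example 2.1. The closed form used for the MDPDE influence function, the truncation at 200 terms and the function names are assumptions of this illustration; the default values W = 2 and ω = 0.5 mirror the setting of Figure 3.

```python
import math
from statistics import NormalDist

def chi2_sf_odd(c, k):
    """P(chi-square_k > c) for odd integer k >= 1 and c > 0, via a finite sum."""
    root = math.sqrt(c)
    total, term = 0.0, root
    for r in range(1, (k - 1) // 2 + 1):
        if r > 1:
            term *= c / (2 * r - 1)
        total += term
    pdf = math.exp(-c / 2) / math.sqrt(2 * math.pi)
    return math.erfc(root / math.sqrt(2)) + 2 * pdf * total

def k_p(s, p, crit, terms=200):
    """Truncated series for K_p(s), s > 0, with p odd in this sketch."""
    total = 0.0
    coef = 1.0 / s                      # s^(v-1) / (v! 2^v) at v = 0
    for v in range(terms):
        total += coef * (2 * v - s) * chi2_sf_odd(crit, p + 2 * v)
        coef *= s / (2 * (v + 1))
    return math.exp(-s / 2) * total

def pif_first_sample(x, beta, w=2.0, omega=0.5, sigma=1.0, theta0=0.0, alpha=0.05):
    """PIF for contamination in the first sample, normal mean with known sigma (w != 0)."""
    var = (1 + beta ** 2 / (1 + 2 * beta)) ** 1.5 * sigma ** 2
    delta = w ** 2 / var                                  # non-centrality delta_beta
    crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2       # chi-square(1) critical value
    d = x - theta0
    influence = (1 + beta) ** 1.5 * d * math.exp(-beta * d * d / (2 * sigma ** 2))
    return math.sqrt(omega) * k_p(delta, 1, crit) * w / var * influence

for b in (0.0, 0.3, 0.7):
    print(b, [round(pif_first_sample(x, b), 4) for x in (1.0, 3.0, 10.0)])
```

For β = 0 the computed values grow linearly in the contamination point, whereas for β > 0 they return to zero for distant points, which is the bounded behaviour described above.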

Finally, we can find the level influence function of the proposed Wald-type tests either starting from αε and following the same steps as in the case of PIFs or just by substituting Δ1=Δ2=0 in the expression of the PIFs given in Theorem 2.7. In either case, since W(0,0)=0, it turns out that

$$LIF^{(1)}(x;T_\beta,F_{\theta_0})=0,\quad LIF^{(2)}(y;T_\beta,F_{\theta_0})=0,\quad LIF(x,y;T_\beta,F_{\theta_0})=0,$$

provided the corresponding IF of $U_\beta$ is bounded, which is true at β>0. Hence the asymptotic level of our Wald-type tests is always stable with respect to infinitesimal contamination. This was also expected, as we are using the asymptotic critical values for testing.

Example 2.3 (Continuation of Examples 2.1 and 2.2)

Let us again consider the problem of testing for normal means as in Examples 2.1 and 2.6. As seen above, the level influence function is always zero implying the level robustness of our proposed Wald-type test for all β>0. Next, to study the power robustness, we compute the functions PIF(1)(x;Tβ,Fθ0) and PIF(x,y;Tβ,Fθ0) numerically for different values of β with θ0=0 and plot them over the contamination points x and y in Figure 3. PIF(2)(y;Tβ,Fθ0) has the same nature as PIF(1)(x;Tβ,Fθ0). The figures clearly show the robustness of the proposed Wald-type tests with β>0, where the robustness increases (i.e., maximum possible PIF decreases) as β increases. Further, all the PIFs at β=0 are unbounded implying the non-robust nature of the classical Wald test.

Figure 3: Power influence functions of the proposed Wald-type test statistics at 95% level for testing equality of two normal means as in Example 2.2 (Continuation of Example 2.1) with known common σ²=1, W(Δ1,Δ2)=2 and ω = 0.5 (n = m).

3 General composite hypotheses with two samples

In the previous section, we considered the simplest two-sample problem, which tests for equality of all the model parameters. In practice, however, we need to test many more complicated hypotheses that cannot be handled by the Wald-type test statistic Tm,n(β) defined in the previous section. For example, in many real-life problems we are interested only in a proper subset of the parameters, with the rest treated as nuisance parameters; a popular example is the test for equality of means with the variance parameters unknown. Further, when testing for multiplicative heteroscedasticity of two samples, we have to test whether the ratio of the variance parameters equals a pre-specified limit, with the means being unknown nuisance parameters. Neither of these belongs to the class of problems considered in the previous section.

In this section, we will consider a general class of hypotheses involving two independent samples, which includes most of the above real-life testing problems. Suppose $\psi(\theta_1,\theta_2)$ denotes a general function from $\mathbb{R}^p\times\mathbb{R}^p$ to $\mathbb{R}^r$. Then, considering the set-up of the previous section, we want to develop a family of robust tests for the general class of hypotheses given by

$$H_0:\psi(\theta_1,\theta_2)=0_r\quad\text{against}\quad H_1:\psi(\theta_1,\theta_2)\neq 0_r.\tag{15}$$

In particular, the problem of testing normal means with unknown variances can be seen as a particular case of the above general set-up with $\psi((\mu_1,\sigma_1^2),(\mu_2,\sigma_2^2))=\mu_1-\mu_2$. Further, to test for multiplicative heteroscedasticity, we can take $\psi((\mu_1,\sigma_1^2),(\mu_2,\sigma_2^2))=\frac{\sigma_1^2}{\sigma_2^2}-C_0$ for some known constant $C_0$ and apply the above general set-up. It is interesting to note that this general class of hypotheses in eq. (15) also contains the simple hypothesis in eq. (9) as a special case with $\psi(\theta_1,\theta_2)=\theta_1-\theta_2$.

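Now, to define a robust Wald-type test statistic for this general set-up, we again consider the MDPDEs of θ1 and θ2 with tuning parameter β, denoted by ${}^{(1)}\hat{\theta}_\beta$ and ${}^{(2)}\hat{\theta}_\beta$, based on the individual samples separately. Note that, whenever H0 is true, we should have $\psi({}^{(1)}\hat{\theta}_\beta,{}^{(2)}\hat{\theta}_\beta)\approx 0_r$ in large samples, and so its observed value provides an indication of any departure from the null hypothesis. Using its asymptotic variance-covariance matrix as a normalizing factor, we define the corresponding Wald-type test statistic as

$$\widetilde{T}_{m,n}^{(\beta)}=\frac{nm}{n+m}\,\psi\left({}^{(1)}\hat{\theta}_\beta,{}^{(2)}\hat{\theta}_\beta\right)^T\widetilde{\Sigma}_\beta\left({}^{(1)}\hat{\theta}_\beta,{}^{(2)}\hat{\theta}_\beta\right)^{-1}\psi\left({}^{(1)}\hat{\theta}_\beta,{}^{(2)}\hat{\theta}_\beta\right),\tag{16}$$

where $\widetilde{\Sigma}_\beta(\theta_1,\theta_2)=\omega\,\Psi_1(\theta_1,\theta_2)^T\Sigma_\beta(\theta_1)\Psi_1(\theta_1,\theta_2)+(1-\omega)\,\Psi_2(\theta_1,\theta_2)^T\Sigma_\beta(\theta_2)\Psi_2(\theta_1,\theta_2)$ with

$$\Psi_i(\theta_1,\theta_2)=\frac{\partial}{\partial\theta_i}\psi(\theta_1,\theta_2)^T,\quad i=1,2.$$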

Note that, at β=0, the Wald-type test statistics Tm,n(0)˜ is again nothing but the classical Wald test statistics for the general hypothesis eq. (15) and hence our proposal is indeed a generalization of the classical Wald test.
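As a worked illustration of the derivative matrices $\Psi_i$ for the two normal-model hypotheses mentioned above, the following sketch assumes the parametrization $\theta_i=(\mu_i,\sigma_i^2)^T$ (so $p=2$, $r=1$); the bracketed subscripts, denoting entries of $\Sigma_\beta$ in that parametrization, are notation introduced here only for this illustration.

$$\begin{aligned}
\psi=\mu_1-\mu_2:&\quad \Psi_1=(1,0)^T,\ \ \Psi_2=(-1,0)^T,\quad \widetilde{\Sigma}_\beta(\theta_1,\theta_2)=\omega\,[\Sigma_\beta(\theta_1)]_{11}+(1-\omega)\,[\Sigma_\beta(\theta_2)]_{11},\\
\psi=\frac{\sigma_1^2}{\sigma_2^2}-C_0:&\quad \Psi_1=\left(0,\frac{1}{\sigma_2^2}\right)^T,\ \ \Psi_2=\left(0,-\frac{\sigma_1^2}{\sigma_2^4}\right)^T,\quad \widetilde{\Sigma}_\beta(\theta_1,\theta_2)=\frac{\omega}{\sigma_2^4}\,[\Sigma_\beta(\theta_1)]_{22}+\frac{(1-\omega)\,\sigma_1^4}{\sigma_2^8}\,[\Sigma_\beta(\theta_2)]_{22}.
\end{aligned}$$

In both cases $\widetilde{\Sigma}_\beta$ is a scalar, and only the variance of the component of the MDPDE that actually enters $\psi$ contributes to the normalization.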

Interestingly, although the general hypothesis contains the hypothesis in eq. (9) as a special case, the Wald-type test statistic $\widetilde{T}_{m,n}^{(\beta)}$ with $\psi(\theta_1,\theta_2)=\theta_1-\theta_2$ is not the same as the Wald-type test statistic Tm,n(β) considered in the previous section. However, whenever Σβ(θ) is linear in the parameters, these two Wald-type test statistics coincide asymptotically with probability tending to one. In this section, we present the properties of the statistic $\widetilde{T}_{m,n}^{(\beta)}$ with a general ψ-function satisfying the following assumption.

Assumption (B):

  1. Ψi(θ1,θ2), i = 1,2, exist, have rank r and are continuous with respect to their arguments.

3.1 Asymptotic properties

We again start with the asymptotic null distribution of the proposed Wald-type test statistics Tm,n(β)˜ in order to obtain the required critical values for the test.

Theorem 3.1

Suppose the model density satisfies the Lehmann and Basu et al. conditions and Assumptions (A) and (B) hold. Then, under the null hypothesis in eq. (15), Tm,n(β)˜ asymptotically follows a χr2 distribution.

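Therefore, the level-α critical region for the proposed test based on $\widetilde{T}_{m,n}^{(\beta)}$ for testing eq. (15) is given by

$$\widetilde{T}_{m,n}^{(\beta)}>\chi^2_{r,\alpha}.$$

Next, in order to obtain an approximation to the asymptotic power of this general test based on $\widetilde{T}_{m,n}^{(\beta)}$, we are going to use the following function

$$\tilde{l}(\theta_1,\theta_2)=\psi(\theta_1,\theta_2)^T\,\widetilde{\Sigma}_\beta(\theta_1,\theta_2)^{-1}\,\psi(\theta_1,\theta_2).$$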

Theorem 3.2

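Suppose the model density satisfies the Lehmann and Basu et al. conditions and Assumptions (A)-(B) hold. Then, whenever $\psi(\theta_1,\theta_2)\neq 0_r$, we have

$$\sqrt{\frac{mn}{m+n}}\left[\tilde{l}\left({}^{(1)}\hat{\theta}_\beta,{}^{(2)}\hat{\theta}_\beta\right)-\tilde{l}(\theta_1,\theta_2)\right]\xrightarrow[m,n\to\infty]{\mathcal{L}}N\left(0,\,4\,\tilde{l}(\theta_1,\theta_2)\right).$$

Note that, from the above theorem, we can easily obtain an approximation to the power function of the proposed level-α Wald-type tests based on $\widetilde{T}_{m,n}^{(\beta)}$ as

$$\widetilde{\pi}^{(\beta)}_{m,n,\alpha}(\theta_1,\theta_2)=P\left(\widetilde{T}_{m,n}^{(\beta)}>\chi^2_{r,\alpha}\right)=1-\Phi_n\left(\frac{\sqrt{n+m}}{2\sqrt{nm\,\tilde{l}(\theta_1,\theta_2)}}\left(\chi^2_{r,\alpha}-\frac{nm}{n+m}\,\tilde{l}(\theta_1,\theta_2)\right)\right),$$

for a sequence of distributions $\Phi_n(\cdot)$ tending uniformly to the standard normal distribution $\Phi(\cdot)$, whenever $\psi(\theta_1,\theta_2)\neq 0_r$.

In such cases, it can be easily checked that $\widetilde{\pi}^{(\beta)}_{m,n,\alpha}(\theta_1,\theta_2)\to 1$ as $m,n\to\infty$, since the argument of $\Phi_n$ then tends to $-\infty$. This proves the consistency of our proposed tests.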

Corollary 3.3

Under the assumptions of Theorem 3.2, the proposed Wald-type tests based on $\widetilde{T}_{m,n}^{(\beta)}$ are consistent.

Now, let us study the performance of the proposed general two-sample Wald-type tests under the contiguous alternative hypotheses. As discussed in the previous section, there could be different choices for the contiguous alternative hypotheses for any general null hypothesis. Here, following the similar idea as in the alternatives in eq. (12), we consider the general form of the contiguous alternatives given by

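$$H_{1,n,m}:\ \theta_1=\theta_{1,n}=\theta_{10}+n^{-1/2}\Delta_1,\quad \theta_2=\theta_{2,m}=\theta_{20}+m^{-1/2}\Delta_2,\quad (\Delta_1,\Delta_2)\in\mathbb{R}^p\times\mathbb{R}^p\setminus\{(0_p,0_p)\},\tag{17}$$

for some fixed $(\theta_{10},\theta_{20})\in\Theta_0=\{(\theta_1,\theta_2)\in\Theta\times\Theta:\psi(\theta_1,\theta_2)=0_r\}$. The asymptotic distribution of $\widetilde{T}_{m,n}^{(\beta)}$ under these alternatives $H_{1,n,m}$ is presented in the following theorem.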

Theorem 3.4

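Suppose the model density satisfies the Lehmann and Basu et al. conditions and Assumptions (A)-(B) hold. Then the asymptotic distribution of $\widetilde{T}_{m,n}^{(\beta)}$ under $H_{1,n,m}$ in eq. (17) is $\chi^2_r(\widetilde{\delta}_\beta)$, where

$$\widetilde{\delta}_\beta=W_\psi(\Delta_1,\Delta_2)^T\,\widetilde{\Sigma}_\beta(\theta_1,\theta_2)^{-1}\,W_\psi(\Delta_1,\Delta_2)$$

with $W_\psi(\Delta_1,\Delta_2)=\sqrt{\omega}\,\Psi_1(\theta_1,\theta_2)^T\Delta_1+\sqrt{1-\omega}\,\Psi_2(\theta_1,\theta_2)^T\Delta_2$.

The above theorem directly helps us to obtain the asymptotic power $\widetilde{\pi}_\beta(\Delta_1,\Delta_2)$ of our general Wald-type tests based on $\widetilde{T}_{m,n}^{(\beta)}$ under the contiguous alternatives $H_{1,n,m}$ in eq. (17) as

$$\widetilde{\pi}_\beta(\Delta_1,\Delta_2)=1-F_{\chi^2_r(\widetilde{\delta}_\beta)}\left(\chi^2_{r,\alpha}\right).$$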
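For a scalar restriction ($r=1$) this power has a simple closed form, which may help in reading the expression above. Since a non-central $\chi^2_1(\delta)$ variable has the distribution of $(Z+\sqrt{\delta})^2$ with $Z\sim N(0,1)$, and $\sqrt{\chi^2_{1,\alpha}}=z_{1-\alpha/2}$ (the standard normal quantile, a symbol introduced here only for this illustration), we get

$$\widetilde{\pi}_\beta(\Delta_1,\Delta_2)=1-\Phi\left(z_{1-\alpha/2}-\sqrt{\widetilde{\delta}_\beta}\right)+\Phi\left(-z_{1-\alpha/2}-\sqrt{\widetilde{\delta}_\beta}\right)\quad\text{when } r=1.$$

At $\widetilde{\delta}_\beta=0$ this reduces to $\alpha$, and it increases to one as $\widetilde{\delta}_\beta$ grows.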

3.2 Robustness properties

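Let us now study the robustness properties of the proposed general two-sample Wald-type tests based on $\widetilde{T}_{m,n}^{(\beta)}$. We first consider the influence function of the Wald-type test statistic. Define the statistical functional corresponding to $\widetilde{T}_{m,n}^{(\beta)}$, ignoring the multiplier $\frac{nm}{n+m}$, as

$$\widetilde{T}_\beta(G_1,G_2)=\psi\left(U_\beta(G_1),U_\beta(G_2)\right)^T\widetilde{\Sigma}_\beta^{-1}(\theta_1,\theta_2)\,\psi\left(U_\beta(G_1),U_\beta(G_2)\right),$$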

where Uβ is the corresponding MDPDE functional. Then, we can derive the first and second order influence functions of the Wald-type test statistics following the derivations similar to that of Section 2.2. So, here we will skip those derivations for brevity and present only the final results in the following theorem.

Theorem 3.5

Consider the notations of Section 2.2. Under the null hypothesis in eq. (15) with G1=Fθ10, G2=Fθ20 and ψ(θ10,θ20)=0, the first and second order influence functions of our general two-sample Wald-type test statistics are given as follows:

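For contamination only in the i-th sample (i = 1,2) at the point $x_i$ ($x_1=x$, $x_2=y$)

$$\begin{aligned}
IF^{(i)}(x_i;\widetilde{T}_\beta,F_{\theta_{10}},F_{\theta_{20}})&=0,\\
IF_2^{(i)}(x_i;\widetilde{T}_\beta,F_{\theta_{10}},F_{\theta_{20}})&=2\,IF(x_i;U_\beta,F_{\theta_{i0}})^T\,\Psi_i(\theta_{10},\theta_{20})\,\widetilde{\Sigma}_\beta(\theta_{10},\theta_{20})^{-1}\,\Psi_i(\theta_{10},\theta_{20})^T\,IF(x_i;U_\beta,F_{\theta_{i0}}).
\end{aligned}$$

For contamination in both the samples

$$\begin{aligned}
IF(x,y;\widetilde{T}_\beta,F_{\theta_{10}},F_{\theta_{20}})&=0,\\
IF_2(x,y;\widetilde{T}_\beta,F_{\theta_{10}},F_{\theta_{20}})&=2\,Q_\beta(x,y)^T\,\widetilde{\Sigma}_\beta(\theta_{10},\theta_{20})^{-1}\,Q_\beta(x,y),
\end{aligned}$$

with $Q_\beta(x,y)=\Psi_1(\theta_{10},\theta_{20})^T\,IF(x;U_\beta,F_{\theta_{10}})+\Psi_2(\theta_{10},\theta_{20})^T\,IF(y;U_\beta,F_{\theta_{20}})$.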

Clearly, as in the previous case of the simple two-sample problem in Section 2.2, here also the first order IF of the test statistic is always zero and hence non-informative about its robustness. However, the second order IFs are clearly bounded whenever the IF of the corresponding MDPDE is bounded, which holds for all β>0. Thus, the proposed general two-sample Wald-type tests with any β>0 yield a robust solution under contamination in either of the samples or in both. Further, in the case of contamination in both the samples, even if the IF of the MDPDE is not bounded (at β=0), the corresponding second order IF can still be bounded, generating robust inference, provided the term $Q_\beta(x,y)$ is bounded. One example of such a situation arises in the simpler problem of Section 2 under the choice x = y, because in that case $\Psi_1(\theta_{10},\theta_{20})=-\Psi_2(\theta_{10},\theta_{20})=I_p$, the identity matrix of order p, and hence $Q_\beta(x,y)$ becomes identically zero.

Next, we consider the effect of contamination on the asymptotic power and level of the proposed general Wald-type tests based on $\widetilde{T}_{m,n}^{(\beta)}$. For this general case, we consider the contiguous alternatives $H_{1,n,m}$ as defined in eq. (17), but now with the null baseline parameter values $\theta_{10}$ and $\theta_{20}$ for the two samples respectively instead of the common $\theta_0$, and define the level and power influence functions using the corresponding contaminated distributions as in Section 2.3. The following theorem presents the asymptotic distribution of the test statistic under the contiguous and contaminated distributions, where the $\widetilde{\Delta}_i$'s (i = 1,2) are as defined in Section 2.3.

Theorem 3.6

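Suppose the model density satisfies the Lehmann and Basu et al. conditions and Assumptions (A)-(B) hold. Then, the asymptotic distribution of the general Wald-type test statistic $\widetilde{T}_{m,n}^{(\beta)}$ under any contaminated contiguous alternative distributions $(D_1,D_2)$ is non-central chi-square with r degrees of freedom and non-centrality parameter $\widetilde{W}_\varepsilon^T\,\widetilde{\Sigma}_\beta(\theta_1,\theta_2)^{-1}\,\widetilde{W}_\varepsilon$, where

$$\widetilde{W}_\varepsilon=\begin{cases}
W_\psi(\widetilde{\Delta}_1,\Delta_2), & \text{if } (D_1,D_2)=\left(F^P_{1,n,\varepsilon,x},F_{\theta_{2,m}}\right),\\
W_\psi(\Delta_1,\widetilde{\Delta}_2), & \text{if } (D_1,D_2)=\left(F_{\theta_{1,n}},F^P_{2,m,\varepsilon,y}\right),\\
W_\psi(\widetilde{\Delta}_1,\widetilde{\Delta}_2), & \text{if } (D_1,D_2)=\left(F^P_{1,n,\varepsilon,x},F^P_{2,m,\varepsilon,y}\right).
\end{cases}$$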

The above theorem can be used to get the asymptotic power of the proposed general two-sample Wald-type tests under the contiguous contaminated alternatives in terms of an infinite series following Section 2.3 (arguments after 2.6). This can be also simplified by substituting ε = 0 or Δ1=Δ2=0p to get asymptotic power under contiguous alternatives or the asymptotic level under contiguous contamination respectively. Further, the resulting infinite series expressions can now be used to obtain the power and level influence functions for this general case. Since the derivations are the same as that of Theorem 2.7, for brevity, we will only present the resulting expressions skipping the details in the following Theorem.

Theorem 3.7

Suppose the model density satisfies the Lehmann and Basu et al. conditions, and Assumptions (A)–(B) hold. Then we have the following results for the proposed Wald-type test functional T˜β for testing the general two-sample hypothesis in eq. (15).

The power influence functions are given by

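$$\begin{aligned}
PIF^{(1)}(x;\widetilde{T}_\beta,F_{\theta_{10}},F_{\theta_{20}})&=\sqrt{\omega}\,K_r(\widetilde{\delta}_\beta)\,W_\psi(\Delta_1,\Delta_2)^T\,\widetilde{\Sigma}_\beta(\theta_{10},\theta_{20})^{-1}\,\Psi_1(\theta_{10},\theta_{20})^T\,IF(x;U_\beta,F_{\theta_{10}}),\\
PIF^{(2)}(y;\widetilde{T}_\beta,F_{\theta_{10}},F_{\theta_{20}})&=\sqrt{1-\omega}\,K_r(\widetilde{\delta}_\beta)\,W_\psi(\Delta_1,\Delta_2)^T\,\widetilde{\Sigma}_\beta(\theta_{10},\theta_{20})^{-1}\,\Psi_2(\theta_{10},\theta_{20})^T\,IF(y;U_\beta,F_{\theta_{20}}),\\
PIF(x,y;\widetilde{T}_\beta,F_{\theta_{10}},F_{\theta_{20}})&=K_r(\widetilde{\delta}_\beta)\,W_\psi(\Delta_1,\Delta_2)^T\,\widetilde{\Sigma}_\beta(\theta_{10},\theta_{20})^{-1}\,W_\psi\big(IF(x;U_\beta,F_{\theta_{10}}),IF(y;U_\beta,F_{\theta_{20}})\big),
\end{aligned}$$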

where δβ˜ and Wψ(Δ1,Δ2) are as defined in Theorem 3.4 and Kr(s) is as defined in Theorem 2.7.

Provided the IF of the MDPDE Uβ is bounded, the level influence functions are given by

LIF(1)(x;Tβ˜,Fθ10,Fθ20)=0,LIF(2)(y;Tβ˜,Fθ10,Fθ20)=0,LIF(x,y;Tβ˜,Fθ10,Fθ20)=0.

Note that for the general two-sample hypothesis eq. (15) also, the LIFs and the PIFs of our proposed test are bounded whenever the influence function of the MDPDE used is bounded which holds for all β>0. Thus, our proposal with β>0 is robust also for testing any general two-sample problem.

3.3 Special case: Testing partial homogeneity with nuisance parameters

Let us consider a simplified and possibly the most common special case of the general hypothesis in eq. (15), where we test for partial homogeneity of the two samples assuming some parameters to be nuisance. Mathematically, let us consider the partition of the parameters $\theta_1=({}^{*}\theta_1^T,{}^{0}\theta_1^T)^T$ and $\theta_2=({}^{*}\theta_2^T,{}^{0}\theta_2^T)^T$ as in the beginning of Section 2, but now we assume both ${}^{0}\theta_1$ and ${}^{0}\theta_2$ to be unknown nuisance parameters. Under this notation, we consider the hypothesis of partial homogeneity given by

$$H_0:{}^{*}\theta_1={}^{*}\theta_2\quad\text{against}\quad H_1:{}^{*}\theta_1\neq{}^{*}\theta_2,\tag{18}$$

with ${}^{0}\theta_1$ and ${}^{0}\theta_2$ being unknown under both hypotheses. Note that this special case contains the problem of testing normal means with unknown variances, with ${}^{*}\theta_i$ being the mean and ${}^{0}\theta_i$ being the variance parameter for each i = 1,2. In practice we can either assume ${}^{0}\theta_1={}^{0}\theta_2$ (e.g., equal variances) or ${}^{0}\theta_1\neq{}^{0}\theta_2$ (e.g., unequal variances). Here, we will consider the general case assuming ${}^{0}\theta_1\neq{}^{0}\theta_2$; the other case can be dealt with similarly.

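Note that the hypothesis in eq. (18) is indeed a special case of the general hypothesis in eq. (15) with $\psi(\theta_1,\theta_2)={}^{*}\theta_1-{}^{*}\theta_2$, the difference of the first r components of the two parameter vectors. Hence, the proposed MDPDE based Wald-type test statistic for testing eq. (18) is given by

$$\widetilde{T}_{m,n}^{(\beta)}=\frac{nm}{n+m}\left({}^{(1)}_{\,*}\hat{\theta}_\beta-{}^{(2)}_{\,*}\hat{\theta}_\beta\right)^T\left[\omega\,\Sigma_\beta^{11}\left({}^{(1)}\hat{\theta}_\beta\right)+(1-\omega)\,\Sigma_\beta^{11}\left({}^{(2)}\hat{\theta}_\beta\right)\right]^{-1}\left({}^{(1)}_{\,*}\hat{\theta}_\beta-{}^{(2)}_{\,*}\hat{\theta}_\beta\right),\tag{19}$$

where ${}^{(1)}_{\,*}\hat{\theta}_\beta$ and ${}^{(2)}_{\,*}\hat{\theta}_\beta$ are the first r components of the MDPDEs ${}^{(1)}\hat{\theta}_\beta=({}^{(1)}_{\,*}\hat{\theta}_\beta^T,{}^{(1)}_{\,0}\hat{\theta}_\beta^T)^T$ and ${}^{(2)}\hat{\theta}_\beta=({}^{(2)}_{\,*}\hat{\theta}_\beta^T,{}^{(2)}_{\,0}\hat{\theta}_\beta^T)^T$ of θ1 and θ2 respectively, and $\Sigma_\beta^{11}(\theta)$ denotes the leading r×r principal submatrix of the asymptotic variance-covariance matrix

$$\Sigma_\beta(\theta)=\begin{pmatrix}\Sigma_\beta^{11}(\theta)&\Sigma_\beta^{12}(\theta)\\ \Sigma_\beta^{12}(\theta)^T&\Sigma_\beta^{22}(\theta)\end{pmatrix}.$$

Also note that Assumption (B) always holds for the hypothesis in eq. (18). Following Theorem 3.1, the asymptotic distribution of $\widetilde{T}_{m,n}^{(\beta)}$ in eq. (19) under the null hypothesis in eq. (18) is $\chi^2_r$, and the test is consistent against any fixed alternative by Corollary 3.3. To study the asymptotic contiguous power in this case, we consider the contiguous alternatives

$$H_{1,n,m}:\ {}^{*}\theta_1={}^{*}\theta_0+n^{-1/2}\Delta_1,\quad {}^{*}\theta_2={}^{*}\theta_0+m^{-1/2}\Delta_2,\quad (\Delta_1,\Delta_2)\in\mathbb{R}^r\times\mathbb{R}^r\setminus\{(0_r,0_r)\},\tag{20}$$

for some fixed ${}^{*}\theta_0$. Then, by Theorem 3.4, the asymptotic distribution of the Wald-type test statistic $\widetilde{T}_{m,n}^{(\beta)}$ in eq. (19) under $H_{1,n,m}$ in eq. (20) is a non-central chi-square distribution with r degrees of freedom and non-centrality parameter

$$\widetilde{\delta}_\beta=W(\Delta_1,\Delta_2)^T\left[\omega\,\Sigma_\beta^{11}\left({}^{(1)}\hat{\theta}_\beta\right)+(1-\omega)\,\Sigma_\beta^{11}\left({}^{(2)}\hat{\theta}_\beta\right)\right]^{-1}W(\Delta_1,\Delta_2),$$

from which the power can be calculated easily.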

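Next, for examining robustness properties, we define the corresponding test functional following Section 3.2 as

$$\widetilde{T}_\beta(G_1,G_2)=\left({}^{*}U_\beta(G_1)-{}^{*}U_\beta(G_2)\right)^T\left[\omega\,\Sigma_\beta^{11}\left({}^{(1)}\hat{\theta}_\beta\right)+(1-\omega)\,\Sigma_\beta^{11}\left({}^{(2)}\hat{\theta}_\beta\right)\right]^{-1}\left({}^{*}U_\beta(G_1)-{}^{*}U_\beta(G_2)\right),$$

where ${}^{*}U_\beta$ denotes the first r components of the minimum DPD functional $U_\beta$. Then, we can get the IF of this test statistic from Theorem 3.5. In particular, the first order influence function is identically zero for any kind of contamination and hence non-informative. Its second order influence function for contamination in the i-th sample at the point $x_i$ (i = 1,2) is given by

$$IF_2^{(i)}(x_i;\widetilde{T}_\beta,F_{\theta_{10}},F_{\theta_{20}})=2\,IF(x_i;{}^{*}U_\beta,F_{\theta_{i0}})^T\left[\omega\,\Sigma_\beta^{11}\left({}^{(1)}\hat{\theta}_\beta\right)+(1-\omega)\,\Sigma_\beta^{11}\left({}^{(2)}\hat{\theta}_\beta\right)\right]^{-1}IF(x_i;{}^{*}U_\beta,F_{\theta_{i0}}),$$

and the same for contamination in both samples is given by

$$IF_2(x,y;\widetilde{T}_\beta,F_{\theta_{10}},F_{\theta_{20}})=2\,Q_\beta^{*}(x,y)^T\left[\omega\,\Sigma_\beta^{11}\left({}^{(1)}\hat{\theta}_\beta\right)+(1-\omega)\,\Sigma_\beta^{11}\left({}^{(2)}\hat{\theta}_\beta\right)\right]^{-1}Q_\beta^{*}(x,y),$$

with $Q_\beta^{*}(x,y)=IF(x;{}^{*}U_\beta,F_{\theta_{10}})-IF(y;{}^{*}U_\beta,F_{\theta_{20}})$. Similarly, following Theorem 3.7, the level influence functions are always zero, and the power influence functions under contiguous contamination in each sample separately or in both the samples are respectively given by

$$\begin{aligned}
PIF^{(1)}(x;\widetilde{T}_\beta,F_{\theta_{10}},F_{\theta_{20}})&=\sqrt{\omega}\,K_r(\widetilde{\delta}_\beta)\,W(\Delta_1,\Delta_2)^T\left[\omega\,\Sigma_\beta^{11}\left({}^{(1)}\hat{\theta}_\beta\right)+(1-\omega)\,\Sigma_\beta^{11}\left({}^{(2)}\hat{\theta}_\beta\right)\right]^{-1}IF(x;{}^{*}U_\beta,F_{\theta_{10}}),\\
PIF^{(2)}(y;\widetilde{T}_\beta,F_{\theta_{10}},F_{\theta_{20}})&=-\sqrt{1-\omega}\,K_r(\widetilde{\delta}_\beta)\,W(\Delta_1,\Delta_2)^T\left[\omega\,\Sigma_\beta^{11}\left({}^{(1)}\hat{\theta}_\beta\right)+(1-\omega)\,\Sigma_\beta^{11}\left({}^{(2)}\hat{\theta}_\beta\right)\right]^{-1}IF(y;{}^{*}U_\beta,F_{\theta_{20}}),\\
PIF(x,y;\widetilde{T}_\beta,F_{\theta_{10}},F_{\theta_{20}})&=K_r(\widetilde{\delta}_\beta)\,W(\Delta_1,\Delta_2)^T\left[\omega\,\Sigma_\beta^{11}\left({}^{(1)}\hat{\theta}_\beta\right)+(1-\omega)\,\Sigma_\beta^{11}\left({}^{(2)}\hat{\theta}_\beta\right)\right]^{-1}\\
&\qquad\times W\big(IF(x;{}^{*}U_\beta,F_{\theta_{10}}),IF(y;{}^{*}U_\beta,F_{\theta_{20}})\big),
\end{aligned}$$

where $\widetilde{\delta}_\beta$ and ${}^{*}U_\beta$ are as defined previously in this subsection. The nature of these PIFs is exactly the same as in the previous cases and indicates the robustness of our proposals with β>0.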

Example 3.1 (Testing equality of two Normal means with unknown and unequal variances)

We again consider the example of comparing two normal means (say $\mu_1$ and $\mu_2$), but now with unknown and unequal variances (say $\sigma_1^2$ and $\sigma_2^2$) for the two populations. Hence the model family is $\mathcal{F}=\{N(\mu,\sigma^2):\theta=(\mu,\sigma)^T\in\mathbb{R}\times[0,\infty)\}$ and we want to test the hypothesis

$$H_0:\mu_1=\mu_2\quad\text{against}\quad H_1:\mu_1\neq\mu_2,\tag{21}$$

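with $\sigma_1^2$ and $\sigma_2^2$ being unknown under both hypotheses. Let us denote the MDPDEs based on the i-th sample (i = 1,2) by ${}^{(i)}\hat{\theta}_\beta=({}^{(i)}\hat{\mu}_\beta,{}^{(i)}\hat{\sigma}_\beta)^T$; their asymptotic variance matrix $\Sigma_\beta(\theta)$ is given by

$$\Sigma_\beta(\mu,\sigma)=\begin{pmatrix}\left(1+\dfrac{\beta^2}{1+2\beta}\right)^{3/2}\sigma^2 & 0\\[2ex] 0 & \dfrac{(1+\beta)^2}{(2+\beta^2)^2}\left[\dfrac{2\zeta_\beta}{(1+2\beta)^{5/2}}-\beta^2\right]\sigma^2\end{pmatrix},$$

with $\zeta_\beta=1+3\beta+5\beta^2+7\beta^3+6\beta^4+2\beta^5$. Then, noting that the hypothesis in eq. (21) is of the form of eq. (18), our proposed generalized Wald-type test statistic in eq. (19) simplifies to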

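$$\widetilde{T}_{m,n}^{(\beta)}=\frac{mn}{m+n}\left(1+\frac{\beta^2}{1+2\beta}\right)^{-3/2}\frac{\left({}^{(1)}\hat{\mu}_\beta-{}^{(2)}\hat{\mu}_\beta\right)^2}{\omega\,{}^{(1)}\hat{\sigma}_\beta^2+(1-\omega)\,{}^{(2)}\hat{\sigma}_\beta^2},\tag{22}$$

whose null asymptotic distribution is $\chi^2_1$ by Theorem 3.1. In the particular case of β=0, we have

$$\widetilde{T}_{m,n}^{(0)}=\frac{mn}{m+n}\,\frac{\left({}^{(1)}\hat{\mu}_0-{}^{(2)}\hat{\mu}_0\right)^2}{\omega\,{}^{(1)}\hat{\sigma}_0^2+(1-\omega)\,{}^{(2)}\hat{\sigma}_0^2}=\frac{mn}{m+n}\,\frac{\left(\bar{X}-\bar{Y}\right)^2}{\omega\,s_X^2+(1-\omega)\,s_Y^2},$$

where $\bar{X}$ and $\bar{Y}$ are the sample means and $s_X^2$ and $s_Y^2$ are the sample variances of $X_1,\ldots,X_n$ and $Y_1,\ldots,Y_m$ respectively, and this is nothing but the classical MLE based Wald test statistic.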

We can now study the asymptotic and robustness properties of these proposed Wald-type tests following the theoretical results derived in this section. However, due to the asymptotic independence of the MDPDEs of μ and σ under the normal model, all the properties of the Wald-type test statistic in eq. (22) turn out to be similar in nature to those of the proposed Wald-type test with known σ as discussed in Examples 2.1, 2.6 and 2.2, with the common variance σ² there replaced by $\omega\sigma_1^2+(1-\omega)\sigma_2^2$ in the present case. This fact can also be observed intuitively by noting that the Wald-type test statistic in eq. (22) has a similar form to the corresponding Wald-type test statistic for the known common σ² case (in Example 2.1), with the known value there being replaced by $\omega\,{}^{(1)}\hat{\sigma}_\beta^2+(1-\omega)\,{}^{(2)}\hat{\sigma}_\beta^2$. So, we will skip these details for the present general case for brevity. However, examining them, one can easily verify that, in this case of unknown and unequal variances also, the asymptotic contiguous power of the proposed Wald-type test decreases only slightly as β increases (at exactly the same rate as in Table 1), but the robustness increases significantly, with bounded (second order) influence functions of the Wald-type test statistic and bounded power and level influence functions for all β>0.
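The statistic of this example is simple enough to be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it takes already-computed MDPDEs of the means and standard deviations as inputs (how they are obtained is outside its scope), uses the variance factor $(1+\beta^2/(1+2\beta))^{3/2}$ of the MDPDE of μ, and assumes ω = m/(n+m), where n is the size of the first sample; the function name `wald_type_normal_means` is hypothetical.

```python
import math


def wald_type_normal_means(mu1, sigma1, n, mu2, sigma2, m, beta):
    """Two-sided Wald-type statistic for equality of two normal means and its
    asymptotic chi-square(1) p-value, given MDPDEs (mu_i, sigma_i) of each sample.

    Assumption: omega = m / (n + m), with n the size of the first sample.
    """
    omega = m / (n + m)
    # Asymptotic variance of the MDPDE of mu, in units of sigma^2.
    variance_factor = (1.0 + beta**2 / (1.0 + 2.0 * beta)) ** 1.5
    pooled = omega * sigma1**2 + (1.0 - omega) * sigma2**2
    stat = (n * m / (n + m)) * (mu1 - mu2) ** 2 / (variance_factor * pooled)
    # P(chi^2_1 > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Same estimates plugged in at two tuning parameters: the normalizing factor
# grows with beta, so the statistic shrinks for identical inputs.
stat_0, p_0 = wald_type_normal_means(1.0, 1.0, 50, 0.0, 1.0, 50, beta=0.0)
stat_1, p_1 = wald_type_normal_means(1.0, 1.0, 50, 0.0, 1.0, 50, beta=1.0)
print(stat_0, p_0)
print(stat_1, p_1)
```

At β = 0 the variance factor equals one, so with the sample means and the MLEs of the standard deviations as inputs the function returns the classical Wald statistic.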

4 The cases of one-sided alternatives

As we have mentioned in the introduction (Section 1), the majority of common practical applications of two-sample problems lie in comparing the treatment and control groups in experimental or clinical trials, or in observational studies involving two such population groups. However, in most such cases, researchers want to test whether there is any improvement in the treatment group over the control group due to the treatment effect. For example, one might be interested in testing whether the success rate of cure (modeled by a binomial probability model) is reduced, or whether the number of attacks of a disease (modeled by a Poisson model) decreases in the treatment group, or whether some continuous biomarker like blood pressure (modeled by a normal model) changes in the targeted direction from the control to the treatment group. All of these lead to one-sided alternatives, in contrast to the omnibus two-sided alternatives considered so far in this paper. Although general one-sided alternatives with vector parameters are much more difficult to define and deal with, and hence need more targeted future research, our proposal of robust Wald-type tests in this paper can be easily extended for comparing any scalar parameters against one-sided alternatives. Noting that all the above motivating practical scenarios indeed deal with scalar parameter comparisons, in this section we extend our proposal to these particular one-sided problems.

In general, we consider the class of one-sided versions of eq. (15) with r = 1. So, $\psi(\theta_1,\theta_2)$ is a real-valued function of the parameters and we develop the robust test for the one-sided hypothesis given by

$$H_0:\psi(\theta_1,\theta_2)=0\quad\text{against}\quad H_1:\psi(\theta_1,\theta_2)>0.\tag{23}$$

Note that the one-sided version of the simple two-sided hypothesis in eq. (9) with scalar parameters (p = 1), which contains the motivating examples for the Poisson and binomial models and the normal model with known variances, belongs to this general class in eq. (23). Also, this general class of hypotheses contains many more useful cases, like testing for an increase (or decrease) in normal means with unknown variances.

For testing the one-sided hypothesis in eq. (23), we define the corresponding robust Wald-type test statistic by taking a signed square root of our two-sided Wald-type test statistic $\widetilde{T}_{m,n}^{(\beta)}$ in eq. (16):

$$\widetilde{T}_{m,n}^{(\beta)P}=\operatorname{sgn}\left(\psi\left({}^{(1)}\hat{\theta}_\beta,{}^{(2)}\hat{\theta}_\beta\right)\right)\sqrt{\widetilde{T}_{m,n}^{(\beta)}}=\sqrt{\frac{nm}{n+m}}\;\frac{\psi\left({}^{(1)}\hat{\theta}_\beta,{}^{(2)}\hat{\theta}_\beta\right)}{\sqrt{\widetilde{\Sigma}_\beta\left({}^{(1)}\hat{\theta}_\beta,{}^{(2)}\hat{\theta}_\beta\right)}},\tag{24}$$

where $\operatorname{sgn}(\cdot)$ denotes the sign function; note that $\widetilde{\Sigma}_\beta(\theta_1,\theta_2)$ is a scalar for r = 1. Then, we have the following null asymptotic distribution.

Theorem 4.1

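Under the assumptions of Theorem 3.1, the asymptotic null distribution of the one-sided test statistic $\widetilde{T}_{m,n}^{(\beta)P}$ for testing eq. (23) is standard normal.

Following the above theorem, the level-α critical region for testing the one-sided hypothesis in eq. (23) is given by $\widetilde{T}_{m,n}^{(\beta)P}>z_{1-\alpha}$, where $z_{1-\alpha}$ denotes the $(1-\alpha)$-th quantile of the standard normal distribution.

Further, as in the case of two-sided alternatives, we can also derive a power approximation for these proposed Wald-type tests at any fixed alternative $(\theta_1,\theta_2)$ satisfying $\psi(\theta_1,\theta_2)>0$ as follows:

$$\begin{aligned}
\widetilde{\pi}^{(\beta)P}_{m,n,\alpha}(\theta_1,\theta_2)&=P\left(\widetilde{T}_{m,n}^{(\beta)P}>z_{1-\alpha}\right)\\
&=P\left(\sqrt{\frac{nm}{n+m}}\;\frac{\psi\left({}^{(1)}\hat{\theta}_\beta,{}^{(2)}\hat{\theta}_\beta\right)-\psi(\theta_1,\theta_2)}{\sqrt{\widetilde{\Sigma}_\beta\left({}^{(1)}\hat{\theta}_\beta,{}^{(2)}\hat{\theta}_\beta\right)}}>z_{1-\alpha}-\sqrt{\frac{nm}{n+m}}\;\frac{\psi(\theta_1,\theta_2)}{\sqrt{\widetilde{\Sigma}_\beta(\theta_1,\theta_2)}}\right)\\
&=1-\Phi_n\left(z_{1-\alpha}-\sqrt{\frac{nm}{n+m}}\;\frac{\psi(\theta_1,\theta_2)}{\sqrt{\widetilde{\Sigma}_\beta(\theta_1,\theta_2)}}\right),
\end{aligned}$$

for a sequence of distributions $\Phi_n(\cdot)$ tending uniformly to the standard normal distribution $\Phi(\cdot)$, since under the alternative parameter values $(\theta_1,\theta_2)$

$$\sqrt{\frac{nm}{n+m}}\;\frac{\psi\left({}^{(1)}\hat{\theta}_\beta,{}^{(2)}\hat{\theta}_\beta\right)-\psi(\theta_1,\theta_2)}{\sqrt{\widetilde{\Sigma}_\beta\left({}^{(1)}\hat{\theta}_\beta,{}^{(2)}\hat{\theta}_\beta\right)}}\xrightarrow[m,n\to\infty]{\mathcal{L}}N(0,1).$$

Now, since $\psi(\theta_1,\theta_2)>0$ under the alternatives in eq. (23), we have $\widetilde{\pi}^{(\beta)P}_{m,n,\alpha}(\theta_1,\theta_2)\to 1$ as $m,n\to\infty$, and hence the proposed Wald-type tests are consistent for the one-sided alternatives also.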

Next, to study the contiguous power of the proposed Wald-type tests, we can consider the class of contiguous alternatives in eq. (17), but now with $(\Delta_1,\Delta_2)$ being such that $\psi(\theta_{1,n},\theta_{2,m})>0$ for all m,n. This can be equivalently (asymptotically) expressed in terms of the sequence of alternatives

$$H^{P}_{1,m,n}:\ \psi(\theta_{1,n},\theta_{2,m})=\sqrt{\frac{m+n}{mn}}\;d,\tag{25}$$

with $d=W_\psi(\Delta_1,\Delta_2)>0$. The following theorem then gives the asymptotic distribution of our Wald-type test statistic under the contiguous alternatives in eq. (25) and the corresponding asymptotic power.

Theorem 4.2

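Under the assumptions of Theorem 3.4, the asymptotic distribution of $\widetilde{T}_{m,n}^{(\beta)P}$ in eq. (24) under the sequence of contiguous alternatives in eq. (25) is normal with mean $d/\sqrt{\widetilde{\Sigma}_\beta(\theta_1,\theta_2)}$ and variance 1. Hence, the corresponding asymptotic contiguous power of the proposed Wald-type tests is given by

$$\widetilde{\pi}^{P}_\beta(\Delta_1,\Delta_2)=\widetilde{\pi}^{P}_\beta(d)=1-\Phi\left(z_{1-\alpha}-\frac{d}{\sqrt{\widetilde{\Sigma}_\beta(\theta_1,\theta_2)}}\right).$$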

Now we can also derive the robustness properties of the proposed Wald-type tests against one-sided alternatives by defining the corresponding statistical functional as

$$\widetilde{T}^{P}_\beta(G_1,G_2)=\frac{\psi\left(U_\beta(G_1),U_\beta(G_2)\right)}{\sqrt{\widetilde{\Sigma}_\beta(\theta_{10},\theta_{20})}}.$$

Then, under the assumptions of Theorem 3.5 with contamination only in the i-th sample at the point $x_i$ (i = 1,2), the first order influence function of the proposed Wald-type test statistic at the null hypothesis in eq. (23) is given by

$$IF^{(i)}(x_i;\widetilde{T}^{P}_\beta,F_{\theta_{10}},F_{\theta_{20}})=\frac{\Psi_i(\theta_{10},\theta_{20})^T\,IF(x_i;U_\beta,F_{\theta_{i0}})}{\sqrt{\widetilde{\Sigma}_\beta(\theta_{10},\theta_{20})}},$$

and the same for contamination in both the samples is given by

$$IF(x_1,x_2;\widetilde{T}^{P}_\beta,F_{\theta_{10}},F_{\theta_{20}})=\frac{Q_\beta(x_1,x_2)}{\sqrt{\widetilde{\Sigma}_\beta(\theta_{10},\theta_{20})}},$$

with $Q_\beta(\cdot,\cdot)$ being as defined in Theorem 3.5 (but a scalar now). Note that, unlike for the two-sided hypotheses, here the first order influence function of the proposed Wald-type test statistic is non-zero. Further, it is bounded whenever the IF of the corresponding MDPDE is bounded, i.e., only for β>0, and unbounded at β=0, implying the robustness of our proposal with β>0.

In order to derive the corresponding level and power influence functions, we consider the same set of hypothesis as in Section 3.2 but now with the restriction ψθ1,n,θ2,m>0 for all m,n under the alternative sequence, which is ensured by assuming WψΔ1,Δ2>0. Then, the following theorem gives the asymptotic distribution of the one-sided test statistics Tm,n(β)P˜ under the contiguous contaminated distributions.

Theorem 4.3

Under the assumptions of Theorem 3.6, the asymptotic distribution of $\widetilde{T}_{m,n}^{(\beta)P}$ under any contaminated contiguous alternative distributions $(D_1,D_2)$ is normal with mean $\widetilde{W}_\varepsilon/\sqrt{\widetilde{\Sigma}_\beta(\theta_1,\theta_2)}$ and variance 1, where $\widetilde{W}_\varepsilon$ is as defined in Theorem 3.6 for different $(D_1,D_2)$.

Using above theorem and following the arguments similar to those for the two-sided alternatives in Section 3.2, we can get the power influence functions for this case of one-sided alternatives also, which is presented in the next theorem.

Theorem 4.4

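Under the assumptions of Theorem 3.7, the power influence functions of our proposed Wald-type test functional $\widetilde{T}^{P}_\beta$ for testing the one-sided hypothesis in eq. (23) are given by

$$\begin{aligned}
PIF^{(1)}(x;\widetilde{T}^{P}_\beta,F_{\theta_{10}},F_{\theta_{20}})&=\sqrt{\frac{\omega}{\widetilde{\Sigma}_\beta(\theta_1,\theta_2)}}\;\phi\left(z_{1-\alpha}-\frac{W_\psi(\Delta_1,\Delta_2)}{\sqrt{\widetilde{\Sigma}_\beta(\theta_1,\theta_2)}}\right)\Psi_1(\theta_{10},\theta_{20})^T\,IF(x;U_\beta,F_{\theta_{10}}),\\
PIF^{(2)}(y;\widetilde{T}^{P}_\beta,F_{\theta_{10}},F_{\theta_{20}})&=\sqrt{\frac{1-\omega}{\widetilde{\Sigma}_\beta(\theta_1,\theta_2)}}\;\phi\left(z_{1-\alpha}-\frac{W_\psi(\Delta_1,\Delta_2)}{\sqrt{\widetilde{\Sigma}_\beta(\theta_1,\theta_2)}}\right)\Psi_2(\theta_{10},\theta_{20})^T\,IF(y;U_\beta,F_{\theta_{20}}),\\
PIF(x,y;\widetilde{T}^{P}_\beta,F_{\theta_{10}},F_{\theta_{20}})&=\frac{1}{\sqrt{\widetilde{\Sigma}_\beta(\theta_1,\theta_2)}}\;\phi\left(z_{1-\alpha}-\frac{W_\psi(\Delta_1,\Delta_2)}{\sqrt{\widetilde{\Sigma}_\beta(\theta_1,\theta_2)}}\right)W_\psi\big(IF(x;U_\beta,F_{\theta_{10}}),IF(y;U_\beta,F_{\theta_{20}})\big).
\end{aligned}$$

Note that the nature of these PIFs with respect to the contamination points x and y is exactly the same as in the case of two-sided alternatives, except for a multiplicative constant. In particular, they are bounded whenever the influence function of the MDPDE used is bounded, i.e., at β>0, implying the robustness of our proposal.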

Finally, we can get the level influence functions from the above theorem by substituting Δ1=Δ2=0 in the expressions of PIFs. Note that, in this case of one-sided hypothesis testing, the LIFs are not identically zero, but they are bounded only for β>0 implying again the level stability of our proposed Wald-type tests.

For illustration, we will again present the case of normal model with one-sided alternatives in the following example. Other motivating models with relevant data examples will be provided in the next section.

Example 4.1 (Comparing two Normal means against one-sided alternatives)

Let us again consider the two-sample problem under normal model with unknown and unequal variances as in Example 3.1, but now with the one-sided alternatives so that our target hypothesis is

$$H_0:\mu_1=\mu_2\quad\text{against}\quad H_1:\mu_1>\mu_2,\tag{26}$$

with the variance parameters $\sigma_1$ and $\sigma_2$ being unknown under both hypotheses. Considering the notation of Example 3.1, our proposed test statistic $\widetilde{T}_{m,n}^{(\beta)P}$ is then given by

$$\widetilde{T}_{m,n}^{(\beta)P}=\sqrt{\frac{mn}{m+n}}\left(1+\frac{\beta^2}{1+2\beta}\right)^{-3/4}\frac{{}^{(1)}\hat{\mu}_\beta-{}^{(2)}\hat{\mu}_\beta}{\sqrt{\omega\,{}^{(1)}\hat{\sigma}_\beta^2+(1-\omega)\,{}^{(2)}\hat{\sigma}_\beta^2}},\tag{27}$$

which has a standard normal asymptotic distribution under the null. Clearly this statistic also coincides with the corresponding classical Wald test statistic at β=0. Since the test is consistent at any fixed alternative, we consider the contiguous alternatives $H^{P}_{1,m,n}:\psi(\theta_1,\theta_2)=\mu_1-\mu_2=\sqrt{\frac{m+n}{mn}}\,d$ with d > 0, under which the test statistic is asymptotically normal with mean $\left(1+\frac{\beta^2}{1+2\beta}\right)^{-3/4}d\left[\omega\sigma_1^2+(1-\omega)\sigma_2^2\right]^{-1/2}$ and variance 1. The corresponding asymptotic contiguous power at different values of d and β with $\sigma_1^2=\sigma_2^2=1$ and ω = 0.5 (n = m) is presented in Table 2. Note that, as expected, this power decreases only slightly as β increases (note the similarity with Table 1).

Table 2: Asymptotic contiguous power of the proposed Wald-type tests at 95% level for testing equality of two normal means against one-sided alternatives as in Example 4.1.

           β
d      0       0.1     0.3     0.5     0.7     0.9     1
0      0.050   0.050   0.050   0.050   0.050   0.050   0.050
1      0.260   0.258   0.247   0.233   0.219   0.207   0.201
2      0.639   0.634   0.608   0.574   0.538   0.503   0.487
3      0.912   0.909   0.891   0.865   0.833   0.798   0.780
5      1.000   1.000   0.999   0.998   0.997   0.994   0.991
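The entries of Table 2 follow directly from the asymptotic normal distribution stated in this example, so they can be recomputed with a short script. The sketch below uses only the Python standard library; the function name `one_sided_contiguous_power` and its default arguments (σ1² = σ2² = 1, ω = 0.5, α = 0.05, matching the table's setting) are choices made here for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def one_sided_contiguous_power(d, beta, sigma1_sq=1.0, sigma2_sq=1.0,
                               omega=0.5, alpha=0.05):
    """Asymptotic contiguous power 1 - Phi(z_{1-alpha} - mean), where
    mean = (1 + beta^2/(1+2 beta))^(-3/4) * d / sqrt(omega s1^2 + (1-omega) s2^2)."""
    shrink = (1.0 + beta**2 / (1.0 + 2.0 * beta)) ** (-0.75)
    pooled_sd = (omega * sigma1_sq + (1.0 - omega) * sigma2_sq) ** 0.5
    mean = shrink * d / pooled_sd
    z = STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - STD_NORMAL.cdf(z - mean)


betas = [0, 0.1, 0.3, 0.5, 0.7, 0.9, 1]
for d in [0, 1, 2, 3, 5]:
    row = "  ".join(f"{one_sided_contiguous_power(d, b):.3f}" for b in betas)
    print(f"d={d}:  {row}")
```

The loss of power from increasing β enters only through the factor $(1+\beta^2/(1+2\beta))^{-3/4}$, which equals 1 at β = 0 and about 0.806 at β = 1.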

Further, the influence function of the proposed Wald-type test statistic in this case of one-sided alternatives simplifies to

$$IF^{(i)}(x_i;\widetilde{T}^{P}_\beta,F_{\theta_{10}},F_{\theta_{20}})=\left[\omega\sigma_{10}^2+(1-\omega)\sigma_{20}^2\right]^{-1/2}(1+2\beta)^{3/4}\,(x_i-\mu_{i0})\,e^{-\frac{\beta(x_i-\mu_{i0})^2}{2\sigma_{i0}^2}},$$

and

$$IF(x_1,x_2;\widetilde{T}^{P}_\beta,F_{\theta_{10}},F_{\theta_{20}})=\frac{(1+2\beta)^{3/4}}{\sqrt{\omega\sigma_{10}^2+(1-\omega)\sigma_{20}^2}}\left[(x_1-\mu_{10})\,e^{-\frac{\beta(x_1-\mu_{10})^2}{2\sigma_{10}^2}}-(x_2-\mu_{20})\,e^{-\frac{\beta(x_2-\mu_{20})^2}{2\sigma_{20}^2}}\right].$$

Note that these influence functions are square roots of the corresponding influence functions under two-sided alternatives in Example 2.2 (Continuation of Example 2.1), except for a multiplicative constant. Further, by the general theory developed above, the corresponding PIFs and LIFs in this case can also be shown to be constant multiples of the corresponding PIFs in the two-sided case presented in Example 2.2. Therefore, the boundedness of all these influence functions for the one-sided alternative will be similar to that presented in Figure 1a and Figure 3, i.e., bounded at β>0 and unbounded at β=0. These again imply the robustness of our proposal with β>0 over the classical Wald test at β=0.
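The boundedness claim can be made concrete with a small numerical sketch of the single-sample influence function above. The function names `if_one_sided` and `sup_abs_if`, and the default values μ0 = 0, σ0² = 1 and pooled variance 1, are illustrative assumptions; the closed-form supremum used for comparison follows from maximizing $t\,e^{-\beta t^2/(2\sigma_0^2)}$ at $t=\sigma_0/\sqrt{\beta}$.

```python
import math


def if_one_sided(x, beta, mu0=0.0, sigma0_sq=1.0, pooled_var=1.0):
    """(1+2 beta)^(3/4) (x - mu0) exp(-beta (x - mu0)^2 / (2 sigma0^2)) / sqrt(pooled_var)."""
    t = x - mu0
    return ((1.0 + 2.0 * beta) ** 0.75 / math.sqrt(pooled_var)) * t * math.exp(
        -beta * t * t / (2.0 * sigma0_sq)
    )


def sup_abs_if(beta, sigma0_sq=1.0, pooled_var=1.0):
    """Supremum of |IF| over x for beta > 0, attained at |x - mu0| = sigma0 / sqrt(beta)."""
    return ((1.0 + 2.0 * beta) ** 0.75 * math.sqrt(sigma0_sq) * math.exp(-0.5)
            / (math.sqrt(beta) * math.sqrt(pooled_var)))


grid = [i / 10 for i in range(0, 1001)]  # contamination points 0, 0.1, ..., 100
max_on_grid = {b: max(abs(if_one_sided(x, b)) for x in grid) for b in (0, 0.1, 0.5, 1)}
for b, value in max_on_grid.items():
    print(f"beta={b}: max |IF| on grid = {value:.4f}")
```

At β = 0 the maximum simply tracks the largest contamination point on the grid (the function is linear in x), whereas for each β > 0 it stays below the finite supremum returned by `sup_abs_if`.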

5 Real life applications

5.1 Poisson model for clinical trial: Adverse events data

In our first example we apply the proposed Wald-type tests with the Poisson model to the adverse event data from an asthma clinical trial conducted by Kerstjens et al. [15][Table 3]. In this two-phase randomized controlled trial, 912 patients having asthma and receiving inhaled glucocorticoids and LABAs were divided into the treatment and control groups of the two trials and were randomly assigned a total dose of 5 μg tiotropium (treatment group) or a suitable placebo (control group) once daily for 48 weeks. Kerstjens et al. [15] then investigated the effect of this combined treatment on the patients' lung function and exacerbations.

Table 3:

Number of different adverse events reported in Trial 2 of the Kerstjens et al. [15] clinical trial study.

Treatment91491912123131063376544320
Control10958201310106457512445221

Here we will consider the data on the 19 adverse events reported among the patients in Trial 2 of this study, presented in Table 3, which can be modeled by a Poisson distribution with mean θ. Note that the first two entries for both groups (corresponding to the events of asthma and decreased rate of peak expiratory flow) clearly stand out as outliers from the remaining observations. Hence, in the presence of these two observations, the MLEs of the Poisson parameters θ1 and θ2 in the treatment and control groups (15 and 18.47 respectively) turn out to be drastically different from the MLEs without them (8.82 and 9.65 respectively). However, the robust MDPDEs with larger β remain stable (see Table 4). Clearly, the average number of adverse events decreases from the control to the treatment group; but to check how significant this change is, one might be interested in testing the one-sided hypothesis

$$H_0:\theta_2=\theta_1\quad\text{against}\quad H_1:\theta_2>\theta_1.\tag{28}$$

We have applied our proposed Wald-type tests for this problem, as developed in Section 4, both to the full dataset and to the data after deleting the first two outliers from both groups; the resulting p-values are presented in Figure 4a. Clearly, the classical Wald test leads to completely different inference because of these outlying observations: its p-value is significant when they are included and non-significant without them (at 95% level). On the other hand, the proposed MDPDE based robust Wald-type tests with β>0 give stable results (accepting the null hypothesis) even in the presence of the outlying observations.
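The instability of the classical test can be illustrated from the MLEs quoted above. The following sketch covers only the β = 0 case, where $\Sigma_0(\theta)=\theta$ for the Poisson model; it takes ψ(θ1,θ2) = θ2 − θ1 and ω = m/(n+m), and treats the 19 event counts per group (17 after deleting the two outliers) as the samples. These are assumptions about the set-up made for illustration, and the resulting p-values need not reproduce Figure 4a exactly. The function name `one_sided_wald_poisson_mle` is hypothetical.

```python
import math
from statistics import NormalDist


def one_sided_wald_poisson_mle(theta1_hat, n, theta2_hat, m):
    """Classical (beta = 0) one-sided Wald statistic and p-value for
    H0: theta2 = theta1 against H1: theta2 > theta1 under Poisson models."""
    omega = m / (n + m)  # assumed weight of the first sample's variance term
    # psi = theta2 - theta1, so Psi_1 = -1 and Psi_2 = +1; Sigma_0(theta) = theta.
    sigma_tilde = omega * theta1_hat + (1.0 - omega) * theta2_hat
    stat = math.sqrt(n * m / (n + m)) * (theta2_hat - theta1_hat) / math.sqrt(sigma_tilde)
    p_value = 1.0 - NormalDist().cdf(stat)
    return stat, p_value


# MLEs quoted in the text: with and without the two outlying events.
stat_with, p_with = one_sided_wald_poisson_mle(15.0, 19, 18.47, 19)
stat_without, p_without = one_sided_wald_poisson_mle(8.82, 17, 9.65, 17)
print(f"with outliers:    T = {stat_with:.3f}, p = {p_with:.4f}")
print(f"without outliers: T = {stat_without:.3f}, p = {p_without:.4f}")
```

Under these assumptions the statistic drops from about 2.6 to about 0.8 once the two outlying events are removed, which matches the switch from a significant to a non-significant classical test described above.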

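Table 4:

MDPDEs of the Poisson parameter θ for the adverse events data in Table 3.

| Data | Group | β = 0 | β = 0.1 | β = 0.3 | β = 0.5 | β = 0.7 | β = 0.9 | β = 1 |
|---|---|---|---|---|---|---|---|---|
| With outliers | Treatment | 15.00 | 7.25 | 6.94 | 6.35 | 5.86 | 6.05 | 5.70 |
| With outliers | Control | 18.47 | 8.25 | 7.75 | 7.56 | 7.53 | 7.41 | 7.81 |
| Without outliers | Treatment | 8.82 | 7.47 | 6.44 | 6.20 | 6.14 | 5.58 | 6.58 |
| Without outliers | Control | 9.65 | 7.97 | 7.63 | 7.61 | 7.56 | 7.68 | 7.75 |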
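To make the contrast between the MLE and the MDPDE concrete, the following is a minimal sketch of how an MDPDE of a Poisson mean could be computed by direct minimization of the density power divergence objective of Basu et al. [4]. The function names, the grid-plus-golden-section optimizer and the count data are illustrative assumptions; the counts are hypothetical and are not the Table 3 data, so the printed values are not expected to match Table 4.

```python
import math


def poisson_pmf(x, theta):
    """Poisson probability mass at x for mean theta, evaluated on the log scale."""
    return math.exp(x * math.log(theta) - theta - math.lgamma(x + 1))


def dpd_objective(theta, data, beta):
    """Empirical density power divergence objective for the Poisson(theta) model."""
    # The model term is an infinite sum; truncate it far in the upper tail.
    upper = int(theta + 12.0 * math.sqrt(theta) + 30)
    model_term = sum(poisson_pmf(x, theta) ** (1.0 + beta) for x in range(upper + 1))
    data_term = sum(poisson_pmf(x, theta) ** beta for x in data) / len(data)
    return model_term - (1.0 + 1.0 / beta) * data_term


def poisson_mdpde(data, beta, grid_size=400, tol=1e-8):
    """MDPDE of the Poisson mean; beta = 0 is taken to be the MLE (sample mean)."""
    if beta == 0:
        return sum(data) / len(data)
    # A coarse grid search guards against local minima created by outliers.
    lo, hi = 0.05, float(max(data))
    step = (hi - lo) / grid_size
    grid = [lo + i * step for i in range(grid_size + 1)]
    best = min(grid, key=lambda t: dpd_objective(t, data, beta))
    # Golden-section refinement inside the bracket around the best grid point.
    a, b = max(lo, best - step), min(hi, best + step)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if dpd_objective(c, data, beta) < dpd_objective(d, data, beta):
            b = d
        else:
            a = c
    return (a + b) / 2.0


# Hypothetical counts (NOT the Table 3 data): thirteen typical values and two outliers.
counts = [70, 40, 9, 12, 8, 6, 10, 7, 5, 11, 9, 8, 6, 7, 10]
for beta in (0.0, 0.3, 0.5, 1.0):
    print(f"beta = {beta:.1f}: estimate = {poisson_mdpde(counts, beta):.2f}")
```

In this sketch the estimate at β = 0 is pulled towards the two large counts, whereas the estimates at positive β should stay close to the bulk of the data, which is the qualitative pattern reported in Table 4.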

5.2 Poisson model for experimental trial: Drosophila data

We next consider another application of the Poisson model, with data from a controlled experimental trial with Drosophila flies that produces occasional spurious counts. The dataset contains two independent samples of the numbers of recessive lethal mutations observed among the daughters of male flies that were exposed either to a certain degree of a chemical to be screened (treatment group) or to control conditions. This dataset has previously been analyzed by many statisticians, including Woodruff et al. [17], Simpson [16] and Basu et al. [3], who have shown that the response data can be modeled by a Poisson distribution, but that there are two outlying observations in one sample that affect the likelihood based inference and hence the classical Wald test. See Basu et al. [3, Table 7] for the dataset and the MDPDEs of the Poisson parameters.

Here, we apply the proposed Wald-type tests for comparing the Poisson parameters of the two samples, say θ1 and θ2, by testing the one-sided hypothesis in eq. (28). The resulting p-values are presented in Figure 4b. Clearly, in the presence of outliers, the classical Wald test rejects the null hypothesis, indicating that the average number of mutations is significantly larger for the second sample, which is the opposite of the true inference obtained after removing these outliers from the second sample. But the proposed MDPDE based Wald-type tests with β ≥ 0.1 produce robust results even in the presence of outliers, accepting the null hypothesis.

Figure 4: P-values of the proposed Wald-type tests for the real data examples with outliers (solid line) and without outliers (dotted line).

5.3 Normal model for clinical trial: Infant platelet count data

We now present another clinical trial example, from Karpatkin et al. [18], to illustrate the applications under the normal model. This clinical trial was conducted to study whether the infant platelet count can be increased by giving steroids to mothers with autoimmune thrombocytopenia during pregnancy. The study consists of 19 mothers, 12 of whom were given steroids (treatment group) and 7 of whom were not (control group); the corresponding infant platelet counts (in thousands per mm³) after delivery are given in Table 5. These can be modeled by normal distributions with means $\theta_1$, $\theta_2$ and variances $\sigma_1^2$, $\sigma_2^2$ for the treatment and control groups, respectively. The primary research problem can then be addressed by testing the one-sided hypothesis in eq. (28) with $\sigma_1^2$ and $\sigma_2^2$ being unknown.

Table 5:

Infant platelet counts after delivery (in thousands per mm³) in the Karpatkin et al. [18] clinical trial study.

Treatment: 120, 124, 215, 90, 67, 126, 95, 190, 180, 135, 399, 65
Control: 12, 20, 112, 32, 60, 40, 18

The p-values for this testing problem obtained by applying the proposed Wald-type tests, as described in Example 4.1, are presented in Figure 4c for different β ≥ 0. One can easily observe that there is a large outlying value of 399 (thousands) in the treatment group that affects the classical Wald test (at β = 0). However, our MDPDE based proposal with β > 0 produces stable p-values, ignoring the effect of the outlying observation.
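As an illustration of why the outlying value 399 matters, the sketch below computes the MDPDE of the normal mean and standard deviation for the treatment group of Table 5 through a simple fixed-point iteration on the MDPDE estimating equations. The iteration scheme, the starting values (median and scaled MAD) and the function name are assumptions made for this sketch, not the computational method used by the authors.

```python
import math
import statistics


def normal_mdpde(data, beta, max_iter=500, tol=1e-10):
    """Fixed-point iteration for the MDPDE of (mean, sd) under a normal model.

    With weights w_i = exp(-beta * (x_i - mu)^2 / (2 sigma^2)), the estimating
    equations reduce to a weighted mean and a bias-corrected weighted variance.
    beta = 0 gives the MLE (sample mean and divide-by-n standard deviation).
    """
    n = len(data)
    mu = statistics.median(data)
    sigma = 1.4826 * statistics.median([abs(x - mu) for x in data])
    # Consistency correction that comes from the integral term of the equations.
    shrink = n * beta / (1.0 + beta) ** 1.5
    for _ in range(max_iter):
        w = [math.exp(-beta * (x - mu) ** 2 / (2.0 * sigma ** 2)) for x in data]
        new_mu = sum(wi * x for wi, x in zip(w, data)) / sum(w)
        denom = sum(w) - shrink
        if denom <= 0:
            raise ValueError("fixed-point update broke down; try a smaller beta")
        new_sigma = math.sqrt(
            sum(wi * (x - new_mu) ** 2 for wi, x in zip(w, data)) / denom
        )
        converged = abs(new_mu - mu) + abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    return mu, sigma


# Treatment group of Table 5 (infant platelet counts, in thousands per mm^3).
treatment = [120, 124, 215, 90, 67, 126, 95, 190, 180, 135, 399, 65]
for beta in (0.0, 0.3, 0.5):
    mu_hat, sigma_hat = normal_mdpde(treatment, beta)
    print(f"beta = {beta:.1f}: mean = {mu_hat:.1f}, sd = {sigma_hat:.1f}")
```

At β = 0 the sketch returns the sample mean 150.5, which is inflated by the value 399; for positive β that observation receives a weight close to zero, so both the mean and the standard deviation estimates move towards the values supported by the remaining eleven observations.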

5.4 Normal model for health study: Hair Zn content data

Two-sample tests under the normal model have many possible applications, from which we now present a health study examining the impact of a polluted urban environment on individual health in Sri Lanka. The dataset consists of the zinc (Zn) content of the hair of two independent samples taken from urban (polluted) and rural (unpolluted) Sri Lanka, and our target is to check whether the Zn content is higher for residents of the polluted urban area, impacting their health conditions. The dataset was presented in Basu et al. [5, Table 6], and it has been shown there that each sample can be modeled by a normal distribution with mean $\theta_i$ and variance $\sigma_i^2$ (i = 1, 2 for the rural and urban groups, respectively), except for two possible outliers. There is one outlier in each of the samples that affects the MLE based inference when testing the targeted hypothesis in eq. (28), comparing $\theta_1$ and $\theta_2$ with unknown $\sigma_1^2$ and $\sigma_2^2$.

We have applied the proposed MDPDE based Wald-type tests to this problem following Example 4.1, and the resulting p-values are presented in Figure 4d. Clearly, the significant increase of the zinc content in urban residents cannot be identified by the classical Wald test in the presence of outliers, but our proposal with β ≥ 0.1 gives stable and correct inference, ignoring the effect of the outliers.

5.5 Normal model for quality control: Cloth manufacturing data

Our third and final example with the normal model is in the context of quality control, based on data from the Levi-Strauss clothing manufacturing plant. The dataset consists of 22 measurements of run-up (a percentage measure of wastage in cloth) for each of two particular mills supplying cloth to the plant [5, Table 1]. To control the quality of the cloth, the plant wants to test for the consistency of the run-up measures from the two mills. Since the sample from each mill can be modeled by a normal distribution with mean $\theta_i$ and variance $\sigma_i^2$ (i = 1, 2), the objective is then to test the two-sided hypothesis

$$H_0: \theta_1 = \theta_2 \quad \text{against} \quad H_1: \theta_1 \neq \theta_2, \tag{29}$$

with $\sigma_1^2$ and $\sigma_2^2$ being unknown in both cases. However, as illustrated in [5], the dataset contains 3 potential outliers that make the MLE based inference highly non-robust. Hence, the classical Wald test rejects the null hypothesis in the presence of outliers, whereas it accepts the null hypothesis after the outliers are removed. When we apply the proposed MDPDE based Wald-type tests, following the description in Example 3.1, the corresponding p-values (reported in Figure 4e) become highly stable for β ≥ 1.5, rejecting the null hypothesis even in the presence of the outliers.

5.6 Exponential model for reliability testing: Components life-time data

We end this section with an example of the exponential model used in reliability testing to compare two sets of product lifetimes. We use the (simulated) data from Perng [19], which consist of the lifetimes (in thousands of hours) of a particular electronic component produced by two different processes (see Table 6). Each sample can then be modeled by an exponential distribution with mean $\theta_i$ (i = 1, 2). Our objective in reliability testing of the manufacturing process is to test whether the lifetimes for both processes have the same distribution, i.e., whether $\theta_1 = \theta_2$ against the two-sided alternative as in the hypothesis in eq. (29). It has been observed that there is no significant difference between the distributions of the two processes, and so the null hypothesis should be accepted by any standard test.

Table 6:

Lifetimes (in thousands of hours) of a particular electronic component produced by two different processes [19].

Process 1: 0.044, 0.134, 0.142, 0.158, 0.216, 0.625, 0.649, 0.658, 1.062, 1.140, 1.159, 1.238
Process 2: 0.060, 0.174, 0.237, 0.272, 0.335, 0.391, 0.670, 0.902, 1.543, 1.615, 2.013, 2.309

Since there are no outliers in this dataset, in order to study the robustness aspect of our proposal we add one outlying value of 20 (assuming a decimal point is misplaced by one digit from 2.0) to the second sample. The resulting p-values obtained by the proposed Wald-type tests, for both the pure data and the data with this artificial outlier, are presented in Figure 4f for different β. Clearly, the classical Wald test changes drastically, rejecting the null hypothesis after the insertion of only one outlying observation, but our proposed Wald-type tests with β ≥ 0.1 remain stable and still accept the null hypothesis in the presence of the outlier.
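The sensitivity of the classical test described above can be checked numerically with a short sketch. It evaluates a classical (β = 0) Wald statistic for two exponential means on the Table 6 data, with and without the artificial value 20. Two choices here are assumptions made for illustration: the null variance is evaluated at the pooled sample mean, and the outlier is appended to the second sample rather than replacing an existing value.

```python
import math

process_1 = [0.044, 0.134, 0.142, 0.158, 0.216, 0.625,
             0.649, 0.658, 1.062, 1.140, 1.159, 1.238]
process_2 = [0.060, 0.174, 0.237, 0.272, 0.335, 0.391,
             0.670, 0.902, 1.543, 1.615, 2.013, 2.309]


def classical_wald_exponential(x, y):
    """Classical two-sample Wald test for equality of two exponential means.

    The MLE of an exponential mean is the sample mean and its asymptotic
    variance is theta^2, here evaluated at the pooled mean (an assumption).
    """
    n, m = len(x), len(y)
    theta_1 = sum(x) / n
    theta_2 = sum(y) / m
    theta_0 = (sum(x) + sum(y)) / (n + m)
    stat = (m * n / (m + n)) * (theta_1 - theta_2) ** 2 / theta_0 ** 2
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


stat_pure, p_pure = classical_wald_exponential(process_1, process_2)
stat_out, p_out = classical_wald_exponential(process_1, process_2 + [20.0])
print(f"pure data:    W = {stat_pure:.3f}, p-value = {p_pure:.4f}")
print(f"with outlier: W = {stat_out:.3f}, p-value = {p_out:.4f}")
```

Under these assumptions the p-value is well above 0.05 for the pure data and well below 0.05 once the single outlier is added, in line with the behaviour of the classical Wald test described in the text.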

6 Simulation study and the choice of tuning parameter β

Finally, to examine the finite sample performances of our Wald-type tests, we have performed several simulation studies with all the models considered in the previous section for real datasets. However, noting the similarity of the results for different models, for brevity, here we will report the results from only one simulation study under normal model with two-sided alternatives.

We simulate 1000 pairs of samples, each of size n = 50, independently drawn from $N(\theta_i, 1)$ distributions (i = 1, 2) and perform the proposed Wald-type tests for testing $H_0: \theta_1 = \theta_2$ against the two-sided alternative $H_1: \theta_1 \neq \theta_2$, first assuming the variances to be known (both equal to 1) and then for unknown and possibly unequal variances, following Examples 2.1 and 3.8, respectively. Then, we compute the empirical sizes and powers of the proposed tests under such pure data over the 1000 iterations, where for the size calculation we have taken $\theta_1 = \theta_2 = 0$ and for the power calculation $\theta_1 = 0$, $\theta_2 = 1$. Next, to study the robustness of these tests, we contaminate 100ε% of the second sample in each iteration (for ε = 0.1, 0.15, 0.2) by observations from a $N(\theta_c, 1)$ distribution and repeat the above simulation to compute the empirical sizes and powers under contamination. We have taken $\theta_c = 3$ and $\theta_c = -3$ for studying the robustness of size and power, respectively. Note that these contamination distributions are not very far from the corresponding true distributions and hence generate reasonably common practical situations. The resulting empirical sizes and powers are reported in Figure 5.
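A reduced version of this size study for the known-variance case can be sketched as follows. The sketch assumes that the Wald-type statistic takes the form mn/(m+n) times the squared difference of the two MDPDEs divided by the asymptotic variance of the MDPDE of a normal mean, $(1+\beta)^3/(1+2\beta)^{3/2}$ for unit variance, compared with the 95% quantile of a chi-square distribution with one degree of freedom. The fixed-point estimator, the use of exactly εn contaminated observations, the number of replications and the seed are all illustrative choices.

```python
import math
import random
import statistics

CHI2_1_95 = 3.841458820694124  # 95% quantile of chi-square with 1 d.f.


def mdpde_normal_mean(data, beta, sigma=1.0, max_iter=200, tol=1e-9):
    """MDPDE of a normal mean with known sigma via iteratively reweighted means."""
    if beta == 0:
        return sum(data) / len(data)
    mu = statistics.median(data)
    for _ in range(max_iter):
        w = [math.exp(-beta * (x - mu) ** 2 / (2.0 * sigma ** 2)) for x in data]
        new_mu = sum(wi * x for wi, x in zip(w, data)) / sum(w)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


def wald_type_rejects(x, y, beta, sigma=1.0):
    """Two-sided Wald-type test of equal means at nominal level 5%."""
    n, m = len(x), len(y)
    avar = sigma ** 2 * (1.0 + beta) ** 3 / (1.0 + 2.0 * beta) ** 1.5
    diff = mdpde_normal_mean(x, beta, sigma) - mdpde_normal_mean(y, beta, sigma)
    return (m * n / (m + n)) * diff ** 2 / avar > CHI2_1_95


def empirical_size(beta, eps, theta_c=3.0, n=50, reps=500, seed=1):
    """Rejection rate under theta_1 = theta_2 = 0 with contamination in sample 2."""
    rng = random.Random(seed)
    n_bad = round(eps * n)
    rejections = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(theta_c, 1.0) for _ in range(n_bad)]
        y += [rng.gauss(0.0, 1.0) for _ in range(n - n_bad)]
        rejections += wald_type_rejects(x, y, beta)
    return rejections / reps


size_pure = empirical_size(beta=0.0, eps=0.0)
size_classical = empirical_size(beta=0.0, eps=0.1)
size_robust = empirical_size(beta=0.5, eps=0.1)
print(f"classical, pure data:         {size_pure:.3f}")
print(f"classical, 10% contamination: {size_classical:.3f}")
print(f"beta = 0.5, 10% contamination: {size_robust:.3f}")
```

With 500 replications the Monte Carlo error is larger than in the study reported here, so only the qualitative ordering of the three rejection rates should be read from this sketch.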

Figure 5: Empirical sizes and powers of the proposed Wald-type tests for testing equality of two normal means, for both the known and unknown variance cases at sample size n = 50, under pure data (solid line) and with contamination of 10% (dash-dotted line), 15% (dotted line) and 20% (dashed line).

It can be easily observed from Figure 5 that the size and power of the proposed Wald-type tests under pure data change (increase and decrease, respectively) only very slightly with increasing β, but their stability increases significantly. In particular, under contamination, both the size and the power of the tests near β = 0, the classical Wald test, are heavily affected, whereas larger positive values of β make these measures much more stable in both the known and unknown variance cases. However, for known (and correctly specified) variances we get highly stable results even for β as low as 0.3 or 0.4, whereas we need β ≥ 0.5 or 0.6 for similar stability in the case of unknown variances. This is intuitively expected, since under the present contamination schemes the variance estimates also change, and we need the stronger downweighting provided by larger values of β to get stable inference overall.

To further illustrate the advantages of our proposed tests compared to the non-parametric Wilcoxon rank-sum test, we have repeated the above simulation exercise to derive the corresponding empirical sizes and powers of the Wilcoxon test. This Wilcoxon test is equivalent to the two-sample Mann-Whitney test and is the most commonly used default method for robust two-sample tests of hypotheses. Its empirical sizes and powers are reported in Table 7, along with those of the classical Wald test and the proposed MDPDE based Wald-type tests at some particular β, assuming equal but unknown variances. It is evident from Table 7 that the non-parametric Wilcoxon test is slightly more robust than the classical Wald test, but it still has a high degree of non-robustness under higher contamination levels. Our proposed Wald-type tests with larger β > 0 perform much more robustly than both the Wald test and the Wilcoxon test under contaminated data and perform very competitively under pure data. These observations appear to indicate that, when the parametric model is even approximately correct, our proposed tests indeed serve as very useful and significantly improved simple robust alternatives to the existing likelihood based or non-parametric solutions for the two-sample problems arising frequently in biostatistics and many other disciplines.

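Table 7:

Empirical sizes and powers for the classical Wald test, the non-parametric Wilcoxon rank-sum test and the proposed MDPDE based Wald-type tests at different β under pure and contaminated data (assuming equal but unknown variances). The last five columns refer to the MDPDE based Wald-type tests.

| | Cont. prop. | Wald test | Wilcoxon test | β = 0.1 | β = 0.3 | β = 0.5 | β = 0.7 | β = 1 |
|---|---|---|---|---|---|---|---|---|
| Size | 0% | 0.049 | 0.047 | 0.049 | 0.053 | 0.052 | 0.052 | 0.058 |
| | 10% | 0.209 | 0.116 | 0.163 | 0.104 | 0.079 | 0.069 | 0.064 |
| | 15% | 0.466 | 0.248 | 0.374 | 0.228 | 0.155 | 0.106 | 0.075 |
| | 20% | 0.652 | 0.395 | 0.556 | 0.408 | 0.289 | 0.187 | 0.119 |
| Power | 0% | 1.000 | 1.000 | 1.000 | 1.000 | 0.999 | 0.993 | 0.979 |
| | 10% | 0.628 | 0.908 | 0.747 | 0.904 | 0.961 | 0.970 | 0.959 |
| | 15% | 0.292 | 0.728 | 0.416 | 0.681 | 0.859 | 0.926 | 0.937 |
| | 20% | 0.138 | 0.492 | 0.209 | 0.403 | 0.641 | 0.793 | 0.874 |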
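For readers who want to see the non-parametric comparator in concrete form, the sketch below implements the Wilcoxon rank-sum (Mann-Whitney) test through its large-sample normal approximation and applies it to the component lifetimes of Section 5.6 (the pure data, without the artificial outlier). It uses no tie or continuity correction, so the p-value may differ slightly from that of exact implementations in statistical packages; the function name is an illustrative choice.

```python
import math


def wilcoxon_rank_sum(x, y):
    """Two-sided Mann-Whitney test via the normal approximation (no corrections)."""
    n, m = len(x), len(y)
    # U counts the pairs in which the first-sample value exceeds the second.
    u = sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    mean_u = n * m / 2.0
    sd_u = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u, z, p_value


process_1 = [0.044, 0.134, 0.142, 0.158, 0.216, 0.625,
             0.649, 0.658, 1.062, 1.140, 1.159, 1.238]
process_2 = [0.060, 0.174, 0.237, 0.272, 0.335, 0.391,
             0.670, 0.902, 1.543, 1.615, 2.013, 2.309]

u_stat, z_stat, p_val = wilcoxon_rank_sum(process_1, process_2)
print(f"U = {u_stat:.1f}, z = {z_stat:.3f}, p-value = {p_val:.3f}")
```

Because the statistic depends on the data only through the ranks, a single extreme value can change U by at most the size of the other sample, which explains why the Wilcoxon test is less affected by isolated outliers than the classical Wald test but can still break down when a sizeable proportion of one sample is shifted.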

Throughout all our examples and simulations above, we have noticed that the tuning parameter β controls the trade-off between the robustness of the proposed Wald-type tests and their asymptotic contiguous power under pure data. So, we need to choose β properly in any practical application. In particular, we note that, in most of the example models, the loss in power is not significant at small positive β, whereas we get highly robust inference for β ≥ 0.3 (except for a few cases with very high contamination, where we may need β ≥ 0.4 or 0.5). Therefore, an empirical suggestion for the choice of β in any application where some contamination is suspected is the range β ∈ [0.3, 0.5], which yields robust inference without significant loss in power.
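The cost of moving away from β = 0 under pure data can be quantified in the simplest case. For a normal mean with known variance, the asymptotic relative efficiency of the MDPDE with respect to the MLE is $(1+2\beta)^{3/2}/(1+\beta)^3$ (see Basu et al. [4]). The sketch below tabulates this quantity as a rough guide to the power loss at different β; treating efficiency as a proxy for power, and the function name, are assumptions of this illustration.

```python
def mdpde_relative_efficiency(beta):
    """Asymptotic relative efficiency of the MDPDE of a normal mean (known variance)."""
    return (1.0 + 2.0 * beta) ** 1.5 / (1.0 + beta) ** 3


for beta in (0.0, 0.1, 0.3, 0.5, 0.7, 1.0):
    print(f"beta = {beta:.1f}: efficiency = {mdpde_relative_efficiency(beta):.3f}")
```

In this model the efficiency stays above 0.8 throughout the range [0.3, 0.5] and drops to about 0.65 at β = 1, which is consistent with the modest power loss seen in Table 7 for moderate β.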

Although this ad hoc empirical choice of β works well enough for most practical datasets susceptible to outliers, many practitioners will prefer a data-driven choice of β when they have no idea of the level of contamination in the dataset, since such a choice might produce a better trade-off. In this respect, we note that the performance of the proposed Wald-type tests depends directly on that of the MDPDE (with tuning parameter β) used in constructing the test statistics. In particular, the asymptotic contiguous power of the proposed test has the same nature as the asymptotic efficiency of the corresponding MDPDE, whereas all the robustness measures of our tests depend directly on the robustness of the MDPDE through its influence function. So, a suitable data-driven choice of β for our Wald-type test statistics can equivalently be obtained by adjusting the trade-off between the efficiency and the robustness of the MDPDE used. For this latter problem, Warwick and Jones [20] proposed to minimize an estimator of the MSE of the MDPDE to choose the optimum β. Based on the first sample $X_1, \ldots, X_n$, they proposed to minimize the estimated MSE

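$$\widehat{MSE}_n(\beta) = \left({}^{(1)}\hat{\theta}_\beta - \theta_\beta^P\right)^T \left({}^{(1)}\hat{\theta}_\beta - \theta_\beta^P\right) + \frac{1}{n}\,\mathrm{Trace}\left[\hat{J}_{\beta,n}^{-1}\,\hat{K}_{\beta,n}\,\hat{J}_{\beta,n}^{-1}\right] \tag{30}$$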

over β, where $\theta_\beta^P$ is a pilot estimator of the target parameter and $\hat{J}_{\beta,n}$ and $\hat{K}_{\beta,n}$ are estimators of the matrices $J_\beta$ and $K_\beta$, respectively, which can be easily obtained from their expressions by replacing θ with the MDPDE and the integrations with sample means.

Although there is no direct choice for $\theta_\beta^P$, Warwick and Jones [20] suggested, based on extensive simulation studies, that the MDPDE with β = 1 can serve the purpose well for the i.i.d. set-up, and we will stick to that suggestion for the present case as well (the non-i.i.d. cases have been studied in [21, 22]). However, the problem in the present two-sample case is that the optimum β obtained by minimizing $\widehat{MSE}_n(\beta)$ based on the first sample may not be the same as that obtained for the second sample, due to possibly different levels of contamination. As a standard solution, we propose minimizing the total estimated MSE, i.e., the sum of the MSE estimates based on the two samples separately, over β ∈ [0, 1] to obtain the optimum choice of the tuning parameter for the present two-sample testing problem.
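The two-sample selection rule described above can be sketched for the normal mean model with known unit variance. In this sketch the variance part of the estimated MSE uses the closed-form asymptotic variance $(1+\beta)^3/(1+2\beta)^{3/2}$ of the model instead of sample-based estimates of $J_\beta$ and $K_\beta$, the search is over a grid of step 0.05, and the data are hypothetical; all three are simplifying assumptions, and the function names are illustrative.

```python
import math
import statistics


def mdpde_normal_mean(data, beta, sigma=1.0, max_iter=200, tol=1e-9):
    """MDPDE of a normal mean with known sigma via iteratively reweighted means."""
    if beta == 0:
        return sum(data) / len(data)
    mu = statistics.median(data)
    for _ in range(max_iter):
        w = [math.exp(-beta * (x - mu) ** 2 / (2.0 * sigma ** 2)) for x in data]
        new_mu = sum(wi * x for wi, x in zip(w, data)) / sum(w)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


def estimated_mse(data, beta, pilot, sigma=1.0):
    """Squared distance to the pilot estimate plus asymptotic variance over n."""
    avar = sigma ** 2 * (1.0 + beta) ** 3 / (1.0 + 2.0 * beta) ** 1.5
    bias = mdpde_normal_mean(data, beta, sigma) - pilot
    return bias ** 2 + avar / len(data)


def choose_beta(sample_1, sample_2, sigma=1.0):
    """Minimize the sum of the two estimated MSEs over a grid on [0, 1]."""
    grid = [i / 20.0 for i in range(21)]
    # Pilot estimates: the MDPDE with beta = 1 in each sample, as suggested in [20].
    pilot_1 = mdpde_normal_mean(sample_1, 1.0, sigma)
    pilot_2 = mdpde_normal_mean(sample_2, 1.0, sigma)
    return min(
        grid,
        key=lambda b: estimated_mse(sample_1, b, pilot_1, sigma)
        + estimated_mse(sample_2, b, pilot_2, sigma),
    )


# Hypothetical data: a clean sample, and the same values with two gross outliers.
clean = [-1.6, -1.1, -0.8, -0.6, -0.4, -0.3, -0.2, -0.1, 0.0,
         0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.9, 1.2, 1.5]
contaminated = clean + [9.0, 10.0]

beta_hat = choose_beta(clean, contaminated)
print(f"selected beta = {beta_hat:.2f}")
```

With the two gross outliers in the second sample, the squared-bias term at β = 0 dominates the total estimated MSE, so the rule moves away from β = 0; how far it moves depends on the data and on the grid.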

Figure 6: Histograms of the optimally chosen tuning parameter β under normal models with different contamination levels.

We have implemented this proposal for the above simulation study with the normal model to check its effectiveness. Figure 6 presents the histograms of the 1000 selected optimum values of β following this proposal for the normal model with known and equal variances, under the simulation scheme used for studying size stability above (in Figure 5). Clearly, the mode of these optimum values of β shifts from 0 to 1 as the contamination proportion increases, yielding the expected trade-off between power and robustness based on the level of contamination.

7 Concluding remarks

In this paper, we have considered the problem of testing with two independent samples of i.i.d. observations and proposed a class of robust Wald-type tests for both simple and composite hypothesis testing. These Wald-type tests are constructed using the robust minimum density power divergence estimators of the underlying parameters in each sample. The asymptotic and robustness properties of the proposed Wald-type tests have been discussed along with their applications to several important real-life problems like clinical trial, medical experiment, reliability testing and many more.

Our focus in this paper has been on robust two sample tests. Nonparametric methods and robust methods share some common goals, yet robust methods are inherently different from nonparametric methods as they are essentially parametric, although they allow the parametric model to be only approximately true. It is well known that when a parametric model does hold, the parametric procedures are much more efficient compared to the nonparametric methods. However, when the parametric model holds only approximately, the robust methods are still often substantially more efficient in doing inference about the major component of the data generating distribution compared to nonparametric methods. This has been amply demonstrated by the simulations reported in Table 7. And while parametric models may never “exactly” fit the data, they often provide reasonable “approximate” fits to many practical data sets. So we expect that our method will have a better scope of application in real problems compared to classical parametric methods, and will have greater efficiency in many cases compared to nonparametric methods; in either case, our method will have better robustness properties.

Although we have discussed all possible types of general two-sample hypotheses, in this paper we have restricted our attention to the cases where each of the two independent samples is identically distributed. The natural extension of this work will be to develop robust tests for hypotheses involving two independent samples from non-homogeneous populations; this also has many practical applications, including comparing the regression lines between two groups of patients in a fixed design clinical trial. Also, one could further explore the possibility of robust hypothesis testing using the minimum density power divergence estimators for two paired samples or for more than two samples. We hope to pursue some of these possible extensions in our future research.

8 Proof of Results

8.1 Proof of Theorem 2.1

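Using the asymptotic distribution of $\sqrt{n}\,({}^{(1)}\hat{\theta}_\beta - \theta_1)$ and $\sqrt{n}\,({}^{(2)}\hat{\theta}_\beta - \theta_2)$, we have

$$\sqrt{\frac{mn}{m+n}}\left({}^{(1)}\hat{\theta}_\beta - \theta_1\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(0_p,\ \omega\,\Sigma_\beta(\theta_1)\right)$$

and

$$\sqrt{\frac{mn}{m+n}}\left({}^{(2)}\hat{\theta}_\beta - \theta_2\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(0_p,\ (1-\omega)\,\Sigma_\beta(\theta_2)\right).$$

Hence, under $H_0: \theta_1 = \theta_2 = \theta_0$, we get

$$\sqrt{\frac{mn}{m+n}}\left({}^{(1)}\hat{\theta}_\beta - {}^{(2)}\hat{\theta}_\beta\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(0_p,\ \Sigma_\beta(\theta_0)\right).$$

Further, under $H_0$, ${}^{(0)}\hat{\theta}_\beta \xrightarrow{P} \theta_0$ as $m+n \to \infty$. Then the theorem follows using the continuity of the matrix $\Sigma_\beta(\theta)$. □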

8.2 Proof of Theorem 2.2

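Note that ${}^{(0)}\hat{\theta}_\beta \xrightarrow[n,m\to\infty]{P} \theta_3$, and hence the asymptotic distribution of $l_{{}^{(0)}\hat{\theta}_\beta,\beta}({}^{(1)}\hat{\theta}_\beta, {}^{(2)}\hat{\theta}_\beta)$ is the same as that of $l_{\theta_3,\beta}({}^{(1)}\hat{\theta}_\beta, {}^{(2)}\hat{\theta}_\beta)$. Now, a suitable Taylor series expansion leads to

$$
\begin{aligned}
& l_{\theta_3,\beta}\left({}^{(1)}\hat{\theta}_\beta, {}^{(2)}\hat{\theta}_\beta\right) - l_{\theta_3,\beta}(\theta_1,\theta_2) \\
&\quad = \left({}^{(1)}\hat{\theta}_\beta - \theta_1\right)^T \frac{\partial}{\partial\theta_1} l_{\theta_3,\beta}(\theta_1,\theta_2) + \left({}^{(2)}\hat{\theta}_\beta - \theta_2\right)^T \frac{\partial}{\partial\theta_2} l_{\theta_3,\beta}(\theta_1,\theta_2) \\
&\qquad + o_P\left(\left\| {}^{(1)}\hat{\theta}_\beta - \theta_1 \right\|_2\right) + o_P\left(\left\| {}^{(2)}\hat{\theta}_\beta - \theta_2 \right\|_2\right) \\
&\quad = 2\left({}^{(1)}\hat{\theta}_\beta - \theta_1\right)^T \Sigma_\beta(\theta_3)^{-1}(\theta_1 - \theta_2) - 2\left({}^{(2)}\hat{\theta}_\beta - \theta_2\right)^T \Sigma_\beta(\theta_3)^{-1}(\theta_1 - \theta_2) \\
&\qquad + o_P\left(\left\| {}^{(1)}\hat{\theta}_\beta - \theta_1 \right\|_2\right) + o_P\left(\left\| {}^{(2)}\hat{\theta}_\beta - \theta_2 \right\|_2\right) \\
&\quad = 2\left[\left({}^{(1)}\hat{\theta}_\beta - {}^{(2)}\hat{\theta}_\beta\right) - (\theta_1 - \theta_2)\right]^T \Sigma_\beta(\theta_3)^{-1}(\theta_1 - \theta_2) \\
&\qquad + o_P\left(\left\| {}^{(1)}\hat{\theta}_\beta - \theta_1 \right\|_2\right) + o_P\left(\left\| {}^{(2)}\hat{\theta}_\beta - \theta_2 \right\|_2\right).
\end{aligned}
$$

Then, the theorem follows from the above expression by noting that

$$\sqrt{\frac{mn}{m+n}}\left[\left({}^{(1)}\hat{\theta}_\beta - {}^{(2)}\hat{\theta}_\beta\right) - (\theta_1 - \theta_2)\right] \xrightarrow[n,m\to\infty]{\mathcal{L}} N\left(0,\ \omega\,\Sigma_\beta(\theta_1) + (1-\omega)\,\Sigma_\beta(\theta_2)\right),$$

as $m, n \to \infty$ at any $\theta_1 \neq \theta_2$. Here, the last convergence follows from the asymptotic distributions of the MDPDEs ${}^{(1)}\hat{\theta}_\beta$ and ${}^{(2)}\hat{\theta}_\beta$. □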

8.3 Proof of Theorem 2.5

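Using the asymptotic distribution of $\sqrt{n}\,({}^{(1)}\hat{\theta}_\beta - \theta_{1,n})$ and $\sqrt{n}\,({}^{(2)}\hat{\theta}_\beta - \theta_{2,m})$ under $H_{1,n,m}$ and the continuity of $\Sigma_\beta(\theta_0)$, we have ${}^{(2)}\hat{\theta}_\beta \xrightarrow[m\to\infty]{P} \theta_0$,

$$\sqrt{\frac{mn}{m+n}}\left({}^{(1)}\hat{\theta}_\beta - \theta_0\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(\sqrt{\omega}\,\Delta_1,\ \omega\,\Sigma_\beta(\theta_0)\right)$$

and

$$\sqrt{\frac{mn}{m+n}}\left({}^{(2)}\hat{\theta}_\beta - \theta_0\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(\sqrt{1-\omega}\,\Delta_2,\ (1-\omega)\,\Sigma_\beta(\theta_0)\right).$$

Hence, under $H_{1,n,m}$, we get

$$\sqrt{\frac{mn}{m+n}}\left({}^{(1)}\hat{\theta}_\beta - {}^{(2)}\hat{\theta}_\beta\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(\sqrt{\omega}\,\Delta_1 - \sqrt{1-\omega}\,\Delta_2,\ \Sigma_\beta(\theta_0)\right),$$

from which the theorem follows immediately. □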

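8.4 Proof of Theorem 2.6

We will only prove the case $(D_1, D_2) = (F^P_{1,m,\varepsilon,x}, F^P_{2,n,\varepsilon,y})$. The other two cases follow similarly.

Let us denote $\theta^*_{1,n} = U_\beta(F^P_{1,m,\varepsilon,x})$ and $\theta^*_{2,m} = U_\beta(F^P_{2,n,\varepsilon,y})$. Then, using the continuity of $\Sigma_\beta(\theta_0)$, we get that under $(D_1, D_2) = (F^P_{1,m,\varepsilon,x}, F^P_{2,n,\varepsilon,y})$ the asymptotic distributions of $\sqrt{n}\,({}^{(1)}\hat{\theta}_\beta - \theta^*_{1,n})$ and $\sqrt{n}\,({}^{(2)}\hat{\theta}_\beta - \theta^*_{2,m})$ are both p-variate normal with mean zero and variance $\Sigma_\beta(\theta_0)$. Further, a suitable Taylor series expansion yields

$$
\begin{aligned}
\theta^*_{1,n} &= \theta_{1,n} + \frac{\varepsilon}{\sqrt{n}}\, IF(x; U_\beta, F_{\theta_{1,n}}) + o(n^{-1/2}) \\
&= \theta_0 + \frac{\Delta_1}{\sqrt{n}} + \frac{\varepsilon}{\sqrt{n}}\, IF(x; U_\beta, F_{\theta_{1,n}}) + o(n^{-1/2}) \\
&= \theta_0 + \frac{\tilde{\Delta}_1}{\sqrt{n}} + o(n^{-1/2}).
\end{aligned}
$$

Similarly, we have

$$\theta^*_{2,m} = \theta_0 + \frac{\tilde{\Delta}_2}{\sqrt{n}} + o(n^{-1/2}).$$

Combining all these, we get

$$\sqrt{\frac{mn}{m+n}}\left({}^{(1)}\hat{\theta}_\beta - \theta_0\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(\sqrt{\omega}\,\tilde{\Delta}_1,\ \omega\,\Sigma_\beta(\theta_0)\right)$$

and

$$\sqrt{\frac{mn}{m+n}}\left({}^{(2)}\hat{\theta}_\beta - \theta_0\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(\sqrt{1-\omega}\,\tilde{\Delta}_2,\ (1-\omega)\,\Sigma_\beta(\theta_0)\right).$$

Hence, under $(D_1, D_2) = (F^P_{1,m,\varepsilon,x}, F^P_{2,n,\varepsilon,y})$, we get

$$\sqrt{\frac{mn}{m+n}}\left({}^{(1)}\hat{\theta}_\beta - {}^{(2)}\hat{\theta}_\beta\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(\sqrt{\omega}\,\tilde{\Delta}_1 - \sqrt{1-\omega}\,\tilde{\Delta}_2,\ \Sigma_\beta(\theta_0)\right)$$

and hence the theorem follows immediately. □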

8.5 Proof of Theorem 3.1

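Using a suitable Taylor series expansion, we get

$$
\begin{aligned}
\psi\left({}^{(1)}\hat{\theta}_\beta, {}^{(2)}\hat{\theta}_\beta\right) &= \psi(\theta_1,\theta_2) + \Psi_1(\theta_1,\theta_2)^T\left({}^{(1)}\hat{\theta}_\beta - \theta_1\right) + \Psi_2(\theta_1,\theta_2)^T\left({}^{(2)}\hat{\theta}_\beta - \theta_2\right) \\
&\quad + o_P\left(\left\| {}^{(1)}\hat{\theta}_\beta - \theta_1 \right\|\right) + o_P\left(\left\| {}^{(2)}\hat{\theta}_\beta - \theta_2 \right\|\right).
\end{aligned}
$$

Now, from the asymptotic distribution of $\sqrt{n}\,({}^{(1)}\hat{\theta}_\beta - \theta_1)$ and $\sqrt{n}\,({}^{(2)}\hat{\theta}_\beta - \theta_2)$ it follows that

$$\sqrt{\frac{mn}{m+n}}\,\Psi_1(\theta_1,\theta_2)^T\left({}^{(1)}\hat{\theta}_\beta - \theta_1\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(0,\ \omega\,\Psi_1(\theta_1,\theta_2)^T\,\Sigma_\beta(\theta_1)\,\Psi_1(\theta_1,\theta_2)\right)$$

and

$$\sqrt{\frac{mn}{m+n}}\,\Psi_2(\theta_1,\theta_2)^T\left({}^{(2)}\hat{\theta}_\beta - \theta_2\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(0,\ (1-\omega)\,\Psi_2(\theta_1,\theta_2)^T\,\Sigma_\beta(\theta_2)\,\Psi_2(\theta_1,\theta_2)\right).$$

Hence, under $H_0: \psi(\theta_1,\theta_2) = 0_r$, we get

$$\sqrt{\frac{mn}{m+n}}\,\psi\left({}^{(1)}\hat{\theta}_\beta, {}^{(2)}\hat{\theta}_\beta\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(0_r,\ \tilde{\Sigma}_\beta(\theta_1,\theta_2)\right).$$

Finally, by the consistency of the MDPDEs and the continuity of the matrices $\Psi_1$, $\Psi_2$ and $\Sigma_\beta$, it follows that $\tilde{\Sigma}_\beta({}^{(1)}\hat{\theta}_\beta, {}^{(2)}\hat{\theta}_\beta) \xrightarrow{P} \tilde{\Sigma}_\beta(\theta_1,\theta_2)$ as $m+n \to \infty$, from which the theorem follows immediately. □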

8.6 Proof of Theorem 3.2

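Using an appropriate Taylor series expansion, we get

$$
\begin{aligned}
& \tilde{l}\left({}^{(1)}\hat{\theta}_\beta, {}^{(2)}\hat{\theta}_\beta\right) - \tilde{l}(\theta_1,\theta_2) \\
&\quad = \left({}^{(1)}\hat{\theta}_\beta - \theta_1\right)^T \frac{\partial}{\partial\theta_1}\tilde{l}(\theta_1,\theta_2) + \left({}^{(2)}\hat{\theta}_\beta - \theta_2\right)^T \frac{\partial}{\partial\theta_2}\tilde{l}(\theta_1,\theta_2) \\
&\qquad + o_P\left(\left\| {}^{(1)}\hat{\theta}_\beta - \theta_1 \right\|_2\right) + o_P\left(\left\| {}^{(2)}\hat{\theta}_\beta - \theta_2 \right\|_2\right) \\
&\quad = 2\left({}^{(1)}\hat{\theta}_\beta - \theta_1\right)^T \Psi_1(\theta_1,\theta_2)\,\tilde{\Sigma}_\beta(\theta_1,\theta_2)^{-1}\,\psi(\theta_1,\theta_2) + 2\left({}^{(2)}\hat{\theta}_\beta - \theta_2\right)^T \Psi_2(\theta_1,\theta_2)\,\tilde{\Sigma}_\beta(\theta_1,\theta_2)^{-1}\,\psi(\theta_1,\theta_2) \\
&\qquad + o_P\left(\left\| {}^{(1)}\hat{\theta}_\beta - \theta_1 \right\|_2\right) + o_P\left(\left\| {}^{(2)}\hat{\theta}_\beta - \theta_2 \right\|_2\right) \\
&\quad = 2\left[\Psi_1(\theta_1,\theta_2)^T\left({}^{(1)}\hat{\theta}_\beta - \theta_1\right) + \Psi_2(\theta_1,\theta_2)^T\left({}^{(2)}\hat{\theta}_\beta - \theta_2\right)\right]^T \tilde{\Sigma}_\beta(\theta_1,\theta_2)^{-1}\,\psi(\theta_1,\theta_2) \\
&\qquad + o_P\left(\left\| {}^{(1)}\hat{\theta}_\beta - \theta_1 \right\|_2\right) + o_P\left(\left\| {}^{(2)}\hat{\theta}_\beta - \theta_2 \right\|_2\right).
\end{aligned}
$$

Then, the theorem follows from the asymptotic distributions of the MDPDEs ${}^{(1)}\hat{\theta}_\beta$ and ${}^{(2)}\hat{\theta}_\beta$. □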

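8.7 Proof of Theorem 3.4

Using the asymptotic distribution of $\sqrt{n}\,({}^{(1)}\hat{\theta}_\beta - \theta_{1,n})$ and $\sqrt{n}\,({}^{(2)}\hat{\theta}_\beta - \theta_{2,m})$ under $H_{1,n,m}$ and the continuity of $\Sigma_\beta(\theta_0)$, we have, as $m, n \to \infty$, ${}^{(2)}\hat{\theta}_\beta \xrightarrow{P} \theta_0$,

$$\sqrt{\frac{mn}{m+n}}\left({}^{(1)}\hat{\theta}_\beta - \theta_{10}\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(\sqrt{\omega}\,\Delta_1,\ \omega\,\Sigma_\beta(\theta_1)\right)$$

and

$$\sqrt{\frac{mn}{m+n}}\left({}^{(2)}\hat{\theta}_\beta - \theta_{20}\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(\sqrt{1-\omega}\,\Delta_2,\ (1-\omega)\,\Sigma_\beta(\theta_2)\right).$$

Hence, following the proof of Theorem 2.5, we get under $H_{1,n,m}$

$$\sqrt{\frac{mn}{m+n}}\,\psi\left({}^{(1)}\hat{\theta}_\beta, {}^{(2)}\hat{\theta}_\beta\right) \xrightarrow[m,n\to\infty]{\mathcal{L}} N\left(\sqrt{\omega}\,\Psi_1(\theta_1,\theta_2)^T\Delta_1 + \sqrt{1-\omega}\,\Psi_2(\theta_1,\theta_2)^T\Delta_2,\ \tilde{\Sigma}_\beta(\theta_1,\theta_2)\right),$$

from which the theorem follows immediately. □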

8.8 Proof of Theorems 3.2 and 3.7

These proofs are similar to those of Theorems 2.6 and 2.7 and are hence omitted. □

References

[1] Basu A, Shioya H, Park C. Statistical inference: the minimum distance approach. Boca Raton, FL: Chapman & Hall/CRC, 2011. doi:10.1201/b10956

[2] Pardo L. Statistical inference based on divergences. CRC/Chapman-Hall, 2006.

[3] Basu A, Mandal A, Martin N, Pardo L. Testing statistical hypotheses based on the density power divergence. Ann Inst Stat Math. 2013;65:319–48. doi:10.1007/s10463-012-0372-y

[4] Basu A, Harris IR, Hjort NL, Jones MC. Robust and efficient estimation by minimising a density power divergence. Biometrika. 1998;85:549–59. doi:10.1093/biomet/85.3.549

[5] Basu A, Mandal A, Martin N, Pardo L. Robust tests for the equality of two normal means based on the density power divergence. Metrika. 2015;78:611–34. doi:10.1007/s00184-014-0518-4

[6] Basu A, Mandal A, Martin N, Pardo L. Generalized Wald-type tests based on minimum density power divergence estimators. Statistics. 2016;50:1–26. doi:10.1080/02331888.2015.1016435

[7] Ghosh A, Mandal A, Martin N, Pardo L. Influence analysis of robust Wald-type tests. J Multivariate Anal. 2016;147:102–26. doi:10.1016/j.jmva.2016.01.004

[8] Lehmann EL. Theory of point estimation. John Wiley & Sons, 1983. doi:10.1007/978-1-4757-2769-2

[9] Hampel FR, Ronchetti E, Rousseeuw PJ, Stahel W. Robust statistics: the approach based on influence functions. New York, USA: John Wiley & Sons, 1986.

[10] Huber-Carol C. Etude asymptotique de tests robustes. Ph.D. thesis, ETH, Zurich, 1970.

[11] Heritier S, Ronchetti E. Robust bounded-influence tests in general parametric models. J Am Stat Assoc. 1994;89:897–904. doi:10.1080/01621459.1994.10476822

[12] Toma A, Broniatowski M. Dual divergence estimators and tests: robustness results. J Multivariate Anal. 2011;102:20–36. doi:10.1016/j.jmva.2010.07.010

[13] Ghosh A, Basu A, Pardo L. On the robustness of a divergence based test of simple statistical hypotheses. J Stat Planning Inference. 2015;116:91–108. doi:10.1016/j.jspi.2015.01.003

[14] Kotz S, Johnson NL, Boyd DW. Series representations of distributions of quadratic forms in normal variables. I. Non-central case. Ann Math Stat. 1967;38:838–48. doi:10.1214/aoms/1177698878

[15] Kerstjens HAM, Engel M, Dahl R, Paggiaro P, Beck E, Vandewalker M, Sigmund R, Seibold W, Moroni-Zentgraf P, Bateman ED. Tiotropium in asthma poorly controlled with standard combination therapy. New England J Med. 2012;367:1198–207. doi:10.1056/NEJMoa1208606

[16] Simpson DG. Hellinger deviance test: efficiency, breakdown points, and examples. J Am Stat Assoc. 1989;84:107–13. doi:10.1080/01621459.1989.10478744

[17] Woodruff RC, Mason JM, Valencia R, Zimmering A. Chemical mutagenesis testing in drosophila – I: Comparison of positive and negative control data for sex-linked recessive lethal mutations and reciprocal translocations in three laboratories. Environ Mutagen. 1984;6:189–202. doi:10.1002/em.2860060207

[18] Karpatkin M, Porges RF, Karpatkin S. Platelet counts in infants of women with autoimmune thrombocytopenia: effect of steroid administration to the mother. New England J Med. 1981;305:936–9. doi:10.1056/NEJM198110153051607

[19] Perng SK. A test for equality of two exponential distributions. Stat Neerlandica. 1978;32:93–102. doi:10.1111/j.1467-9574.1978.tb01388.x

[20] Warwick J, Jones MC. Choosing a robustness tuning parameter. J Stat Comput Simul. 2005;75:581–8. doi:10.1080/00949650412331299120

[21] Ghosh A, Basu A. Robust estimation for independent non-homogeneous observations using density power divergence with applications to linear regression. Electr J Stat. 2013;7:2420–56. doi:10.1214/13-EJS847

[22] Ghosh A, Basu A. Robust estimation for non-homogeneous data and the selection of the optimal tuning parameter: the density power divergence approach. J Appl Stat. 2015;42:2056–72. doi:10.1080/02664763.2015.1016901

Received: 2017-03-14
Revised: 2018-01-06
Accepted: 2018-06-25
Published Online: 2018-07-19

© 2018 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 19.4.2024 from https://www.degruyter.com/document/doi/10.1515/ijb-2017-0023/html