Two-sample Behrens–Fisher problems for high-dimensional data: A normal reference approach, Journal of Statistical Planning and Inference

Two-sample Behrens–Fisher problems for high-dimensional data: A normal reference approach
Journal of Statistical Planning and Inference (IF 0.8), Pub Date: 2021-07-01, DOI: 10.1016/j.jspi.2020.11.008
Jin-Ting Zhang, Bu Zhou, Jia Guo, Tianming Zhu

Abstract: High-dimensional data are frequently encountered with the development of modern data collection techniques. Testing the equality of the mean vectors of two high-dimensional samples with possibly different covariance matrices is usually referred to as a high-dimensional two-sample Behrens–Fisher (BF) problem. In the high-dimensional setting, the classical BF solutions are expected to perform poorly or become inapplicable due to the singularity of the sample covariance matrices. Several approaches have been proposed in the literature to address this challenging issue, but they all require strong regularity conditions on the underlying covariance matrices to guarantee that their test statistics are asymptotically normally distributed. To overcome this difficulty, an L²-norm-based test is proposed and studied in this article. It is shown that under some regularity conditions and the null hypothesis, the test statistic and a chi-square-type mixture have the same normal or non-normal limiting distribution. It is then natural to approximate the null distribution of the proposed test using that of the chi-square-type mixture, which is actually obtained from the proposed test statistic when the two high-dimensional samples are normally distributed. The resulting test is then referred to as a normal reference test. The distribution of the chi-square-type mixture can then be well approximated by the Welch–Satterthwaite χ²-approximation with the approximation parameters consistently estimated from the data. The asymptotic power of the proposed test is established. Good performance of the proposed test against several existing competitors is demonstrated via several simulation studies and illustrated by a real data example.
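The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything beyond the abstract is an assumption: the statistic is taken to be T = (n1·n2/n)·‖x̄1 − x̄2‖², the chi-square-type mixture is matched by β·χ²_d using its first two moments (β = tr(Ω²)/tr(Ω), d = tr(Ω)²/tr(Ω²), with Ω = (n2/n)Σ1 + (n1/n)Σ2), and the traces are estimated by naive plug-in of the sample covariance matrices. The plug-in estimate of tr(Ω²) is known to be biased when the dimension is large relative to the sample sizes, so a careful implementation would presumably substitute bias-corrected estimators. The function name `normal_reference_test` is hypothetical.

```python
import numpy as np
from scipy import stats


def normal_reference_test(x1, x2):
    """Sketch of an L2-norm-based two-sample test with a
    Welch-Satterthwaite chi-square approximation.

    x1: array of shape (n1, p); x2: array of shape (n2, p).
    Returns (t_stat, beta, d, p_value).

    Assumption: naive plug-in trace estimators are used here for
    simplicity; they are NOT claimed to be the paper's estimators.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, n2 = x1.shape[0], x2.shape[0]
    n = n1 + n2

    # L2-norm-based statistic: scaled squared distance between sample means.
    diff = x1.mean(axis=0) - x2.mean(axis=0)
    t_stat = n1 * n2 / n * float(diff @ diff)

    # Scale the centered samples so that z.T @ z equals
    # Omega_hat = (n2/n) * S1 + (n1/n) * S2, with S1, S2 the sample covariances.
    z1 = (x1 - x1.mean(axis=0)) * np.sqrt(n2 / (n * (n1 - 1)))
    z2 = (x2 - x2.mean(axis=0)) * np.sqrt(n1 / (n * (n2 - 1)))
    z = np.vstack([z1, z2])

    # Work with the n x n Gram matrix instead of the p x p covariance:
    # tr(Omega_hat) = tr(z z^T) and tr(Omega_hat^2) = tr((z z^T)^2).
    gram = z @ z.T
    tr_omega = float(np.trace(gram))
    tr_omega_sq = float(np.sum(gram ** 2))  # trace of gram @ gram (gram is symmetric)

    # Welch-Satterthwaite matching: beta * chi2_d has the same mean and
    # variance as the chi-square-type mixture.
    beta = tr_omega_sq / tr_omega
    d = tr_omega ** 2 / tr_omega_sq

    p_value = float(stats.chi2.sf(t_stat / beta, df=d))
    return t_stat, beta, d, p_value
```

Calling `normal_reference_test(a, b)` on two arrays with the same number of columns returns the statistic, the two approximation parameters, and an approximate p-value. Using the Gram matrix keeps the cost at O(n²p) rather than O(p²n), which matters when p is much larger than n.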

Updated: 2021-07-01
