Distance-based and RKHS-based dependence metrics in high dimension
Annals of Statistics (IF 3.2), Pub Date: 2020-12-01, DOI: 10.1214/19-aos1934
Changbo Zhu, Xianyang Zhang, Shun Yao, Xiaofeng Shao

In this paper, we study distance covariance, Hilbert-Schmidt covariance (aka Hilbert-Schmidt independence criterion [Gretton et al. (2008)]) and related independence tests in the high-dimensional setting. We show that the sample distance/Hilbert-Schmidt covariance between two random vectors can be approximated by the sum of squared componentwise sample cross-covariances up to an asymptotically constant factor, which indicates that tests based on distance/Hilbert-Schmidt covariance can only capture linear dependence in high dimension. As a consequence, the distance correlation based t-test developed by Székely and Rizzo (2013) for independence is shown to have trivial limiting power when the two random vectors are nonlinearly dependent but componentwise uncorrelated. This new and surprising phenomenon, which seems to be discovered here for the first time, is further confirmed in our simulation study. As a remedy, we propose tests based on an aggregation of marginal sample distance/Hilbert-Schmidt covariances and show their superior power behavior against their joint counterparts in simulations. We further extend the distance correlation based t-test to tests based on Hilbert-Schmidt covariance and on marginal distance/Hilbert-Schmidt covariance. A novel unified approach is developed to analyze the studentized sample distance/Hilbert-Schmidt covariance as well as the studentized sample marginal distance covariance under both the null and alternative hypotheses. Our theoretical and simulation results shed light on the limitation of distance/Hilbert-Schmidt covariance when used jointly in the high-dimensional setting and suggest the aggregation of marginal distance/Hilbert-Schmidt covariance as a useful alternative.
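To make the contrast between the joint and the marginal statistics concrete, the following is a minimal Python sketch, not the authors' code. It assumes the U-centered (unbiased) estimator of squared distance covariance and takes "aggregation" to mean a plain sum of the squared sample distance covariances over all component pairs; the function names `dcov2_u` and `marginal_dcov2` and the simulation settings are hypothetical choices made for illustration.

```python
import numpy as np


def _u_center(d):
    """U-center a symmetric n x n distance matrix (requires n >= 4)."""
    n = d.shape[0]
    row = d.sum(axis=1, keepdims=True) / (n - 2)
    col = d.sum(axis=0, keepdims=True) / (n - 2)
    total = d.sum() / ((n - 1) * (n - 2))
    out = d - row - col + total
    np.fill_diagonal(out, 0.0)  # diagonal entries are defined to be zero
    return out


def _pairwise_dist(x):
    """Euclidean distances between the rows of x, where x has shape (n, p)."""
    diff = x[:, None, :] - x[None, :, :]  # O(n^2 p) memory; fine for a sketch
    return np.sqrt((diff ** 2).sum(axis=-1))


def dcov2_u(x, y):
    """Joint statistic: unbiased sample squared distance covariance of x and y."""
    n = x.shape[0]
    a = _u_center(_pairwise_dist(x))
    b = _u_center(_pairwise_dist(y))
    return (a * b).sum() / (n * (n - 3))


def marginal_dcov2(x, y):
    """Assumed aggregation: sum of dcov2_u(x[:, i], y[:, j]) over all (i, j).

    U-centering is linear, so the double sum collapses to a single inner
    product of the summed componentwise U-centered distance matrices.
    """
    n = x.shape[0]
    a = sum(_u_center(np.abs(x[:, [i]] - x[:, [i]].T)) for i in range(x.shape[1]))
    b = sum(_u_center(np.abs(y[:, [j]] - y[:, [j]].T)) for j in range(y.shape[1]))
    return (a * b).sum() / (n * (n - 3))


if __name__ == "__main__":
    # Illustrative setting: Y is a nonlinear function of X, yet every pair
    # (x_i, y_j) has zero covariance because X is standard normal.
    rng = np.random.default_rng(0)
    n, p = 100, 50
    x = rng.standard_normal((n, p))
    y = x ** 2
    print("joint    dCov^2:", dcov2_u(x, y))
    print("marginal dCov^2:", marginal_dcov2(x, y))
```

Both functions return raw statistics only. The studentization and the resulting t-tests described in the abstract are not reproduced here.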

Updated: 2020-12-01