Classification accuracy as a proxy for two-sample testing
Annals of Statistics (IF 3.2). Pub Date: 2021-01-29. DOI: 10.1214/20-aos1962
Ilmun Kim, Aaditya Ramdas, Aarti Singh, Larry Wasserman

When data analysts train a classifier and check whether its accuracy is significantly different from chance, they are implicitly performing a two-sample test. We investigate the statistical properties of this flexible approach in the high-dimensional setting. We prove two results that hold for all classifiers in any dimension: if a classifier's true error remains $\epsilon$-better than chance for some $\epsilon>0$ as $d,n\to\infty$, then (a) the permutation-based test is consistent (has power approaching one), and (b) a computationally efficient test based on a Gaussian approximation of the null distribution is also consistent. To get a finer understanding of the rates of consistency, we study a specialized setting of distinguishing Gaussians with mean difference $\delta$ and common (known or unknown) covariance $\Sigma$, when $d/n\to c\in(0,\infty)$. We study variants of Fisher's linear discriminant analysis (LDA), such as "naive Bayes," in a nontrivial regime when $\epsilon\to 0$ (the Bayes classifier has true accuracy approaching $1/2$), and contrast their power with corresponding variants of Hotelling's test. Surprisingly, the expressions for their power match exactly in terms of $n$, $d$, $\delta$ and $\Sigma$, and the LDA approach is only worse by a constant factor, achieving an asymptotic relative efficiency (ARE) of $1/\sqrt{\pi}$ for balanced samples. We also extend our results to high-dimensional elliptical distributions with finite kurtosis. Other results of independent interest include minimax lower bounds and the optimality of Hotelling's test when $d=o(n)$. Simulation results validate our theory, and we present practical takeaway messages along with natural open problems.
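To make the approach concrete, here is a minimal sketch of the two tests described in the abstract: the permutation-based test (a) and the Gaussian-approximation test (b). It assumes scikit-learn and scipy are available; the logistic-regression classifier, the 50/50 sample split, and the function name `classifier_two_sample_test` are illustrative choices for this sketch, not prescriptions from the paper.

```python
# Sketch of a classifier two-sample test: train a classifier to tell the
# two samples apart, then test whether its held-out accuracy beats chance.
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression

def classifier_two_sample_test(X, Y, n_permutations=200, seed=0):
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])                          # pooled sample
    labels = np.r_[np.zeros(len(X)), np.ones(len(Y))]

    def test_accuracy(lab):
        # Split the pooled data in half: fit on one half,
        # measure classification accuracy on the held-out half.
        idx = rng.permutation(len(Z))
        half = len(Z) // 2
        tr, te = idx[:half], idx[half:]
        clf = LogisticRegression(max_iter=1000).fit(Z[tr], lab[tr])
        return clf.score(Z[te], lab[te]), len(te)

    acc, m = test_accuracy(labels)

    # (b) Gaussian approximation of the null: under H0 the held-out
    # accuracy is Binomial(m, 1/2)/m, so z = (acc - 1/2) / sqrt(1/(4m)).
    z_stat = (acc - 0.5) / np.sqrt(0.25 / m)
    p_gauss = norm.sf(z_stat)

    # (a) Permutation test: refit under shuffled labels to sample
    # the null distribution of the accuracy directly.
    null_accs = [test_accuracy(rng.permutation(labels))[0]
                 for _ in range(n_permutations)]
    p_perm = (1 + sum(a >= acc for a in null_accs)) / (1 + n_permutations)
    return acc, p_perm, p_gauss
```

The permutation test is valid for any classifier but requires refitting once per permutation; the Gaussian-approximation variant avoids that cost, which is the computational advantage noted in result (b).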

Updated: 2021-01-29