Classification accuracy as a proxy for two-sample testing
Annals of Statistics (IF 3.2). Pub Date: 2021-01-29. DOI: 10.1214/20-aos1962
Ilmun Kim, Aaditya Ramdas, Aarti Singh, Larry Wasserman

When data analysts train a classifier and check whether its accuracy is significantly different from chance, they are implicitly performing a two-sample test. We investigate the statistical properties of this flexible approach in the high-dimensional setting. We prove two results that hold for all classifiers in any dimension: if a classifier's true error remains $\epsilon$-better than chance for some $\epsilon>0$ as $d,n\to\infty$, then (a) the permutation-based test is consistent (has power approaching one), and (b) a computationally efficient test based on a Gaussian approximation of the null distribution is also consistent. To get a finer understanding of the rates of consistency, we study a specialized setting of distinguishing Gaussians with mean difference $\delta$ and common (known or unknown) covariance $\Sigma$, when $d/n\to c\in(0,\infty)$. We study variants of Fisher's linear discriminant analysis (LDA), such as "naive Bayes," in a nontrivial regime when $\epsilon\to 0$ (the Bayes classifier has true accuracy approaching $1/2$), and contrast their power with corresponding variants of Hotelling's test. Surprisingly, the expressions for their power match exactly in terms of $n$, $d$, $\delta$ and $\Sigma$, and the LDA approach is only worse by a constant factor, achieving an asymptotic relative efficiency (ARE) of $1/\sqrt{\pi}$ for balanced samples. We also extend our results to high-dimensional elliptical distributions with finite kurtosis. Other results of independent interest include minimax lower bounds and the optimality of Hotelling's test when $d=o(n)$. Simulation results validate our theory, and we present practical takeaway messages along with natural open problems.
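To make the approach concrete, here is a minimal sketch of the two tests described in the abstract: the permutation-based test (a) and the Gaussian-approximation test (b). It assumes scikit-learn and scipy are available; the logistic-regression classifier, the 50/50 sample split, and the function name `classifier_two_sample_test` are illustrative choices for this sketch, not prescriptions from the paper.

```python
# Sketch of a classifier two-sample test: train a classifier to tell the
# two samples apart, then test whether its held-out accuracy beats chance.
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression

def classifier_two_sample_test(X, Y, n_permutations=200, seed=0):
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])                          # pooled sample
    labels = np.r_[np.zeros(len(X)), np.ones(len(Y))]

    def test_accuracy(lab):
        # Split the pooled data in half: fit on one half,
        # measure classification accuracy on the held-out half.
        idx = rng.permutation(len(Z))
        half = len(Z) // 2
        tr, te = idx[:half], idx[half:]
        clf = LogisticRegression(max_iter=1000).fit(Z[tr], lab[tr])
        return clf.score(Z[te], lab[te]), len(te)

    acc, m = test_accuracy(labels)

    # (b) Gaussian approximation of the null: under H0 the held-out
    # accuracy is Binomial(m, 1/2)/m, so z = (acc - 1/2) / sqrt(1/(4m)).
    z_stat = (acc - 0.5) / np.sqrt(0.25 / m)
    p_gauss = norm.sf(z_stat)

    # (a) Permutation test: refit under shuffled labels to sample
    # the null distribution of the accuracy directly.
    null_accs = [test_accuracy(rng.permutation(labels))[0]
                 for _ in range(n_permutations)]
    p_perm = (1 + sum(a >= acc for a in null_accs)) / (1 + n_permutations)
    return acc, p_perm, p_gauss
```

The permutation test is valid for any classifier but requires refitting once per permutation; the Gaussian-approximation variant avoids that cost, which is the computational advantage noted in result (b).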

Updated: 2021-01-29