An efficient variance estimator for cross-validation under partition sampling
Statistics (IF 1.2), Pub Date: 2021-06-24, DOI: 10.1080/02331888.2021.1943393
Qing Wang, Xizhen Cai

This paper addresses variance estimation for cross-validation. We consider the unbiased cross-validation risk estimate in the form of a general U-statistic and focus on estimating the variance of this U-statistic risk score. We propose an efficient variance estimator under a half-sampling design, for which the bias of the estimator can be expressed explicitly. Furthermore, we discuss a practical approach to estimating this bias by a two-layer Monte Carlo method, so as to obtain a bias-corrected variance estimator. In a simulation study and real data examples, we evaluate the performance of the proposed variance estimator against the commonly used bootstrap and jackknife methods, in the context of model selection under the one-standard-error rule. The numerical results suggest that the proposed estimator yields identical or similar model selection conclusions compared to its counterparts. Moreover, the proposed variance estimator is much more efficient to calculate than its competitors. Finally, we discuss the generalization of the methodology to other partition-sampling scenarios.
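The one-standard-error rule mentioned in the abstract is where a variance estimate of the cross-validation risk enters model selection. The following is a minimal sketch of the rule as it is commonly stated, not the authors' code: the function name `one_standard_error_rule` and its three inputs are hypothetical, and the standard errors are assumed to come from any variance estimator (the proposed one, bootstrap, or jackknife).

```python
from typing import Sequence


def one_standard_error_rule(
    risks: Sequence[float],
    std_errors: Sequence[float],
    complexities: Sequence[float],
) -> int:
    """Return the index of the simplest model whose estimated risk is
    within one standard error of the smallest estimated risk.

    All three arguments are illustrative inputs (assumptions):
    risks[i] is the cross-validation risk estimate of model i,
    std_errors[i] is the square root of its estimated variance, and
    complexities[i] is any numeric measure where smaller means simpler.
    """
    n = len(risks)
    if n == 0 or len(std_errors) != n or len(complexities) != n:
        raise ValueError("inputs must be non-empty and of equal length")

    # Model with the smallest estimated risk.
    best = min(range(n), key=lambda i: risks[i])

    # Any model at or below this threshold is treated as comparable to the best.
    threshold = risks[best] + std_errors[best]
    eligible = [i for i in range(n) if risks[i] <= threshold]

    # Among comparable models, prefer the simplest; break ties by lower risk.
    return min(eligible, key=lambda i: (complexities[i], risks[i]))


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    risks = [0.50, 0.31, 0.28, 0.27]
    std_errors = [0.04, 0.03, 0.03, 0.03]
    complexities = [1, 2, 3, 4]
    print(one_standard_error_rule(risks, std_errors, complexities))
```

Because the rule depends on the standard error only through the threshold, two variance estimators that give similar standard errors will usually select the same model, which is the sense in which the abstract compares the proposed estimator with its competitors.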


Updated: 2021-06-24
