K-fold cross-validation for complex sample surveys
Stat (IF 0.7), Pub Date: 2022-01-12, DOI: 10.1002/sta4.454
Jerzy Wieczorek, Cole Guerin, Thomas McMahon

Although K-fold cross-validation (CV) is widely used for model evaluation and selection, there has been limited understanding of how to perform CV for non-iid data, including those from sampling designs with unequal selection probabilities. We introduce CV methodology that is appropriate for design-based inference from complex survey sampling designs. For such data, we claim that we will tend to make better inferences when we choose the folds and compute the test errors in ways that account for the survey design features such as stratification and clustering. Our mathematical arguments are supported with simulations, and our methods are illustrated on real survey data.
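As a rough illustration of what "choosing the folds and computing the test errors in ways that account for the survey design" could look like, the sketch below keeps every sampled cluster intact within a single fold, spreads each stratum's clusters across the folds, and weights the test error by the survey weights. This is one plausible reading of the abstract, not the authors' implementation; the function names `design_folds` and `weighted_mse` and all the data are hypothetical.

```python
import random
from collections import defaultdict


def design_folds(strata, clusters, k, seed=0):
    """Return a fold id in 0..k-1 for each observation.

    Observations sharing a (stratum, cluster) pair always land in the same
    fold, and each stratum's clusters are shuffled and then dealt across
    the folds in turn, so every fold draws from every stratum when possible.
    """
    rng = random.Random(seed)
    clusters_by_stratum = defaultdict(list)
    for s, c in sorted(set(zip(strata, clusters))):
        clusters_by_stratum[s].append(c)

    fold_of = {}
    for s, cs in clusters_by_stratum.items():
        rng.shuffle(cs)
        start = rng.randrange(k)  # random offset so no fold is favoured
        for i, c in enumerate(cs):
            fold_of[(s, c)] = (start + i) % k
    return [fold_of[(s, c)] for s, c in zip(strata, clusters)]


def weighted_mse(y, y_hat, weights):
    """Mean squared error in which each test case counts by its survey weight."""
    total = sum(weights)
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, y, y_hat)) / total


# Hypothetical toy design: 2 strata, 3 clusters each, 2 observations per cluster.
strata = ["A"] * 6 + ["B"] * 6
clusters = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
folds = design_folds(strata, clusters, k=3)

# Weighted test error for two held-out cases with unequal weights.
err = weighted_mse([1.0, 2.0], [1.0, 4.0], [1.0, 3.0])  # (1*0 + 3*4) / 4 = 3.0
```

Splitting at the cluster level keeps correlated observations from appearing in both the training and test folds, and the weights make the test error an estimate for the population rather than for the sample alone.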

Updated: 2022-01-12