Generalized Regression Estimators with High-Dimensional Covariates, Statistica Sinica



Generalized Regression Estimators with High-Dimensional Covariates
Statistica Sinica (IF 1.5), Pub Date: 2020-01-01, DOI: 10.5705/ss.202017.0384
Tram Ta¹, Jun Shao¹,², Quefeng Li³, Lei Wang⁴

Affiliation

Data from a large number of covariates with known population totals are frequently observed in survey studies. These auxiliary variables contain valuable information that can be incorporated into estimation of the population total of a survey variable to improve the estimation precision. We consider the generalized regression estimator formulated under the model-assisted framework in which a regression model is utilized to make use of the available covariates while the estimator still has basic design-based properties. The generalized regression estimator has been shown to improve the efficiency of the design-based Horvitz-Thompson estimator when the number of covariates is fixed. In this study, we investigate the performance of the generalized regression estimator when the number of covariates p is allowed to diverge as the sample size n increases. We examine two approaches where the model parameter is estimated using the weighted least squares method when p < n and the LASSO method when the model parameter is sparse. We show that under an assisted model and certain conditions on the joint distribution of the covariates as well as the divergence rates of n and p, the generalized regression estimator is asymptotically more efficient than the Horvitz-Thompson estimator, and is robust against model misspecification. We also study the consistency of variance estimation for the generalized regression estimator. Our theoretical results are corroborated by simulation studies and an example.
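The comparison described in the abstract can be illustrated with a minimal sketch. It assumes simple random sampling without replacement (so every inclusion probability equals n/N) and uses the standard textbook forms of the Horvitz-Thompson estimator and of the generalized regression estimator with a weighted least squares fit. The function names `horvitz_thompson` and `greg_total` and the simulated population are hypothetical and are not taken from the paper; the LASSO variant is not shown.

```python
import numpy as np

def horvitz_thompson(y, pi):
    """Horvitz-Thompson estimate of a population total: sum of y_i / pi_i."""
    return float(np.sum(y / pi))

def greg_total(y, X, pi, x_totals):
    """Generalized regression estimate of the total of y.

    y        : sampled survey variable, shape (n,)
    X        : sampled covariates, shape (n, p); include a column of ones
               if an intercept is wanted
    pi       : first-order inclusion probabilities, shape (n,)
    x_totals : known population totals of the covariates, shape (p,)
    """
    w = 1.0 / pi                       # design weights
    XtW = X.T * w                      # X' W with W = diag(w)
    # Weighted least squares coefficient; requires X' W X to be invertible,
    # which in practice needs p < n.
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    ht_y = np.sum(w * y)               # HT estimate of the y total
    ht_x = X.T @ w                     # HT estimates of the covariate totals
    # HT estimate plus a regression adjustment for the covariate totals.
    return float(ht_y + (x_totals - ht_x) @ beta), beta

# Illustrative simulated population (assumed, not from the paper).
rng = np.random.default_rng(0)
N, n, p = 5000, 200, 5
X_pop = np.column_stack([np.ones(N), rng.normal(size=(N, p))])
y_pop = X_pop @ rng.normal(size=p + 1) + rng.normal(size=N)

idx = rng.choice(N, size=n, replace=False)   # simple random sample
pi = np.full(n, n / N)

t_ht = horvitz_thompson(y_pop[idx], pi)
t_greg, _ = greg_total(y_pop[idx], X_pop[idx], pi, X_pop.sum(axis=0))
print("true total:", y_pop.sum())
print("HT estimate:", t_ht)
print("GREG estimate:", t_greg)
```

When the survey variable is an exact linear function of the covariates, the regression adjustment recovers the true total exactly in this sketch; with noise, the adjustment typically reduces the error relative to the Horvitz-Thompson estimate, which is the fixed-p efficiency gain the abstract refers to.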


Updated: 2020-01-01
