Optimal subsampling for linear quantile regression models
The Canadian Journal of Statistics (IF 0.8), Pub Date: 2021-03-10, DOI: 10.1002/cjs.11590
Yan Fan, Yukun Liu, Lixing Zhu

Subsampling techniques are efficient methods for handling big data. Quite a few optimal sampling methods have been developed for parametric models in which the loss functions are differentiable with respect to parameters. However, they do not apply to quantile regression (QR) models as the involved check function is not differentiable. To circumvent the non-differentiability problem, we consider directly estimating the linear QR coefficient by minimizing the Hansen–Hurwitz estimator of the usual loss function for QR. We establish the asymptotic normality of the resulting estimator under a generic sampling method, and then develop optimal subsampling methods for linear QR. In particular, we propose a one-stage subsampling method, which depends only on the lengths of covariates, and a two-stage subsampling method, which is a combination of the one-stage sampling and the ideal optimal subsampling methods. Our simulation studies, on both synthetic and real data, show that the two recommended sampling methods always outperform simple random sampling in terms of mean square error, whether the linear QR model is valid or not.
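
The idea of minimizing a Hansen–Hurwitz estimate of the check loss over a subsample can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the sampling formula, so probabilities proportional to the Euclidean norm of each covariate row are an assumption made here for illustration. The function names (`length_based_probs`, `weighted_quantile_regression`, `subsample_qr`) and the simulated data are likewise hypothetical. The weighted check loss is minimized with a standard linear-programming formulation through `scipy.optimize.linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def length_based_probs(X):
    """Sampling probabilities proportional to the Euclidean length of each row.

    Assumption: the abstract only says the one-stage method depends on the
    lengths of covariates; proportionality to ||x_i|| is an illustrative choice.
    """
    norms = np.linalg.norm(X, axis=1)
    return norms / norms.sum()


def weighted_quantile_regression(X, y, tau, weights):
    """Minimize sum_i weights_i * rho_tau(y_i - x_i' beta) as a linear program.

    Each residual is split into positive and negative parts u, v >= 0, so that
    rho_tau(r) = tau * u + (1 - tau) * v with r = u - v.
    """
    n, p = X.shape
    cost = np.concatenate([np.zeros(p), tau * weights, (1.0 - tau) * weights])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])  # X beta + u - v = y
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    result = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return result.x[:p]


def subsample_qr(X, y, tau, probs, n, rng):
    """Draw n rows with replacement and minimize the Hansen-Hurwitz loss estimate."""
    N = X.shape[0]
    idx = rng.choice(N, size=n, replace=True, p=probs)
    weights = 1.0 / (N * n * probs[idx])  # inverse-probability weights
    weights = weights / weights.mean()    # rescaling leaves the minimizer unchanged
    return weighted_quantile_regression(X[idx], y[idx], tau, weights)


# Simulated full data set (hypothetical): intercept plus two Gaussian covariates.
rng = np.random.default_rng(0)
N = 20_000
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
beta_true = np.array([1.0, 2.0, -1.0])
y = X @ beta_true + rng.normal(size=N)

# Median regression (tau = 0.5) fitted on a subsample of 500 rows.
beta_hat = subsample_qr(X, y, tau=0.5, probs=length_based_probs(X), n=500, rng=rng)
print(beta_hat)
```

Because each sampled row is weighted by the inverse of its selection probability, the subsample objective is an unbiased estimate of the full-data average check loss for any fixed coefficient vector. This is why the same code works for other choices of `probs`, including uniform sampling.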

Updated: 2021-03-10