Quantile regression for massive data with network-induced dependence, and application to the New York statewide planning and research cooperative system
Communications in Statistics - Theory and Methods (IF 0.6), Pub Date: 2020-07-01, DOI: 10.1080/03610926.2020.1786120
Yanqiao Zheng 1 , Xiaobing Zhao 2 , Xiaoqi Zhang 1

Abstract

Medical costs are often right-skewed and heteroscedastic, and they have a complex relationship with covariates. Moreover, medical cost datasets are often massive, as in the New York Statewide Planning and Research Cooperative System Expenditure Study. Observations can also depend on one another, because the spatial distribution of diseases induces complex correlation among patients from nearby communities. It is therefore not enough to focus only on mean regression models that assume low-dimensional covariates, small sample sizes, and independent and identically distributed observations. In this paper, we propose a new quantile regression model to analyze medical costs. A network term is introduced to account for the dependence among observations. We also consider variable selection for massive datasets: an adaptive lasso penalized variable selection method is applied in parallel, and the resulting estimators are combined by minimizing an additional penalized loss function. Simulation studies are conducted to illustrate the performance of the estimation method. We apply our method to the analysis of New York State's Statewide Planning and Research Cooperative System, 2013.
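To make the quantile-regression idea concrete, the following is a minimal Python sketch of the check (pinball) loss on synthetic right-skewed data, together with a simple split-and-combine step. It is an illustration under stated assumptions, not the paper's estimator: the data are simulated, the model is intercept-only, there is no network term and no adaptive lasso penalty, and the per-block estimates are combined by plain averaging rather than by minimizing an additional penalized loss. The names check_loss and best_constant are hypothetical.

```python
import numpy as np


def check_loss(residuals, tau):
    """Average quantile (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    residuals = np.asarray(residuals, dtype=float)
    return float(np.mean(residuals * (tau - (residuals < 0))))


def best_constant(y, tau):
    """Intercept-only quantile fit: pick the observed value that minimizes
    the check loss. A minimizer always exists among the sample points."""
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - c, tau) for c in y]
    return float(y[int(np.argmin(losses))])


# Synthetic right-skewed "costs" (assumption: lognormal, not real SPARCS data).
rng = np.random.default_rng(0)
costs = rng.lognormal(mean=8.0, sigma=1.0, size=400)

tau = 0.75
q_full = best_constant(costs, tau)          # fit on the full sample

# Split-and-combine: fit each block separately (these fits could run in
# parallel), then combine. Plain averaging is used here as a stand-in for
# the penalized combination step described in the abstract.
blocks = np.array_split(costs, 4)
q_blocks = [best_constant(b, tau) for b in blocks]
q_combined = float(np.mean(q_blocks))

print(f"full-sample {tau:.2f}-quantile fit: {q_full:.1f}")
print(f"block-averaged fit:               {q_combined:.1f}")
```

Because the loss weights positive and negative residuals asymmetrically, the fitted constant for tau = 0.75 sits above the sample median, which is what makes quantile methods informative for skewed cost data.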




Updated: 2020-07-01