Modeling tails for collinear data with outliers in the English Longitudinal Study of Ageing: Quantile profile regression
Biometrical Journal (IF 1.3), Pub Date: 2020-01-20, DOI: 10.1002/bimj.201900146
Xi Liu 1 , Silvia Liverani 2, 3 , Kimberley J Smith 4 , Keming Yu 1

Research has shown that high blood glucose levels are important predictors of incident diabetes. However, they are also strongly associated with other cardiometabolic risk factors such as high blood pressure, adiposity, and cholesterol, which are themselves highly correlated with one another. The aim of this analysis was to ascertain how these highly correlated cardiometabolic risk factors might be associated with high levels of blood glucose in adults aged 50 or older from wave 2 of the English Longitudinal Study of Ageing (ELSA). Because of the high collinearity of the predictor variables and our interest in extreme values of blood glucose, we propose a new method, called quantile profile regression, to answer this question. Profile regression, a Bayesian nonparametric model that clusters responses and covariates simultaneously, is a powerful tool for modeling the relationship between a response variable and covariates. However, the standard approach of using a mixture of Gaussian distributions for the response model will not identify the underlying clusters correctly, particularly when the data contain outliers or the response has a heavy-tailed distribution. Quantile profile regression therefore models the response variable with an asymmetric Laplace distribution, which allows us to model asymmetric clusters more accurately and to predict more accurately for extreme values of the response variable and/or outliers. In simulations, our new method is more accurate than the Normal profile regression approach and remains robust when outliers are present in the data. We conclude with an analysis of the ELSA.
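
The link between the asymmetric Laplace distribution (ALD) and quantiles can be illustrated with a small sketch. It is not the authors' implementation and leaves out the Bayesian nonparametric clustering entirely. It only assumes the standard ALD density with location `mu`, scale `sigma`, and quantile level `tau`, under which maximizing the likelihood in `mu` is the same as minimizing the quantile check loss. The function names and the simulated data are hypothetical and chosen for illustration.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def ald_logpdf(y, mu, sigma, tau):
    """Log-density of ALD(mu, sigma, tau):
    log(tau * (1 - tau) / sigma) - rho_tau((y - mu) / sigma)."""
    y = np.asarray(y, dtype=float)
    return np.log(tau * (1 - tau) / sigma) - check_loss((y - mu) / sigma, tau)

def ald_location_mle(y, tau, sigma=1.0):
    """Maximum-likelihood estimate of mu for fixed sigma and tau.
    The check loss is piecewise linear in mu, so a maximizer of the
    likelihood lies at an observed value; we search over the data points."""
    y = np.asarray(y, dtype=float)
    loglik = [ald_logpdf(y, m, sigma, tau).sum() for m in y]
    return y[int(np.argmax(loglik))]

# Simulated response (hypothetical data), then the same data with gross outliers.
rng = np.random.default_rng(0)
y = rng.normal(5.0, 1.0, size=500)
y_out = np.concatenate([y, [60.0, 75.0, 90.0]])

tau = 0.9  # model the upper tail rather than the centre
mu_clean = ald_location_mle(y, tau)
mu_out = ald_location_mle(y_out, tau)

shift_mean = abs(y_out.mean() - y.mean())  # how far the Gaussian location moves
shift_ald = abs(mu_out - mu_clean)         # how far the ALD location moves

print(f"ALD location at tau={tau}: {mu_clean:.3f}")
print(f"shift of mean under outliers: {shift_mean:.3f}")
print(f"shift of ALD location under outliers: {shift_ald:.3f}")
```

The ALD location estimate coincides with a sample quantile at level `tau`, so it targets the tail of the response directly, and it moves much less than the mean when a few extreme observations are added. That behavior is the intuition behind replacing the Gaussian response model with an ALD one.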

Updated: 2020-01-20