Characterization of Weighted Quantile Sum Regression for Highly Correlated Data in a Risk Analysis Setting.
Journal of Agricultural, Biological and Environmental Statistics (IF 1.4), Pub Date: 2015-03-01, DOI: 10.1007/s13253-014-0180-3
Caroline Carrico, Chris Gennings, David C Wheeler, Pam Factor-Litvak

In risk evaluation, the effect of mixtures of environmental chemicals on a common adverse outcome is of interest. However, because of the high dimensionality and the inherent correlations among chemicals that occur together, traditional methods (e.g., ordinary or logistic regression) suffer from collinearity and variance inflation, and shrinkage methods have limitations in selecting among correlated components. We propose a weighted quantile sum (WQS) approach to estimating a body burden index, which identifies "bad actors" in a set of highly correlated environmental chemicals. Through extensive simulation studies, we evaluate and characterize the accuracy of WQS regression in variable selection in terms of sensitivity and specificity (i.e., the ability of the WQS method to select the bad actors correctly and to avoid selecting incorrect ones). We demonstrate the improvement in accuracy that this method provides over traditional ordinary regression and shrinkage methods (lasso, adaptive lasso, and elastic net). Simulation results show that WQS regression is accurate under some environmentally relevant conditions, but that, for a fixed correlation pattern, its accuracy decreases as the association with the response variable weakens. Nonzero weights (i.e., weights exceeding a selection threshold parameter) may be used to identify bad actors; however, components within a cluster of highly correlated active components tend to have lower weights, with the sum of their weights representative of the set.
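
As a rough illustration of the index described in the abstract, the following Python sketch scores each chemical into quantiles, combines the scores using a given set of weights, and flags components whose weight exceeds a threshold. It assumes the usual WQS constraints (nonnegative weights that sum to 1), which the abstract does not state. The function names, chemical names, weights, and the 0.25 threshold are all hypothetical, and the weights are supplied by hand rather than estimated from data as WQS regression would do.

```python
from bisect import bisect_right
from statistics import quantiles


def quantile_scores(values, n_quantiles=4):
    """Score each value 0..n_quantiles-1 by the empirical quantile bin it falls in."""
    cuts = quantiles(values, n=n_quantiles)  # n_quantiles - 1 cut points
    return [bisect_right(cuts, v) for v in values]


def wqs_index(components, weights, n_quantiles=4):
    """Weighted quantile sum per subject.

    components: dict mapping chemical name -> list of concentrations
    weights:    dict mapping chemical name -> weight
    Assumption (not stated in the abstract): weights are >= 0 and sum to 1.
    """
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be nonnegative")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    scores = {name: quantile_scores(vals, n_quantiles)
              for name, vals in components.items()}
    n_subjects = len(next(iter(components.values())))
    return [sum(weights[name] * scores[name][i] for name in components)
            for i in range(n_subjects)]


def select_bad_actors(weights, threshold):
    """Names of components whose weight exceeds the selection threshold."""
    return sorted(name for name, w in weights.items() if w > threshold)


if __name__ == "__main__":
    # Hypothetical concentrations for three chemicals in eight subjects.
    data = {
        "chem_a": [1, 2, 3, 4, 5, 6, 7, 8],
        "chem_b": [8, 7, 6, 5, 4, 3, 2, 1],
        "chem_c": [2, 2, 3, 5, 5, 6, 9, 9],
    }
    # Hypothetical weights; in WQS regression these would be estimated.
    w = {"chem_a": 0.6, "chem_b": 0.1, "chem_c": 0.3}
    print(wqs_index(data, w))
    print(select_bad_actors(w, threshold=0.25))
```

Because weights within a highly correlated cluster tend to be individually small, summing the weights over a cluster before comparing against the threshold may be more informative than inspecting each component on its own.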

Updated: 2019-11-01