Characterization of Weighted Quantile Sum Regression for Highly Correlated Data in a Risk Analysis Setting.
Journal of Agricultural, Biological and Environmental Statistics (IF 1.4), Pub Date: 2014-12-24, DOI: 10.1007/s13253-014-0180-3
Caroline Carrico, Chris Gennings, David C Wheeler, Pam Factor-Litvak

In risk evaluation, the effect of mixtures of environmental chemicals on a common adverse outcome is of interest. However, because chemicals that occur together are high-dimensional and inherently correlated, traditional methods (e.g., ordinary or logistic regression) suffer from collinearity and variance inflation, and shrinkage methods have limitations in selecting among correlated components. We propose a weighted quantile sum (WQS) approach to estimating a body burden index, which identifies "bad actors" in a set of highly correlated environmental chemicals. Through extensive simulation studies, we evaluate and characterize the accuracy of WQS regression in variable selection in terms of sensitivity and specificity (i.e., the ability of the WQS method to select the bad actors correctly and to avoid selecting incorrect ones). We demonstrate the improvement in accuracy that this method provides over traditional ordinary regression and shrinkage methods (lasso, adaptive lasso, and elastic net). Simulation results show that WQS regression is accurate under some environmentally relevant conditions, but that, for a fixed correlation pattern, its accuracy decreases as the association with the response variable weakens. Nonzero weights (i.e., weights exceeding a selection threshold parameter) may be used to identify bad actors; however, components within a cluster of highly correlated active components tend to have lower individual weights, with the sum of their weights representative of the set.
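The index and the selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it only shows how quantile-scored components might be combined into a weighted index, how weights above a threshold flag "bad actors", and how sensitivity and specificity of that selection can be computed. The function names, the chemical names, the weights, and the threshold of 0.10 are all hypothetical. The constraint that weights are nonnegative and sum to 1 is an assumption not stated in the abstract. Estimating the weights from data is not shown.

```python
from bisect import bisect_right
from statistics import quantiles


def quantile_scores(values, n_quantiles=4):
    """Score each observation 0..n_quantiles-1 by the quantile bin it falls in."""
    cuts = quantiles(values, n=n_quantiles)  # n_quantiles - 1 cut points
    return [bisect_right(cuts, v) for v in values]


def wqs_index(columns, weights, n_quantiles=4):
    """Weighted sum of quantile scores, one index value per subject.

    Assumption (not stated in the abstract): weights are nonnegative and sum to 1.
    """
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be nonnegative")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    scored = {name: quantile_scores(vals, n_quantiles)
              for name, vals in columns.items()}
    n_subjects = len(next(iter(columns.values())))
    return [sum(weights[name] * scored[name][i] for name in columns)
            for i in range(n_subjects)]


def select_bad_actors(weights, threshold):
    """Components whose weight exceeds the selection threshold."""
    return {name for name, w in weights.items() if w > threshold}


def sensitivity_specificity(selected, true_active, all_names):
    """Share of active components selected, and share of inactive ones left out."""
    inactive = all_names - true_active
    sensitivity = len(selected & true_active) / len(true_active)
    specificity = len(inactive - selected) / len(inactive)
    return sensitivity, specificity


# Hypothetical weights for four made-up chemicals; chem_a and chem_b are
# treated as the truly active components in this toy example.
weights = {"chem_a": 0.50, "chem_b": 0.30, "chem_c": 0.15, "chem_d": 0.05}
selected = select_bad_actors(weights, threshold=0.10)
sens, spec = sensitivity_specificity(selected, {"chem_a", "chem_b"}, set(weights))
```

In this toy example the threshold is arbitrary. Raising it trades sensitivity for specificity, and for a cluster of highly correlated active components it may be more informative to compare the sum of the cluster's weights against the threshold than each weight individually.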

Updated: 2019-11-01