Automatic variable selection for exposure-driven propensity score matching with unmeasured confounders
Biometrical Journal (IF 1.7), Pub Date: 2020-03-23, DOI: 10.1002/bimj.201800190
Daniela Zöller, Leesa F Wockner, Harald Binder

Multivariable model building for propensity score modeling approaches is challenging. A common propensity score approach is exposure-driven propensity score matching, where the best model selection strategy is still unclear. In particular, the situation may require variable selection, while it is still unclear if variables included in the propensity score should be associated with the exposure and the outcome, with either the exposure or the outcome, with at least the exposure or with at least the outcome. Unmeasured confounders, complex correlation structures, and non-normal covariate distributions further complicate matters. We consider the performance of different modeling strategies in a simulation design with a complex but realistic structure and effects on a binary outcome. We compare the strategies in terms of bias and variance in estimated marginal exposure effects. Considering the bias in estimated marginal exposure effects, the most reliable results for estimating the propensity score are obtained by selecting variables related to the exposure. On average this results in the least bias and does not greatly increase variances. Although our results cannot be generalized, this provides a counterexample to existing recommendations in the literature based on simple simulation settings. This highlights that recommendations obtained in simple simulation settings cannot always be generalized to more complex, but realistic settings and that more complex simulation studies are needed.
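
To make the idea of exposure-driven propensity score matching concrete, the following is a minimal sketch and not the authors' simulation design or selection procedure. It assumes a toy data set with three hypothetical covariates (`x1` related to exposure and outcome, `x2` related to exposure only, `x3` related to outcome only), hard-codes the "exposure-related" variable set instead of selecting it automatically, fits the propensity score with a plain gradient-ascent logistic regression, and estimates a risk difference after greedy 1:1 matching with an assumed caliper of 0.05.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, steps=300, lr=0.5):
    """Gradient-ascent logistic regression; returns [intercept, coef_1, ...]."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            lin = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
            resid = yi - sigmoid(lin)
            grad[0] += resid
            for j, v in enumerate(xi):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    # Predicted probability of exposure for every subject.
    return [sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi))) for xi in X]


def match_and_estimate(ps, exposure, outcome, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Returns the risk difference (exposed minus unexposed) in the matched
    sample and the number of matched pairs.
    """
    controls = [i for i, a in enumerate(exposure) if a == 0]
    used, pairs = set(), []
    for i in (k for k, a in enumerate(exposure) if a == 1):
        best, best_dist = None, caliper
        for j in controls:
            dist = abs(ps[i] - ps[j])
            if j not in used and dist <= best_dist:
                best, best_dist = j, dist
        if best is not None:
            used.add(best)
            pairs.append((i, best))
    if not pairs:
        raise ValueError("no matches found within the caliper")
    diff = sum(outcome[i] - outcome[j] for i, j in pairs) / len(pairs)
    return diff, len(pairs)


# Toy data (all coefficients are illustrative assumptions).
rng = random.Random(0)
covariates, exposure, outcome = [], [], []
for _ in range(400):
    x1, x2, x3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    a = 1 if rng.random() < sigmoid(-0.5 + 0.8 * x1 + 0.8 * x2) else 0
    y = 1 if rng.random() < sigmoid(-1.0 + 0.5 * a + 0.8 * x1 + 0.8 * x3) else 0
    covariates.append((x1, x2, x3))
    exposure.append(a)
    outcome.append(y)

# Propensity score model restricted to the exposure-related covariates x1, x2.
X_exposure = [(x1, x2) for x1, x2, _ in covariates]
beta = fit_logistic(X_exposure, exposure)
ps = propensity_scores(X_exposure, beta)
risk_diff, n_pairs = match_and_estimate(ps, exposure, outcome)
print(f"matched pairs: {n_pairs}, risk difference: {risk_diff:.3f}")
```

Swapping `X_exposure` for a different covariate subset (for example only outcome-related variables) and repeating the run over many simulated data sets is one way to compare selection strategies by bias and variance, in the spirit of the comparison the abstract describes.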

Updated: 2020-03-23