Bootstrap vs asymptotic variance estimation when using propensity score weighting with continuous and binary outcomes
Statistics in Medicine (IF 2), Pub Date: 2022-07-15, DOI: 10.1002/sim.9519
Peter C Austin

We used Monte Carlo simulations to compare the performance of asymptotic variance estimators to that of the bootstrap when estimating standard errors of differences in means, risk differences, and relative risks using propensity score weighting. We considered four different sets of weights: conventional inverse probability of treatment weights with the average treatment effect (ATE) as the target estimand, weights for estimating the average treatment effect in the treated (ATT), matching weights, and overlap weights. We considered sample sizes ranging from 250 to 10 000 and allowed the prevalence of treatment to range from 0.1 to 0.9. We found that, when using ATE weights and sample sizes were ≤ 1000, then the use of the bootstrap resulted in estimates of SE that were more accurate than the asymptotic estimates. A similar finding was observed when using ATT weights and sample sizes were ≤ 1000 and the prevalence of treatment was moderate to high. When using matching weights and overlap weights, both the asymptotic estimator and the bootstrap resulted in accurate estimates of SE across all sample sizes and prevalences of treatment. Even when using the bootstrap with ATE weights, empirical coverage rates of confidence intervals were suboptimal when sample sizes were low to moderate and the prevalence of treatment was either very low or very high. A similar finding was observed when using the bootstrap with ATT weights when sample sizes were low to moderate and the prevalence of treatment was very high.
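To make the four weighting schemes and the bootstrap concrete, the following is a minimal sketch, not the author's code. It assumes the standard textbook definitions of ATE, ATT, matching, and overlap weights in terms of an estimated propensity score `e`, a logistic propensity model fit by Newton-Raphson, and a bootstrap that re-fits the propensity model in every resample. All function names (`fit_propensity`, `ps_weights`, `weighted_diff_in_means`, `bootstrap_se`) are hypothetical, and the asymptotic (sandwich-type) estimator that the abstract compares against is not shown.

```python
import numpy as np


def fit_propensity(X, z, n_iter=25):
    """Fit a logistic regression of z on X by Newton-Raphson.

    Returns the fitted propensity scores P(Z = 1 | X). Assumes the
    data are not perfectly separated (otherwise the Hessian degenerates).
    """
    Xd = np.column_stack([np.ones(len(z)), X])  # add intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        grad = Xd.T @ (z - p)
        hess = (Xd * (p * (1 - p))[:, None]).T @ Xd
        beta += np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-Xd @ beta))


def ps_weights(e, z, kind):
    """Standard propensity score weights (assumed definitions)."""
    if kind == "ate":       # inverse probability of treatment weights
        return z / e + (1 - z) / (1 - e)
    if kind == "att":       # treated get 1, controls get the odds e/(1-e)
        return z + (1 - z) * e / (1 - e)
    if kind == "matching":  # min(e, 1-e) over probability of received arm
        return np.minimum(e, 1 - e) / (z * e + (1 - z) * (1 - e))
    if kind == "overlap":   # probability of receiving the opposite arm
        return z * (1 - e) + (1 - z) * e
    raise ValueError(f"unknown weight type: {kind}")


def weighted_diff_in_means(X, z, y, kind):
    """Weighted difference in mean outcome, treated minus control."""
    e = fit_propensity(X, z)
    w = ps_weights(e, z, kind)
    treated = z == 1
    mean_treated = np.average(y[treated], weights=w[treated])
    mean_control = np.average(y[~treated], weights=w[~treated])
    return mean_treated - mean_control


def bootstrap_se(X, z, y, kind, n_boot=200, seed=0):
    """Bootstrap standard error: resample subjects with replacement,
    re-fit the propensity model, and re-estimate the effect each time."""
    rng = np.random.default_rng(seed)
    n = len(z)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        estimates[b] = weighted_diff_in_means(X[idx], z[idx], y[idx], kind)
    return estimates.std(ddof=1)
```

Here `X` is an `(n, p)` covariate array, `z` a 0/1 treatment indicator stored as floats, and `y` the outcome; for a binary `y` the same weighted means give a risk difference, and their ratio a relative risk. Re-fitting the propensity model inside each resample lets the bootstrap reflect the uncertainty in the estimated weights, though the abstract itself does not spell out this detail.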

Updated: 2022-07-15