Deep propensity network using a sparse autoencoder for estimation of treatment effects
Journal of the American Medical Informatics Association (IF 4.7), Pub Date: 2021-02-16, DOI: 10.1093/jamia/ocaa346
Shantanu Ghosh, Jiang Bian, Yi Guo, Mattia Prosperi

Abstract
Objective
Drawing causal estimates from observational data is problematic, because datasets often contain underlying bias (eg, discrimination in treatment assignment). To examine causal effects, it is important to evaluate what-if scenarios—the so-called “counterfactuals.” We propose a novel deep learning architecture for propensity score matching and counterfactual prediction—the deep propensity network using a sparse autoencoder (DPN-SA)—to tackle the problems of high dimensionality, nonlinear/nonparallel treatment assignment, and residual confounding when estimating treatment effects.
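As a rough illustration of propensity score matching in general (not the DPN-SA itself), the sketch below pairs each treated unit with the control whose propensity score is closest, then averages the outcome differences. The function name `match_att` and the toy scores and outcomes are hypothetical. In the paper the scores would come from the proposed network or from a baseline such as logistic regression; here they are simply given as inputs.

```python
def match_att(scores, treated, outcomes):
    """Estimate the effect on the treated by 1-nearest-neighbor matching
    on the propensity score, with replacement."""
    controls = [i for i, w in enumerate(treated) if w == 0]
    diffs = []
    for i, w in enumerate(treated):
        if w != 1:
            continue
        # Control unit whose propensity score is closest to treated unit i.
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        diffs.append(outcomes[i] - outcomes[j])
    return sum(diffs) / len(diffs)


# Made-up propensity scores, treatment flags, and observed outcomes.
scores = [0.8, 0.7, 0.3, 0.75, 0.2]
treated = [1, 0, 1, 0, 0]
outcomes = [10.0, 7.0, 5.0, 8.0, 4.0]

att_estimate = match_att(scores, treated, outcomes)
print(att_estimate)
```

Matching with replacement keeps the sketch short; matching without replacement or with a caliper would be equally valid choices that the abstract does not specify.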
Materials and Methods
We used 2 randomized prospective datasets: a semisynthetic one from the Infant Health and Development Program, with nonlinear/nonparallel treatment selection bias and simulated counterfactual outcomes, and a real-world one from LaLonde's employment training program. We compared different configurations of the DPN-SA against logistic regression and LASSO as well as deep counterfactual networks with propensity dropout (DCN-PD). Model performance was assessed over multiple training/test runs in terms of the average treatment effect, the mean squared error of the estimated individual effects (precision in estimation of heterogeneous effects), and the average treatment effect on the treated.
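The three evaluation quantities can be sketched as follows, assuming their usual definitions when both potential outcomes are known (as in a semisynthetic dataset). The function names and the toy numbers are illustrative only and are not taken from the paper's code.

```python
from statistics import mean


def ate(effects):
    """Average treatment effect: mean of the individual effects y1 - y0."""
    return mean(effects)


def pehe(true_effects, est_effects):
    """Mean squared error between estimated and true individual effects."""
    return mean((e - t) ** 2 for t, e in zip(true_effects, est_effects))


def att(effects, treated):
    """Average treatment effect restricted to the treated units."""
    return mean(e for e, w in zip(effects, treated) if w == 1)


# Made-up individual effects for four units (not data from the paper).
true_effects = [4.0, 2.0, 3.0, 5.0]
est_effects = [3.5, 2.5, 3.0, 4.0]
treated = [1, 0, 1, 0]

ate_error = abs(ate(true_effects) - ate(est_effects))
pehe_value = pehe(true_effects, est_effects)
att_error = abs(att(true_effects, treated) - att(est_effects, treated))
print(ate_error, pehe_value, att_error)
```

On a real-world dataset without counterfactual outcomes, only quantities that can be checked against a randomized reference (such as the effect on the treated) are available, so the individual-level error would not apply there.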
Results
The DPN-SA outperformed logistic regression and LASSO by 36%–63%, and DCN-PD by 6%–10% across all datasets. All deep learning architectures yielded average treatment effects close to the true ones with low variance. Results were also robust to noise injection and to the addition of correlated variables. Code is publicly available at https://github.com/Shantanu48114860/DPN-SAz.
Discussion and Conclusion
Deep sparse autoencoders are particularly suited for treatment effect estimation studies using electronic health records because they can handle high-dimensional covariate sets, large sample sizes, and complex heterogeneity in treatment assignments.
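For readers unfamiliar with sparse autoencoders, one common way to encourage sparsity is a Kullback-Leibler penalty that pushes the average activation of each hidden unit toward a small target value. The abstract does not say which penalty the DPN-SA uses, so the sketch below is an assumption for illustration; the target `rho`, the function names, and the activations are hypothetical.

```python
import math


def mean_activations(hidden_batch):
    """Average activation of each hidden unit over a batch (list of rows)."""
    n = len(hidden_batch)
    return [sum(col) / n for col in zip(*hidden_batch)]


def kl_sparsity_penalty(rho, rho_hat):
    """Sum over hidden units of KL(rho || rho_hat_j) between Bernoulli
    distributions with means rho (target) and rho_hat_j (observed)."""
    return sum(
        rho * math.log(rho / r) + (1 - rho) * math.log((1 - rho) / (1 - r))
        for r in rho_hat
    )


# Hypothetical sigmoid activations: 3 samples, 2 hidden units.
hidden = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]
rho_hat = mean_activations(hidden)
penalty = kl_sparsity_penalty(0.2, rho_hat)
print(penalty)
```

In training, a term like this would be added to the reconstruction loss with a weighting coefficient. The first unit already averages about 0.2 and contributes almost nothing, while the second unit averages about 0.8 and is penalized.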


Updated: 2021-02-16