A Causal Dirichlet Mixture Model for Causal Inference from Observational Data
ACM Transactions on Intelligent Systems and Technology (IF 7.2), Pub Date: 2020-05-04, DOI: 10.1145/3379500
Adi Lin, Jie Lu, Junyu Xuan, Fujin Zhu, Guangquan Zhang

Estimating causal effects by making causal inferences from observational data is common practice in scientific studies, business decision-making, and daily life. In today’s data-driven world, causal inference has become a key part of the evaluation process for many purposes, such as examining the effects of medicine or the impact of an economic policy on society. However, although the literature contains some excellent models, there is room to improve their representation power and their ability to capture complex relationships. For these reasons, we propose a novel prior called Causal DP and a model called CDP. The prior captures the complex relationships between covariates, treatments, and outcomes in observational data using a rational probabilistic dependency structure. The model is Bayesian, nonparametric, and generative and is not based on the assumption of any parametric distribution. CDP is designed to estimate various kinds of causal effects—average, conditional average, average treated, quantile, and so on. It performs well with missing covariates and does not suffer from overfitting. Comparative experiments on synthetic datasets against several state-of-the-art methods demonstrate that CDP has a superior ability to capture complex relationships. Further, a simple evaluation to infer the effect of a job training program on trainee earnings from real-world data shows that CDP is both effective and useful for causal inference.
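To make the kinds of causal effects named in the abstract concrete, the following is a minimal sketch on synthetic data. It is not the CDP model and does not reproduce the paper's experiments. The data-generating process, the function names `simulate` and `causal_effects`, and the individual effect of `2 + x` are all hypothetical choices for illustration. Because both potential outcomes are simulated, each estimand can be read off directly, which is impossible with real observational data.

```python
import math
import random
import statistics


def simulate(n=2000, seed=0):
    """Hypothetical synthetic data: covariate x, binary treatment t, and
    both potential outcomes y0 and y1 (both visible only in a simulation)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Units with larger x are more likely to be treated (confounding).
        p_treat = 1.0 / (1.0 + math.exp(-x))
        t = 1 if rng.random() < p_treat else 0
        y0 = x + rng.gauss(0.0, 1.0)
        y1 = y0 + 2.0 + x  # assumed individual effect: 2 + x
        rows.append((x, t, y0, y1))
    return rows


def causal_effects(rows):
    """Compute several estimands directly from the simulated potential outcomes."""
    # Average treatment effect over the whole population.
    ate = statistics.mean([y1 - y0 for _, _, y0, y1 in rows])
    # Average treatment effect on the treated units only.
    att = statistics.mean([y1 - y0 for _, t, y0, y1 in rows if t == 1])
    # Conditional average effect for the subgroup with x > 0.
    cate = statistics.mean([y1 - y0 for x, _, y0, y1 in rows if x > 0])
    # Quantile effect at the median: difference of the marginal medians.
    qte = (statistics.median([y1 for _, _, _, y1 in rows])
           - statistics.median([y0 for _, _, y0, _ in rows]))
    # Naive contrast of observed outcomes, which ignores confounding by x.
    naive = (statistics.mean([y1 for _, t, _, y1 in rows if t == 1])
             - statistics.mean([y0 for _, t, y0, _ in rows if t == 0]))
    return {
        "ate": ate,
        "att": att,
        "cate_x_positive": cate,
        "qte_median": qte,
        "naive_difference": naive,
    }


if __name__ == "__main__":
    for name, value in causal_effects(simulate()).items():
        print(f"{name}: {value:.3f}")
```

In this sketch the naive difference of observed group means overstates the average effect because treatment assignment depends on `x`. That gap is the problem that causal inference methods for observational data, including the model described in the abstract, aim to address.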

Updated: 2020-05-04