Abstract
Matching is routinely used to balance covariates and thereby alleviate confounding bias in causal inference with observational data. Most of the matching literature estimates the propensity score with a parametric model, which makes the results depend heavily on the model specification. In this paper, we combine machine learning and matching techniques to learn the average causal effect. Comparing a variety of machine learning methods for propensity score estimation under extensive scenarios, we find that the ensemble methods, especially generalized random forests, perform favorably compared with the others. We apply all the methods to data on tropical storms that have occurred on the mainland of China since 1949.
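To make the idea of matching on a flexibly estimated propensity score concrete, the following is a minimal sketch and not the authors' procedure. It assumes a simulated toy data set with a single confounder and a constant treatment effect, and it uses a leave-one-out Gaussian kernel smoother as a dependency-free stand-in for the machine learning estimators mentioned above (such as random forests). The function names, the bandwidth of 0.2, and the data-generating process are all hypothetical choices made for illustration.

```python
import math
import random


def simulate(n, effect, rng):
    """Toy observational data: one confounder x drives both treatment and outcome."""
    data = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        p = 1.0 / (1.0 + math.exp(-x))            # true propensity score (assumed)
        t = 1 if rng.random() < p else 0
        y = effect * t + 2.0 * x + rng.gauss(0.0, 1.0)
        data.append((x, t, y))
    return data


def kernel_propensity(data, bandwidth):
    """Leave-one-out Gaussian kernel estimate of P(T = 1 | x).

    A nonparametric stand-in for a machine learning propensity model.
    """
    scores = []
    for i, (xi, _, _) in enumerate(data):
        num = den = 0.0
        for j, (xj, tj, _) in enumerate(data):
            if j == i:
                continue                           # exclude the unit's own treatment
            w = math.exp(-0.5 * ((xj - xi) / bandwidth) ** 2)
            num += w * tj
            den += w
        scores.append(num / den)
    return scores


def naive_difference(data):
    """Unadjusted difference in mean outcomes between treated and control units."""
    treated = [y for _, t, y in data if t == 1]
    control = [y for _, t, y in data if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def matched_att(data, scores):
    """1:1 nearest-neighbour matching with replacement on the estimated score.

    Returns the average treated-minus-matched-control outcome difference.
    """
    controls = [(s, y) for s, (_, t, y) in zip(scores, data) if t == 0]
    diffs = []
    for s, (_, t, y) in zip(scores, data):
        if t == 1:
            _, y_match = min(controls, key=lambda c: abs(c[0] - s))
            diffs.append(y - y_match)
    return sum(diffs) / len(diffs)


rng = random.Random(0)
TRUE_EFFECT = 1.0                                  # constant effect in the simulation
data = simulate(1000, TRUE_EFFECT, rng)
scores = kernel_propensity(data, bandwidth=0.2)
naive = naive_difference(data)
att = matched_att(data, scores)
print(f"naive difference in means: {naive:.2f}")
print(f"matched estimate:          {att:.2f}")
```

Because the confounder raises both the treatment probability and the outcome in this simulation, the naive difference in means should overstate the effect, while the matched estimate should land much closer to the simulated value. The sketch targets the effect on the treated, which coincides with the average causal effect here only because the simulated effect is constant.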
Additional information
This paper is supported by the National Key Research and Development Program of China (Grant No. 2017YFA0604903) and the National Natural Science Foundation of China (Grant Nos. 11671338, 11971064).
About this article
Cite this article
Wu, P., Hu, Qr., Tong, Xw. et al. Learning Causal Effect Using Machine Learning with Application to China’s Typhoon. Acta Math. Appl. Sin. Engl. Ser. 36, 702–713 (2020). https://doi.org/10.1007/s10255-020-0960-1
DOI: https://doi.org/10.1007/s10255-020-0960-1