Learning Causal Effect Using Machine Learning with Application to China’s Typhoon

Wu, Peng; Hu, Qi-rui; Tong, Xing-wei; Wu, Min

doi:10.1007/s10255-020-0960-1

Published: 02 September 2020

Volume 36, pages 702–713, (2020)

Acta Mathematicae Applicatae Sinica, English Series

Peng Wu¹, Qi-rui Hu¹, Xing-wei Tong¹ & Min Wu²

Abstract

Matching is a routinely used technique for balancing covariates and thereby alleviating confounding bias in causal inference with observational data. Most of the matching literature estimates the propensity score with a parametric model, so the results depend heavily on the model specification. In this paper, we employ machine learning and matching techniques to learn the average causal effect. By comparing a variety of machine learning methods for propensity score estimation under extensive scenarios, we find that the ensemble methods, especially generalized random forests, compare favorably with the others. We apply all the methods to data on tropical storms that have occurred on the mainland of China since 1949.
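The workflow summarized in the abstract (estimate the propensity score with a flexible learner, then match on it to estimate the average causal effect) can be illustrated with a small sketch. This is not the authors' implementation: it uses a k-nearest-neighbour estimate of the propensity score as a simple nonparametric stand-in for the machine learning methods the paper compares (such as generalized random forests), and it runs on simulated data with a known effect. The function names, the data-generating process, and the values of `k`, `n`, and `tau` are all assumptions made for illustration.

```python
import math
import random


def simulate(n=400, tau=2.0, seed=0):
    """Simulate confounded observational data with a known constant effect tau.

    Hypothetical data-generating process, chosen only for illustration.
    """
    rng = random.Random(seed)
    X, T, Y = [], [], []
    for _ in range(n):
        x = (rng.gauss(0, 1), rng.gauss(0, 1))
        # Units with larger covariates are more likely to be treated ...
        p = 1.0 / (1.0 + math.exp(-(x[0] + 0.5 * x[1])))
        t = 1 if rng.random() < p else 0
        # ... and the same covariates also raise the outcome (confounding).
        y = tau * t + 2.0 * x[0] + x[1] + rng.gauss(0, 1)
        X.append(x)
        T.append(t)
        Y.append(y)
    return X, T, Y


def naive_difference(T, Y):
    """Unadjusted difference in mean outcomes between treated and controls."""
    treated = [y for t, y in zip(T, Y) if t == 1]
    control = [y for t, y in zip(T, Y) if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def knn_propensity(X, T, k=25):
    """Estimate P(T=1 | X=x) as the treated share among the k nearest units.

    Stand-in for a machine learning propensity model; each unit is excluded
    from its own neighbourhood.
    """
    scores = []
    for i, xi in enumerate(X):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(xi, xj)), j)
            for j, xj in enumerate(X)
            if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        scores.append(sum(T[j] for j in neighbours) / k)
    return scores


def match_ate(scores, T, Y):
    """One-to-one nearest-neighbour matching (with replacement) on the score.

    Each unit is matched to the unit in the opposite treatment group with the
    closest estimated propensity score; the matched differences are averaged.
    """
    treated = [i for i, t in enumerate(T) if t == 1]
    control = [i for i, t in enumerate(T) if t == 0]
    effects = []
    for i in range(len(T)):
        pool = control if T[i] == 1 else treated
        j = min(pool, key=lambda m: abs(scores[m] - scores[i]))
        # Imputed unit-level effect: treated outcome minus control outcome.
        effects.append(Y[i] - Y[j] if T[i] == 1 else Y[j] - Y[i])
    return sum(effects) / len(effects)


if __name__ == "__main__":
    X, T, Y = simulate()
    scores = knn_propensity(X, T)
    print("true effect:      ", 2.0)
    print("naive difference: ", naive_difference(T, Y))
    print("matched estimate: ", match_ate(scores, T, Y))
```

In this simulation the unadjusted comparison is biased upward because the covariates drive both treatment and outcome, while matching on the estimated score removes much of that bias. Swapping `knn_propensity` for another learner changes only the score, not the matching step, which is the kind of comparison the abstract describes.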


Author information

Authors and Affiliations

School of Statistics, Beijing Normal University, Beijing, 100875, China
Peng Wu, Qi-rui Hu & Xing-wei Tong
School of Mathematics and Statistics, Hubei University of Science and Technology, Hubei, 437000, China
Min Wu

Corresponding author

Correspondence to Xing-wei Tong.

Additional information

This paper is supported by the National Key Research and Development Program of China (Grant 2017YFA0604903) and the National Natural Science Foundation of China (Grant Nos. 11671338, 11971064).

About this article

Cite this article

Wu, P., Hu, Qr., Tong, Xw. et al. Learning Causal Effect Using Machine Learning with Application to China’s Typhoon. Acta Math. Appl. Sin. Engl. Ser. 36, 702–713 (2020). https://doi.org/10.1007/s10255-020-0960-1

Received: 18 March 2019
Accepted: 16 December 2019
Published: 02 September 2020
Issue Date: July 2020
DOI: https://doi.org/10.1007/s10255-020-0960-1

2000 MR Subject Classification

62G05
