Predicting counterfactual risks under hypothetical treatment strategies: an application to HIV

  • METHODS
European Journal of Epidemiology

Abstract

The accuracy of a prediction algorithm depends on contextual factors that may vary across deployment settings. To address this inherent limitation of prediction, we propose an approach to counterfactual prediction based on the g-formula to predict risk across populations that differ in their distribution of treatment strategies. We apply this to predict 5-year risk of mortality among persons receiving care for HIV in the U.S. Veterans Health Administration under different hypothetical treatment strategies. First, we implement a conventional approach to develop a prediction algorithm in the observed data and show how the algorithm may fail when transported to new populations with different treatment strategies. Second, we generate counterfactual data under different treatment strategies and use it to assess the robustness of the original algorithm’s performance to these differences and to develop counterfactual prediction algorithms. We discuss how estimating counterfactual risks under a particular treatment strategy is more challenging than conventional prediction as it requires the same data, methods, and unverifiable assumptions as causal inference. However, this may be required when the alternative assumption of constant treatment patterns across deployment settings is unlikely to hold and new data is not yet available to retrain the algorithm.
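
To make the idea concrete, the following is a minimal sketch of the g-formula for the simplest possible case: a single binary treatment, a single binary covariate, and simulated data. It is not the authors' implementation, which concerns treatment strategies in real clinical data. The data-generating process, the function names `simulate` and `g_formula_risk`, and all numeric values are hypothetical. The sketch assumes that the covariate is the only common cause of treatment and outcome.

```python
import random


def simulate(n, p_treat, seed=0):
    """Draw (l, a, y) triples from a hypothetical data-generating process.

    p_treat maps the covariate level l to P(A=1 | L=l), i.e. the
    treatment pattern of the population being simulated.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        l = int(rng.random() < 0.4)          # binary covariate
        a = int(rng.random() < p_treat[l])   # treatment depends on l
        # Outcome risk depends on both covariate and treatment (assumed values).
        risk = 0.10 + 0.30 * l - 0.08 * a - 0.12 * a * l
        y = int(rng.random() < risk)
        rows.append((l, a, y))
    return rows


def g_formula_risk(rows, strategy):
    """Standardized risk under the static strategy 'set A = strategy for all'.

    Computes sum over l of P(Y=1 | A=strategy, L=l) * P(L=l).
    """
    n = len(rows)
    total = 0.0
    for level in (0, 1):
        p_level = sum(1 for l, _, _ in rows if l == level) / n
        stratum = [y for l, a, y in rows if l == level and a == strategy]
        total += (sum(stratum) / len(stratum)) * p_level
    return total


# Observed population: sicker individuals (l = 1) are treated more often.
rows = simulate(200_000, p_treat={0: 0.3, 1: 0.7})

observed_risk = sum(y for _, _, y in rows) / len(rows)
risk_treat_all = g_formula_risk(rows, strategy=1)
risk_treat_none = g_formula_risk(rows, strategy=0)

print(f"observed risk:        {observed_risk:.3f}")
print(f"risk if all treated:  {risk_treat_all:.3f}")
print(f"risk if none treated: {risk_treat_none:.3f}")
```

In this sketch, changing `p_treat` changes the observed risk, whereas the two standardized risks stay the same up to sampling error, because they refer to fixed strategies rather than to the treatment pattern that happened to occur in the data.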



Funding

This research was supported by National Institutes of Health grants K99 CA248335 (B.A.D.) and R37 AI02634 and Providence/Boston Center for AIDS Research grant P30 AI042853 (S.L.).

Author information

Corresponding author

Correspondence to Barbra A. Dickerman.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Dickerman, B.A., Dahabreh, I.J., Cantos, K.V. et al. Predicting counterfactual risks under hypothetical treatment strategies: an application to HIV. Eur J Epidemiol 37, 367–376 (2022). https://doi.org/10.1007/s10654-022-00855-8

  • DOI: https://doi.org/10.1007/s10654-022-00855-8
