Abstract
Traditionally, model averaging has been viewed as an alternative to model selection, with the ultimate goal of incorporating the uncertainty associated with the model selection process into standard errors and confidence intervals by using a weighted combination of candidate models. In recent years, a new class of model averaging estimators has emerged in the literature, which combines models such that the squared risk, or another risk function, is minimized. We argue that, contrary to popular belief, these estimators do not necessarily address the challenges induced by model selection uncertainty; rather, they should be regarded as attractive complements to the machine learning and forecasting literature, as well as tools for identifying causal parameters. We illustrate our point by means of several targeted simulation studies.
References
Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Proceedings of the second international symposium on information theory, Budapest, pp 267–281
Bang H, Robins JM (2005) Doubly robust estimation in missing data and causal inference models. Biometrics 61(4):962–973
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Buckland ST, Burnham KP, Augustin NH (1997) Model selection: an integral part of inference. Biometrics 53:603–618
Burnham K, Anderson D (2002) Model selection and multimodel inference. A practical information-theoretic approach. Springer, New York
Chatfield C (1995) Model uncertainty, data mining and statistical inference. J R Stat Soc A 158:419–466
Cheng TCF, Ing CK, Yu SH (2015) Toward optimal model averaging in regression models with time series errors. J Econometr 189(2):321–334
Daniel RM, Cousens SN, De Stavola BL, Kenward MG, Sterne JA (2013) Methods for dealing with time-dependent confounding. Stat Med 32(9):1584–1618
Draper D (1995) Assessment and propagation of model uncertainty. J R Stat Soc B 57:45–97
Fletcher D, Dillingham PW (2011) Model-averaged confidence intervals for factorial experiments. Comput Stat Data Anal 55:3041–3048
Friedman J, Hastie T, Tibshirani R (2010) Regularization paths for generalized linear models via coordinate descent. J Stat Softw 33(1):1–22
Gao Y, Zhang XY, Wang SY, Zou GH (2016) Model averaging based on leave-subject-out cross-validation. J Econometr 192(1):139–151
Gelman A, Su YS (2016) arm: data analysis using regression and multilevel/hierarchical models. R package version 1.9-3. https://CRAN.R-project.org/package=arm. Accessed 12 Sept 2018
Gruber S, van der Laan MJ (2012) tmle: an R package for targeted maximum likelihood estimation. J Stat Softw 51(13):1–35
Hansen BE (2007) Least squares model averaging. Econometrica 75:1175–1189
Hansen BE (2008) Least squares forecast averaging. J Econometr 146:342–350
Hansen BE, Racine J (2012) Jackknife model averaging. J Econometr 167:38–46
Hjort NL, Claeskens G (2003) Frequentist model average estimators. J Am Stat Assoc 98:879–945
Hoeting JA, Madigan D, Raftery AE, Volinsky CT (1999) Bayesian model averaging: a tutorial. Stat Sci 14:382–417
Kabaila P, Welsh A, Abeysekera W (2016) Model-averaged confidence intervals. Scand J Stat 43:35–48
Leeb H, Pötscher BM (2005) Model selection and inference: facts and fiction. Econometr Theory 21:21–59
Leeb H, Pötscher BM (2008) Model selection. Springer, New York, pp 785–821
Lendle SD, Schwab J, Petersen ML, van der Laan MJ (2017) ltmle: an R package implementing targeted minimum loss-based estimation for longitudinal data. J Stat Softw 81(1):1–21
Liang H, Zou GH, Wan ATK, Zhang XY (2011) Optimal weight choice for frequentist model average estimators. J Am Stat Assoc 106(495):1053–1066
Liu C, Kuo B (2016) Model averaging in predictive regressions. Econometr J 19(2):203–231
Liu QF, Okui R, Yoshimura A (2016) Generalized least squares model averaging. Econometr Rev 35(8–10):1692–1752
Mallows C (1973) Some comments on \(C_p\). Technometrics 15:661–675
Petersen M, Schwab J, Gruber S, Blaser N, Schomaker M, van der Laan M (2014) Targeted maximum likelihood estimation for dynamic and static longitudinal marginal structural working models. J Causal Inference 2:147–185
Polley E, LeDell E, Kennedy C, van der Laan M (2017) SuperLearner: super learner prediction. R package version 2.0-22. https://CRAN.R-project.org/package=SuperLearner. Accessed 12 Sept 2018
Pötscher B (2006) The distribution of model averaging estimators and an impossibility result regarding its estimation. Lect Notes Monogr Ser 52:113–129
Raftery A, Hoeting J, Volinsky C, Painter I, Yeung KY (2017) BMA: Bayesian model averaging. R package version 3.18.7. https://CRAN.R-project.org/package=BMA. Accessed 12 Sept 2018
Rao CR, Wu Y (2001) On model selection. Lect Notes Monogr Ser 38:1–64
Robins J, Hernan MA (2009) Estimation of the causal effects of time-varying exposures. In: Fitzmaurice G, Davidian M, Verbeke G, Molenberghs G (eds) Longitudinal data analysis. CRC Press, Boca Raton, pp 553–599
Sala-I-Martin X, Doppelhofer G, Miller RI (2004) Determinants of long-term growth: a Bayesian averaging of classical estimates (bace) approach. Am Econ Rev 94(4):813–835
Schomaker M (2012) Shrinkage averaging estimation. Stat Pap 53(4):1015–1034
Schomaker M (2017a) MAMI: model averaging (and model selection) after multiple imputation. R package version 0.9.10
Schomaker M (2017b) Model averaging and model selection after multiple imputation using the R-package MAMI. http://mami.r-forge.r-project.org. Accessed 12 Sept 2018
Schomaker M, Heumann C (2014) Model selection and model averaging after multiple imputation. Comput Stat Data Anal 71:758–770
Schomaker M, Heumann C (2018) Bootstrap inference when using multiple imputation. Stat Med 37(14):2252–2266
Schomaker M, Davies MA, Malateste K, Renner L, Sawry S, N’Gbeche S, Technau K, Eboua FT, Tanser F, Sygnate-Sy H, Phiri S, Amorissani-Folquet M, Cox V, Koueta F, Chimbete C, Lawson-Evi A, Giddy J, Amani-Bosse C, Wood R, Egger M, Leroy V (2016) Growth and mortality outcomes for different antiretroviral therapy initiation criteria in children aged 1–5 years: a causal modelling analysis from West and Southern Africa. Epidemiology 27:237–246
Sofrygin O, van der Laan MJ, Neugebauer R (2017) simcausal R package: conducting transparent and reproducible simulation studies of causal effect estimation with complex longitudinal data. J Stat Softw 81(2):1–47
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc B 58:267–288
Turek D, Fletcher D (2012) Model-averaged Wald confidence intervals. Comput Stat Data Anal 56:2809–2815
Van der Laan M, Petersen M (2007) Statistical learning of origin-specific statically optimal individualized treatment rules. Int J Biostat 3:3
Van der Laan M, Rose S (2011) Targeted learning. Springer, New York
Van der Laan M, Polley E, Hubbard A (2007) Super learner. Stat Appl Genet Mol Biol 6:25
Wan ATK, Zhang X, Zou GH (2010) Least squares model averaging by Mallows criterion. J Econometr 156:277–283
Wang H, Zhou S (2012) Interval estimation by frequentist model averaging. Commun Stat Theory Methods 42(23):4342–4356
Wang H, Zhang X, Zou G (2009) Frequentist model averaging: a review. J Syst Sci Complex 22:732–748
Wood SN (2006) Generalized additive models: an introduction with R. Chapman and Hall/CRC, Boca Raton
Yan J (2007) Enjoy the joy of copulas: with package copula. J Stat Softw 21:1–21
Zhang X, Liu CA (2017) Inference after model averaging in linear regression models. IEAS working paper: academic research 17-A005. Institute of Economics, Academia Sinica, Taipei, Taiwan. https://ideas.repec.org/p/sin/wpaper/17-a005.html. Accessed 12 Sept 2018
Zhang XY, Zou GH, Liang H (2014) Model averaging and weight choice in linear mixed-effects models. Biometrika 101(1):205–218
Zhang XY, Zou GH, Carroll RJ (2015) Model averaging based on Kullback-Leibler distance. Stat Sin 25(4):1583–1598
Zhang XY, Ullah A, Zhao SW (2016a) On the dominance of the Mallows model averaging estimator over the ordinary least squares estimator. Econ Lett 142:69–73
Zhang XY, Yu DL, Zou GH, Liang H (2016b) Optimal model averaging estimation for generalized linear models and generalized linear mixed-effects models. J Am Stat Assoc 111(516):1775–1790
Appendices
Appendix A: Software
We have implemented Mallows' model averaging, jackknife model averaging, and lasso averaging in the R-package MAMI (Schomaker 2017a), available at http://mami.r-forge.r-project.org/. In addition, we have implemented several wrappers that make optimal model averaging easy to use for super learning (Polley et al. 2017) and in conjunction with causal inference packages such as tmle (Gruber and van der Laan 2012) and ltmle (Lendle et al. 2017). The available wrappers are listed by calling listSLWrappers(), and examples are given in the documentation (Schomaker 2017b).
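MAMI itself is an R package; as a language-agnostic illustration of the weight-selection idea behind Mallows model averaging (Hansen 2007; Wan et al. 2010), the following Python sketch chooses weights on the simplex by minimizing the Mallows criterion \(C(w)=\Vert y-\hat{\mu }(w)\Vert ^2+2\hat{\sigma }^2\sum _m w_m k_m\). The data-generating process and the nested candidate set are hypothetical, chosen only to keep the sketch self-contained.

```python
# Illustrative sketch of Mallows model averaging over nested OLS candidates
# (not MAMI's implementation; simulated data, hypothetical candidate set).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n, p = 200, 4
X = rng.normal(size=(n, p))
y = 1.0 + X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)

# Candidate models: intercept plus the first m columns of X, m = 0..p.
designs = [np.column_stack([np.ones(n)] + [X[:, :m]] if m else [np.ones(n)])
           for m in range(p + 1)]
fits, ks = [], []
for Z in designs:
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    fits.append(Z @ beta)
    ks.append(Z.shape[1])
F = np.column_stack(fits)          # n x M matrix of candidate fitted values
ks = np.array(ks, dtype=float)

# Error variance estimated from the largest candidate model.
sigma2 = np.sum((y - F[:, -1]) ** 2) / (n - ks[-1])

def mallows(w):
    resid = y - F @ w
    return resid @ resid + 2.0 * sigma2 * (w @ ks)

M = F.shape[1]
res = minimize(mallows, np.full(M, 1.0 / M), method="SLSQP",
               bounds=[(0.0, 1.0)] * M,
               constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
w_hat = res.x                      # estimated model weights on the simplex
print(np.round(w_hat, 3))
```

Since the criterion is quadratic in \(w\), a quadratic-programming solver could be used instead of the general-purpose optimizer shown here.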
Appendix B: Notation and background on the sequential g-formula
Consider a sample of size n for which measurements are available both at baseline (\(t=0\)) and during a series of follow-up times \(t=1,\ldots ,T\). At each point in time we measure the outcome \(Y_t\), the intervention \(A_t\), time-dependent covariates \(\mathbf {L}_t=\{L^1_t,\ldots ,L^q_t\}\), and a censoring indicator \(C_t\). \(\mathbf {{L}}_{t}\) may include baseline variables \(\mathbf {V}=\{L^1_0,\ldots ,L^{q_V}_0\}\) and can potentially contain variables which refer to the outcome variable before time t, for instance \({Y}_{t-1}\). The treatment and covariate histories of an individual i up to and including time t are represented as \(\bar{A}_{t,i}=(A_{0,i},\ldots ,A_{t,i})\) and \(\bar{L}^s_{t,i}=(L^s_{0,i},\ldots ,L^s_{t,i})\) respectively. \(C_t\) equals 1 if a subject is censored in the interval \((t-1,t]\), and 0 otherwise. Thus, \(\bar{C}_t=0\) is the event that an individual remains uncensored until time t.
The counterfactual outcome \(Y_{t,i}^{\bar{a}_{t}}\) refers to the hypothetical outcome that would have been observed at time t if subject i had received, possibly contrary to fact, the treatment history \(\bar{A}_{t,i}=\bar{a}_t\). Similarly, \(\mathbf {L}_{t,i}^{\bar{a}_{t}}\) are the counterfactual covariates related to the intervention \(\bar{A}_{t,i}=\bar{a}_t\). This notation refers to static treatment rules; a treatment rule may, however, depend on covariates, in which case it is called dynamic. A dynamic rule \({d}_t({\bar{\mathbf {L}}}_{t})\) assigns treatment \({A}_{t,i} \in \{0,1\}\) as a function of the covariate history \({\bar{\mathbf {L}}}_{t,i}\). The vector of decisions \(d_t\), \(t=0,\ldots ,T\), is denoted by \(\bar{d}_T=\bar{d}\). The notation \(\bar{A}_t = \bar{d}\) refers to the treatment history according to the rule \(\bar{d}\). The counterfactual outcome related to a dynamic rule \(\bar{d}\) is \(Y_{t,i}^{\bar{d}}\), and the counterfactual covariates are \(\mathbf {L}_{t,i}^{\bar{d}}\).
In Sect. 3.3 the main quantity of interest is the expected value of Y at time 6, under no censoring, for a given treatment rule \(\bar{d}\); that is, \(\psi =\mathbb {E}(Y_6^{\bar{d}})\).
The sequential g-formula can be used to estimate this target quantity by sequentially marginalizing the distribution with respect to \(\mathbf {L}\), given the intervention rule of interest. It holds that
\[
\psi = \mathbb {E}\left(\mathbb {E}\left(\cdots \mathbb {E}\left(\mathbb {E}\left(Y_T \,|\, \bar{A}_{T-1}=\bar{d}_{T-1}, \bar{\mathbf {L}}_{T}\right) \,|\, \bar{A}_{T-2}=\bar{d}_{T-2}, \bar{\mathbf {L}}_{T-1}\right)\cdots \,|\, A_0=d_0, \mathbf {L}_0\right)\right), \qquad (14)
\]
see for example Bang and Robins (2005). Equation (14) is valid under several assumptions: sequential conditional exchangeability, consistency, positivity, and the time ordering \(\mathbf {L}_t \rightarrow A_t \rightarrow C_t \rightarrow Y_t\). These assumptions essentially mean that all confounders are measured, that the intervention is well-defined, and that individuals have a positive probability of continuing to receive treatment according to the assigned treatment rule, given that they have done so thus far and irrespective of the covariate history; see Daniel et al. (2013) and Robins and Hernan (2009) for more details and interpretations. Note that the two interventions defined in Sect. 3 also assign \(C_t=0\), meaning that we are interested in the effect estimate under no censoring.
To estimate \(\psi \) one proceeds as follows for \(t=T,\ldots ,0\): (i) use an appropriate model to estimate \(\mathbb {E}(Y_T|{\bar{A}}_{T-1}={\bar{d}}_{T-1}, {\bar{\mathbf{L}}}_T)\). The model is fit on all subjects that are uncensored (until \(T-1\)). Note that the outcome refers to the measured outcome for \(t=T\) and to the prediction (of the conditional outcome) from step (ii) if \(t<T\). Then, (ii) plug in \(\bar{A}_t=\bar{d}_{t}\) to predict \(Y_t\) at time t; (iii) for \(t=0\), the estimate \(\hat{\psi }\) is obtained by calculating the arithmetic mean of the predicted outcomes from step (ii).
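The three steps above can be sketched for a deliberately simplified setting (\(T=2\), a static rule \(\bar{a}=(1,1)\), no censoring, linear outcome models, and hypothetical simulated data; none of this corresponds to the paper's actual analysis):

```python
# Illustrative sketch of sequential g-computation (iterated outcome
# regressions), not the paper's analysis: hypothetical simulated data.
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Simple longitudinal structure L_0 -> A_0 -> L_1 -> A_1 -> Y.
L0 = rng.normal(size=n)
A0 = rng.binomial(1, 1.0 / (1.0 + np.exp(-L0)))
L1 = L0 + A0 + rng.normal(size=n)
A1 = rng.binomial(1, 1.0 / (1.0 + np.exp(-L1)))
Y = L1 + A1 + rng.normal(size=n)

def ols_fit(Z, y):
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta

# Step (i) for t = T: regress Y on the full history; step (ii): predict
# under the intervention A_0 = A_1 = 1.
Z = np.column_stack([np.ones(n), A1, L1, A0, L0])
beta = ols_fit(Z, Y)
Z_int = Z.copy(); Z_int[:, 1] = 1.0; Z_int[:, 3] = 1.0
Q2 = Z_int @ beta

# Step (i)/(ii) for t = T-1: regress the prediction on the earlier history
# and predict under A_0 = 1.
Z = np.column_stack([np.ones(n), A0, L0])
beta = ols_fit(Z, Q2)
Z_int = Z.copy(); Z_int[:, 1] = 1.0
Q1 = Z_int @ beta

# Step (iii): average the final predictions.
psi_hat = Q1.mean()
print(round(psi_hat, 2))  # close to the true value of 2 for this design
```

With correctly specified outcome models and large n, \(\hat{\psi }\) is consistent for \(\mathbb {E}(Y^{(1,1)})\), which equals 2 in this design; in practice the regressions in step (i) would typically use flexible learners such as super learning rather than plain OLS.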
Appendix C: Data generating process in the simulation study
Both baseline data (\(t=0\)) and follow-up data (\(t=1,\ldots ,12\)) were created using structural equations, with the R-package simcausal (Sofrygin et al. 2017). The distributions below, listed in temporal order, describe the data-generating process motivated by the analysis of Schomaker et al. (2016). Baseline data refer to region, sex, age, CD4 count, CD4%, WAZ and HAZ respectively (\(V^1\), \(V^2\), \(V^3\), \(L^1_0\), \(L^2_0\), \(L^3_0\), \(Y_0\)); see Schomaker et al. (2016) for a full explanation of the variables and the motivating question. Follow-up data refer to CD4 count, CD4%, WAZ and HAZ (\(L^1_t\), \(L^2_t\), \(L^3_t\), \(Y_t\)), as well as a treatment (\(A_t\)) and a censoring (\(C_t\)) indicator. In addition to Bernoulli (B), uniform (U) and normal (N) distributions, we also use truncated normal distributions, denoted by \(N_{[a,b]}\), where a and b are the truncation levels. Values smaller than a are replaced by a random draw from a \(U(a_1,a_2)\) distribution, and values greater than b are drawn from a \(U(b_1,b_2)\) distribution. The values of \((a_1, a_2, b_1, b_2)\) are (0, 50, 5000, 10000) for \(L^1\), (0.03, 0.09, 0.7, 0.8) for \(L^2\), and \((-10,3,3,10)\) for both \(L^3\) and Y. The notation \(\bar{\mathcal {D}}\) means "conditional on the data that has already been measured (generated) according to the time ordering".
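The truncation device can be sketched as follows (Python; the mean and standard deviation used here for \(L^1\) are hypothetical, only the truncation and replacement levels are taken from the description above):

```python
# Sketch of the truncated-normal sampling scheme described above: draws
# outside [a, b] are replaced by uniform draws from U(a1, a2) or U(b1, b2).
import numpy as np

rng = np.random.default_rng(7)

def r_trunc_normal(n, mean, sd, a, b, a1, a2, b1, b2, rng):
    x = rng.normal(mean, sd, size=n)
    low, high = x < a, x > b
    x[low] = rng.uniform(a1, a2, size=low.sum())   # replace left-tail draws
    x[high] = rng.uniform(b1, b2, size=high.sum()) # replace right-tail draws
    return x

# L^1 (CD4 count) example: [a, b] = [0, 5000], (a1, a2, b1, b2) =
# (0, 50, 5000, 10000); mean and sd are hypothetical.
x = r_trunc_normal(10_000, mean=650, sd=350, a=0, b=5000,
                   a1=0, a2=50, b1=5000, b2=10_000, rng=rng)
print(x.min() >= 0 and x.max() <= 10_000)  # True
```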
For \(t=0\):
For \(t>0\):
Cite this article
Schomaker, M., Heumann, C. When and when not to use optimal model averaging. Stat Papers 61, 2221–2240 (2020). https://doi.org/10.1007/s00362-018-1048-3