Quantile regression-based Bayesian joint modeling analysis of longitudinal–survival data, with application to an AIDS cohort study

Lifetime Data Analysis

Abstract

In longitudinal studies, it is of interest to investigate how repeatedly measured markers are associated with time to an event. Joint models have received increasing attention for analyzing such complex longitudinal–survival data with multiple data features, but most of them are mean regression-based models. This paper formulates quantile regression (QR)-based joint models in general forms that account for left-censoring due to the limit of detection, covariates with measurement errors, and skewness. The joint models consist of three components: (i) a QR-based nonlinear mixed-effects Tobit model using the asymmetric Laplace distribution for the response dynamic process; (ii) a nonparametric linear mixed-effects model with a skew-normal distribution for the mismeasured covariate; and (iii) a Cox proportional hazards model for the event time. To estimate the model parameters simultaneously, we propose a Bayesian method that jointly models the three components, which are linked through the random effects. We apply the proposed modeling procedure to analyze the Multicenter AIDS Cohort Study data, and assess the performance of the proposed models and method through simulation studies. The findings suggest that our QR-based joint models may provide a comprehensive understanding of heterogeneous outcome trajectories at different quantiles, and more reliable and robust results if the data exhibit these features.

References

  • Arellano-Valle RB, Genton MG (2005) On fundamental skew distributions. J Multivar Anal 96(1):93–116

  • Arellano-Valle R, Bolfarine H, Lachos V (2007) Bayesian inference for skew-normal linear mixed models. J Appl Stat 34(6):663–682

  • Azzalini A, Capitanio A (1999) Statistical applications of the multivariate skew normal distribution. J R Stat Soc 61(3):579–602

  • Brown ER (2009) Assessing the association between trends in a biomarker and risk of event with an application in pediatric HIV/AIDS. Ann Appl Stat 3(3):1163

  • Brown ER, Ibrahim JG (2003) A Bayesian semiparametric joint hierarchical model for longitudinal and survival data. Biometrics 59(2):221–228

  • Brown ER, Ibrahim JG, DeGruttola V (2005) A flexible b-spline model for multiple longitudinal biomarkers and survival. Biometrics 61(1):64–73

  • Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM (2006) Measurement error in nonlinear models: a modern perspective. CRC Press, Boca Raton

  • Chen Q, May RC, Ibrahim JG, Chu H, Cole SR (2014) Joint modeling of longitudinal and survival data with missing and left-censored time-varying covariates. Stat Med 33(26):4560–4576

  • Clayton DG (1991) A Monte Carlo method for Bayesian inference in frailty models. Biometrics 47(2):467–485

  • Dagne GA, Huang Y (2011) Mixed-effects Tobit joint models for longitudinal data with skewness, detection limits, and measurement errors. J Probab Stat 2012:1–19

  • Dagne G, Huang Y (2012) Bayesian inference for a nonlinear mixed-effects Tobit model with multivariate skew-t distributions: application to AIDS studies. Int J Biostat 8(1)

  • Davidian M, Giltinan DM (1995) Nonlinear models for repeated measurement data, vol 62. CRC Press, Boca Raton

  • Davino C, Furno M, Vistocco D (2013) Quantile regression: theory and applications. Wiley, Hoboken

  • Elashoff RM, Li G, Li N (2008) A joint model for longitudinal measurements and survival data in the presence of multiple failure types. Biometrics 64(3):762–771

  • Farcomeni A (2012) Quantile regression for longitudinal data based on latent Markov subject-specific parameters. Stat Comput 22(1):141–152

  • Farcomeni A, Viviani S (2015) Longitudinal quantile regression in the presence of informative dropout through longitudinal–survival joint modeling. Stat Med 34(7):1199–1213

  • Ganjali M, Baghfalaki T (2015) A copula approach to joint modeling of longitudinal measurements and survival times using Monte Carlo expectation-maximization with application to AIDS studies. J Biopharm Stat 25(5):1077–1099

  • Gelman A, Rubin DB (1992) Inference from iterative simulation using multiple sequences. Stat Sci 7(4):457–472

  • Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2014) Bayesian data analysis, vol 2. CRC Press, Boca Raton

  • Genton MG (2004) Skew-elliptical distributions and their applications: a journey beyond normality. CRC Press, Boca Raton

  • Geraci M, Bottai M (2007) Quantile regression for longitudinal data using the asymmetric laplace distribution. Biostatistics 8(1):140–154

  • He X, Fu B, Fung WK (2003) Median regression for longitudinal data. Stat Med 22(23):3655–3669

  • Henderson R, Diggle P, Dobson A (2000) Joint modelling of longitudinal measurements and event time data. Biostatistics 1(4):465–480

  • Hu W, Li G, Li N (2009) A Bayesian approach to joint analysis of longitudinal measurements and competing risks failure time data. Stat Med 28(11):1601–1619

  • Huang Y (2016) Quantile regression-based Bayesian semiparametric mixed-effects models for longitudinal data with non-normal, missing and mismeasured covariate. J Stat Comput Simul 86(6):1183–1202

  • Huang Y, Chen J (2016) Bayesian quantile regression-based nonlinear mixed-effects joint models for time-to-event and longitudinal data with multiple features. Stat Med 35(30):5666–5685

  • Huang Y, Dagne G (2011) A Bayesian approach to joint mixed-effects models with a skew-normal distribution and measurement errors in covariates. Biometrics 67(1):260–269

  • Huang Y, Liu D, Wu H (2006) Hierarchical Bayesian methods for estimation of parameters in a longitudinal HIV dynamic system. Biometrics 62(2):413–423

  • Huang Y, Dagne G, Wu L (2011) Bayesian inference on joint models of HIV dynamics for time-to-event and longitudinal data with skewness and covariate measurement errors. Stat Med 30(24):2930–2946

  • Jara A, Quintana F, San Martín E (2008) Linear mixed models with skew-elliptical distributions: a Bayesian approach. Comput Stat Data Anal 52(11):5033–5045

  • Johnson NL, Kotz S, Balakrishnan N (1995) Continuous univariate distributions, vol 2, 2nd edn. Wiley, New York

  • Kaslow RA, Ostrow DG, Detels R, Phair JP, Polk BF, Rinaldo CR (1987) The Multicenter AIDS Cohort Study: rationale, organization, and selected characteristics of the participants. Am J Epidemiol 126(2):310–318

  • Kim MO, Yang Y (2012) Semiparametric approach to a random effects quantile regression model. J Am Stat Assoc 106(496):1405–1417

  • Kobayashi G, Kozumi H (2012) Bayesian analysis of quantile regression for censored dynamic panel data. Comput Stat 27(2):359–380

  • Koenker R (2004) Quantile regression for longitudinal data. J Multivar Anal 91(1):74–89

  • Koenker R (2005) Quantile regression, vol 38. Cambridge University Press, Cambridge

  • Koenker R, Bassett G Jr (1978) Regression quantiles. Econometrica 46:33–50

  • Koenker R, Machado JA (1999) Goodness of fit and related inference processes for quantile regression. J Am Stat Assoc 94(448):1296–1310

  • Kotz S, Kozubowski TJ, Podgórski K (2002) Maximum likelihood estimation of asymmetric Laplace parameters. Ann Inst Stat Math 54(4):816–826

  • Kotz S, Kozubowski TJ, Podgórski K (2001) Asymmetric multivariate Laplace distribution. In: The Laplace distribution and generalizations. Springer, New York, pp 239–272

  • Kozumi H, Kobayashi G (2011) Gibbs sampling methods for Bayesian quantile regression. J Stat Comput Simul 81(11):1565–1578

  • Lipsitz SR, Fitzmaurice GM, Molenberghs G, Zhao LP (1997) Quantile regression methods for longitudinal data with drop-outs: application to CD4 cell counts of patients infected with the human immunodeficiency virus. J R Stat Soc 46(4):463–476

  • Liu Y, Bottai M (2009) Mixed-effects models for conditional quantiles with longitudinal data. Int J Biostat 5(1)

  • Liu W, Wu L (2007) Simultaneous inference for semiparametric nonlinear mixed-effects models with covariate measurement errors and missing responses. Biometrics 63(2):342–350

  • Lunn DJ, Thomas A, Best N, Spiegelhalter D (2000) WinBUGS-a Bayesian modelling framework: concepts, structure, and extensibility. Stat Comput 10(4):325–337

  • Luo Y, Lian H, Tian M (2012) Bayesian quantile regression for longitudinal data models. J Stat Comput Simul 82(11):1635–1649

  • Perelson AS, Essunger P, Cao Y, Vesanen M, Hurley A, Saksela K, Markowitz M, Ho DD (1997) Decay characteristics of HIV-1-infected compartments during combination therapy. Nature 387:188–191

  • Reich BJ, Fuentes M, Dunson DB (2012) Bayesian spatial quantile regression. J Am Stat Assoc 106:6–22

  • Rizopoulos D (2010) JM: an R package for the joint modelling of longitudinal and time-to-event data. J Stat Softw 35(9):1–33

  • Rizopoulos D (2011) Dynamic predictions and prospective accuracy in joint models for longitudinal and time-to-event data. Biometrics 67(3):819–829

  • Rizopoulos D (2012) Joint models for longitudinal and time-to-event data: with applications in R. CRC Press, Boca Raton

  • Sahu SK, Dey DK, Branco MD (2003) A new class of multivariate skew distributions with applications to Bayesian regression models. Can J Stat 31(2):129–150

  • Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A (2002) Bayesian measures of model complexity and fit. J R Stat Soc 64(4):583–639

  • Tang AM, Tang NS, Zhu H (2017) Influence analysis for skew-normal semiparametric joint models of multivariate longitudinal and multivariate survival data. Stat Med 36(9):1476–1490

  • Tian Y, Tian M (2015) Bayesian joint quantile regression for mixed effects models with censoring and errors in covariates. Comput Stat 31(3):1031–1057

  • Tsiatis AA, Davidian M (2004) Joint modeling of longitudinal and time-to-event data: an overview. Statistica Sinica 14:809–834

  • Wang HJ, Fygenson M (2009) Inference for censored quantile regression models in longitudinal studies. Ann Stat 37(2):756–781

  • Wang Y, Taylor JMG (2001) Jointly modeling longitudinal and event time data with application to acquired immunodeficiency syndrome. J Am Stat Assoc 96(455):895–905

  • Wu L (2002) A joint model for nonlinear mixed-effects models with censoring and covariates measured with error, with application to AIDS studies. J Am Stat Assoc 97(460):955–964

  • Wu H, Ding AA (1999) Population HIV-1 dynamics in vivo: applicable models and inferential tools for virological data from AIDS clinical trials. Biometrics 55(2):410–418

  • Wu H, Zhang JT (2006) Nonparametric regression methods for longitudinal data analysis: mixed-effects modeling approaches, vol 515. Wiley, Hoboken

  • Wu L, Liu W, Hu X (2010) Joint inference on HIV viral dynamics and immune suppression in presence of measurement errors. Biometrics 66(2):327–335

  • Yi G, Liu W, Wu L (2011) Simultaneous inference and bias analysis for longitudinal data with covariate measurement error and missing responses. Biometrics 67(1):67–75

  • Yu K, Moyeed RA (2001) Bayesian quantile regression. Stat Probab Lett 54(4):437–447

  • Yu K, Stander J (2007) Bayesian analysis of a Tobit quantile regression model. J Econom 137(1):260–276

  • Yu K, Zhang J (2005) A three-parameter asymmetric Laplace distribution and its extension. Commun Stat Theory Methods 34(9–10):1867–1879

  • Yu K, Lu Z, Stander J (2003) Quantile regression: applications and current research areas. J R Stat Soc 52(3):331–350

  • Yuan Y, Yin G (2010) Bayesian quantile regression for longitudinal studies with nonignorable missing data. Biometrics 66(1):105–114

Author information

Corresponding author

Correspondence to Yangxin Huang.

Ethics declarations

Conflicts of interest

The authors have declared no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Skew-normal distribution and asymmetric Laplace distribution

A.1: Skew-normal distribution

Different versions of multivariate skew distributions have been proposed and used in the literature (Arellano-Valle et al. 2007; Arellano-Valle and Genton 2005; Azzalini and Capitanio 1999; Jara et al. 2008; Sahu et al. 2003). A new class of distributions, obtained by introducing skewness into multivariate elliptical distributions and referred to as skew-elliptical (SE) distributions, was developed in the literature (Genton 2004; Sahu et al. 2003). This class, which is obtained by transformation and conditioning, contains many standard families, including the multivariate skew-normal (SN) distribution as a special case. A k-dimensional random vector \(\varvec{Y}\) follows a k-variate SE distribution if its probability density function (pdf) is given by

$$\begin{aligned} f(\varvec{y}|\varvec{\mu },\varvec{\varSigma },\varvec{\varGamma };m^{(k)}_{\nu })= 2^k f(\varvec{y}|\varvec{\mu }, \varvec{A};m^{(k)}_{\nu })P(\varvec{V}>\mathbf{0 }), \end{aligned}$$
(A.1)

where \(\varvec{A}=\varvec{\varSigma }+\varvec{\varGamma }^2\), \(\varvec{\mu }\) is a location parameter vector, \(\varvec{\varSigma }\) is a \(k \times k\) positive definite (diagonal) covariance matrix, \(\varvec{\varGamma }=\text {diag}(\delta _1, \delta _2,\ldots , \delta _k)\) is a \(k \times k\) skewness matrix with the skewness parameter vector \(\varvec{\delta }=(\delta _1,\delta _2,\ldots ,\delta _k)^T\); \(\varvec{V}\) follows the elliptical distribution \(El\left( \varvec{\varGamma }\varvec{A}^{-1}(\varvec{y}-\varvec{\mu }), \varvec{I}_{k}-\varvec{\varGamma }\varvec{A}^{-1}\varvec{\varGamma }; m^{(k)}_{\nu }\right) \) and the density generator function is \(m^{(k)}_{\nu }(u)=\frac{\varGamma (k/2)}{\pi ^{k/2}}\frac{m_{\nu }(u)}{\int _0^{\infty }r^{k/2-1}m_{\nu }(r)dr}\), with \(m_{\nu }(\cdot )\) being a function such that \(\int _0^{\infty }r^{k/2-1}m_{\nu }(r)dr\) exists. Here \(\varGamma (\cdot )\) in the generator denotes the gamma function, not the skewness matrix \(\varvec{\varGamma }\). The function \(m_{\nu }(u)\) provides the kernel of the original elliptical density and may depend on the parameter \(\nu \). This SE distribution is denoted by \(SE(\varvec{\mu },\varvec{\varSigma },\varvec{\varGamma };m^{(k)}_{\nu })\). One example of \(m_{\nu }(u)\), leading to an important special case used throughout the paper, is \(m_{\nu }(u)=\exp (-u/2)\). This choice yields the multivariate SN distribution.

A normal distribution is the special case of an SN distribution in which the skewness parameter is zero. For completeness, this Appendix briefly summarizes the multivariate SN distribution introduced by Sahu et al. (2003), which is suitable for Bayesian inference because it is built using the conditioning method. For detailed discussions of the properties of the SN distribution, see Sahu et al. (2003).

Let \(\varvec{Y}\) be a k-dimensional random vector with location vector \(\varvec{\mu }\), \(k \times k\) positive definite (diagonal) covariance matrix \(\varvec{\varSigma }\) and \(k \times k\) diagonal skewness matrix \(\varvec{\varGamma }=\text {diag}(\delta _1, \delta _2,\ldots , \delta _k)\). Then \({\varvec{Y}}\) follows a k-variate SN distribution if its pdf is given by

$$\begin{aligned} f({\varvec{y}}|\varvec{\mu },\varvec{\varSigma },\varvec{\varGamma })= 2^k|{\varvec{A}}|^{-1/2}\phi _k\{{\varvec{A}}^{-1/2}({\varvec{y}}-\varvec{\mu })\} P({\varvec{V}}>\mathbf{0 }), \end{aligned}$$
(A.2)

where \( {\varvec{V}} \sim N_k\{\varvec{\varGamma }{\varvec{A}}^{-1}({\varvec{y}}-\varvec{\mu }), {\varvec{I}}_k-\varvec{\varGamma } {\varvec{A}}^{-1}\varvec{\varGamma }\}\), and \(\phi _k(\cdot )\) is the pdf of \(N_k(\mathbf{0 },{\varvec{I}}_k)\). We denote the above distribution by \(SN_k (\varvec{\mu },\varvec{\varSigma },\varvec{\varGamma })\). An appealing feature of Eq. (A.2) is that it gives independent marginals when \(\varvec{\varSigma }=\text {diag}(\sigma ^2_1, \sigma ^2_2,\ldots , \sigma ^2_k)\). In that case the pdf (A.2) simplifies to

$$\begin{aligned} f({\varvec{y}}|\varvec{\mu },\varvec{\varSigma },\varvec{\varGamma })=\mathop \prod \nolimits _{i=1} ^{k}\left[ \frac{2}{\sqrt{\sigma ^2_i+\delta ^2_i}}\phi \left\{ \frac{y_i-\mu _i}{\sqrt{\sigma ^2_i+\delta ^2_i}}\right\} \varPhi \left\{ \frac{\delta _i}{\sigma _i}\frac{y_i-\mu _i}{\sqrt{\sigma ^2_i+\delta ^2_i}}\right\} \right] ,\nonumber \\ \end{aligned}$$
(A.3)

where \(\phi (\cdot )\) and \(\varPhi (\cdot )\) are the pdf and cdf of the standard normal distribution, respectively. The mean and covariance matrix are given by

$$\begin{aligned} E({\varvec{Y}}) = \varvec{\mu }+\sqrt{2/\pi }\varvec{\delta }, ~~Cov({\varvec{Y}})=\varvec{\varSigma }+(1-2/\pi )\varvec{\varGamma }^2. \end{aligned}$$
(A.4)

Note that when \(\varvec{\delta }=\mathbf{0 }\), the SN distribution reduces to the usual normal distribution. To obtain a zero mean vector, it follows from (A.4) that the location parameter must be set to \(\varvec{\mu }=-\sqrt{2/\pi }\varvec{\delta }\).
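As an illustrative numerical check (not part of the original appendix), the following Python sketch evaluates the univariate case \(k=1\) of the density (A.3) and compares numerically integrated moments with (A.4). The function names `sn_pdf` and `sn_moments_numeric`, the trapezoidal rule and the parameter values are assumptions chosen only for illustration.

```python
import math

def sn_pdf(y, mu, sigma2, delta):
    """Univariate (k = 1) factor of the SN density in Eq. (A.3)."""
    s = math.sqrt(sigma2 + delta ** 2)              # sqrt(sigma^2 + delta^2)
    z = (y - mu) / s
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    # Standard normal cdf evaluated at (delta / sigma) * z, written with erf.
    arg = (delta / math.sqrt(sigma2)) * z
    Phi = 0.5 * (1.0 + math.erf(arg / math.sqrt(2.0)))
    return 2.0 / s * phi * Phi

def sn_moments_numeric(mu, sigma2, delta, half_width=12.0, n=20001):
    """Trapezoidal approximation of total mass, mean and variance."""
    s = math.sqrt(sigma2 + delta ** 2)
    lo, hi = mu - half_width * s, mu + half_width * s
    h = (hi - lo) / (n - 1)
    mass = mean = second = 0.0
    for j in range(n):
        y = lo + j * h
        w = h * (0.5 if j in (0, n - 1) else 1.0)   # trapezoid end-point weights
        f = sn_pdf(y, mu, sigma2, delta)
        mass += w * f
        mean += w * y * f
        second += w * y * y * f
    return mass, mean, second - mean ** 2

# Illustrative (assumed) parameter values.
mu, sigma2, delta = 1.0, 0.5, 2.0
mass, mean, var = sn_moments_numeric(mu, sigma2, delta)

# Closed forms from Eq. (A.4) for k = 1.
mean_a4 = mu + math.sqrt(2.0 / math.pi) * delta
var_a4 = sigma2 + (1.0 - 2.0 / math.pi) * delta ** 2
print(mass, mean, mean_a4, var, var_a4)
```

Under these assumptions, the integrated mass should be close to one and the numerical mean and variance should agree with (A.4) up to quadrature error.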

According to Arellano-Valle et al. (2007), if \({\varvec{Y}}\) follows \(SN_k(\varvec{\mu },\varvec{\varSigma },\varvec{\varGamma })\), it admits the following convenient stochastic representation:

$$\begin{aligned} {\varvec{Y}}=\varvec{\mu }+\varvec{\varGamma }|{\varvec{X}}_0|+\varvec{\varSigma }^{1/2}{\varvec{X}}_1, \end{aligned}$$
(A.5)

where \({\varvec{X}}_0\) and \({\varvec{X}}_1\) are two independent \(N_k(\mathbf{0 },{\varvec{I}}_k)\) random vectors. Let \({\varvec{w}}=|{\varvec{X}}_0|\); then, \({\varvec{w}}\) follows a k-dimensional standard normal distribution \(N_k(\mathbf{0 },{\varvec{I}}_k)\) truncated in the space \({\varvec{w}}>\mathbf{0 }\). Thus, a two-level hierarchical representation of (A.5) is given by

$$\begin{aligned} {\varvec{Y}}|{\varvec{w}} \sim N_k(\varvec{\mu }+\varvec{\varGamma } {\varvec{w}}, \varvec{\varSigma }),\; {\varvec{w}} \sim N_k(\mathbf{0 },{\varvec{I}}_k){\varvec{I}}({\varvec{w}}>\mathbf{0 }). \end{aligned}$$
(A.6)

Note that when \(\varvec{\varGamma }=\varvec{0}\), the hierarchical expression (A.6) presented for the SN distribution \(SN_k(\varvec{\mu },\varvec{\varSigma },\varvec{\varGamma })\) reduces to its counterpart for the normal distribution \(N_k(\varvec{\mu },\varvec{\varSigma })\).
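The hierarchical form (A.6) translates directly into a two-step sampler. The sketch below is a minimal illustration that assumes a diagonal \(\varvec{\varSigma }\), so the components can be drawn one at a time; the helper name `sample_sn`, the seed and the parameter values are hypothetical. It compares Monte Carlo moments with (A.4).

```python
import math
import random

def sample_sn(mu, sigma2, delta, rng):
    """Draw one vector from SN_k(mu, diag(sigma2), diag(delta)) via Eq. (A.6)."""
    y = []
    for m, s2, d in zip(mu, sigma2, delta):
        # w_i: standard normal truncated to w_i > 0 (half-normal).
        w = abs(rng.gauss(0.0, 1.0))
        # Y_i | w_i ~ N(mu_i + delta_i * w_i, sigma_i^2).
        y.append(rng.gauss(m + d * w, math.sqrt(s2)))
    return y

rng = random.Random(2020)                 # fixed seed for reproducibility
mu = [0.0, 1.0]                           # illustrative (assumed) values
sigma2 = [1.0, 0.25]
delta = [1.5, -2.0]
n = 50000
draws = [sample_sn(mu, sigma2, delta, rng) for _ in range(n)]

k = len(mu)
mean_hat = [sum(row[i] for row in draws) / n for i in range(k)]
var_hat = [sum((row[i] - mean_hat[i]) ** 2 for row in draws) / (n - 1)
           for i in range(k)]

# Theoretical moments from Eq. (A.4).
mean_a4 = [m + math.sqrt(2.0 / math.pi) * d for m, d in zip(mu, delta)]
var_a4 = [s2 + (1.0 - 2.0 / math.pi) * d ** 2 for s2, d in zip(sigma2, delta)]
print(mean_hat, mean_a4)
print(var_hat, var_a4)
```

Conditioning on \({\varvec{w}}\) in this way is what makes the SN model convenient for Gibbs-type samplers: given \({\varvec{w}}\), every update is an ordinary normal one.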

A.2: Asymmetric Laplace distribution

The asymmetric Laplace distribution (ALD), which is closely related to the check function for quantile regression (QR), has been discussed in the literature (Geraci and Bottai 2007; Koenker and Machado 1999; Yu and Moyeed 2001; Yu and Zhang 2005). A random variable Y is said to follow an ALD with parameters \(\mu ,\sigma \) and \(\tau \) if its probability density function (pdf) is given by

$$\begin{aligned} f(y|\mu ,\sigma ,\tau )=\frac{\tau (1-\tau )}{\sigma }\exp \left\{ -\rho _{\tau }\left( \frac{y-\mu }{\sigma }\right) \right\} , \end{aligned}$$
(A.7)

where \(\rho _{\tau }(u)=u(\tau -I(u<0))\) is the check function, \(I(\cdot )\) is the indicator function, \(0<\tau <1\) is the skewness parameter, \(\sigma >0\) is the scale parameter and \(-\infty<\mu <\infty \) is the location parameter. The range of y is \((-\infty ,~\infty )\). We denote the above distribution by ALD\((\mu ,\sigma ,\tau )\). Note that the check function \(\rho _{\tau }(\cdot )\) assigns weight \(\tau \) or \(1-\tau \) to observations greater or less than \(\mu \), respectively, and that \(Pr(y\le \mu )=\tau \). Therefore, the distribution splits at the location parameter \(\mu \) into two parts, one with probability \(\tau \) to the left and one with probability \(1-\tau \) to the right. That is, ALD\((\mu ,\sigma ,\tau )\) is skewed to the left when \(\tau >1/2\) and skewed to the right when \(\tau <1/2\). When \(\tau =1/2\), ALD\((\mu ,\sigma ,\tau )\) reduces to the usual Laplace (double exponential, or symmetric Laplace) distribution, which has pdf

$$\begin{aligned} f(y|\mu ,\sigma ,1/2)=\frac{1}{4\sigma }\exp \left\{ -\frac{|y-\mu |}{2\sigma }\right\} . \end{aligned}$$
(A.8)

If \(Y \sim \) ALD\((\mu ,\sigma ,\tau )\), then \(Pr(y\le \mu )=\tau \) and \(Pr(y>\mu )=1-\tau \), so the location parameter \(\mu \) is the \(\tau \)th quantile of the distribution. This important feature of the ALD has been widely adopted for quantile inference (Geraci and Bottai 2007; Yu and Moyeed 2001; Yu et al. 2003) and has made it more popular than other asymmetric Laplace distributions (Johnson et al. 1995; Kotz et al. 2002). See Yu and Zhang (2005) for further properties and generalizations of this distribution. It can be shown that the mean and variance of Y are given by

$$\begin{aligned} E(Y)=\mu +[\sigma (1{-}2\tau )]/[\tau (1-\tau )], ~~Var(Y)=[\sigma ^2(1{-}2\tau {+}2\tau ^2)]/[\tau ^2(1{-}\tau )^2].\nonumber \\ \end{aligned}$$
(A.9)
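The quantile property and the moments in (A.9) can be checked numerically from the density (A.7). The following sketch is only an illustration: the helper names `check_loss`, `ald_pdf` and `integrate`, the midpoint rule, the truncation of the integration range and the parameter values are all assumptions. The integral is split at \(\mu \), where the density has a kink.

```python
import math

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def ald_pdf(y, mu, sigma, tau):
    """ALD density of Eq. (A.7)."""
    return tau * (1.0 - tau) / sigma * math.exp(-check_loss((y - mu) / sigma, tau))

def integrate(f, lo, hi, n=50000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (j + 0.5) * h) for j in range(n))

# Illustrative (assumed) parameter values.
mu, sigma, tau = 2.0, 1.5, 0.25

# Truncate each tail where the density has decayed by a factor exp(-40).
lo = mu - 40.0 * sigma / (1.0 - tau)
hi = mu + 40.0 * sigma / tau

def moment(p):
    """p-th raw moment, integrating each side of the kink at mu separately."""
    g = lambda y: (y ** p) * ald_pdf(y, mu, sigma, tau)
    return integrate(g, lo, mu) + integrate(g, mu, hi)

left_mass = integrate(lambda y: ald_pdf(y, mu, sigma, tau), lo, mu)  # Pr(Y <= mu)
mass = moment(0)
mean = moment(1)
var = moment(2) - mean ** 2

# Closed forms from Eq. (A.9).
mean_a9 = mu + sigma * (1.0 - 2.0 * tau) / (tau * (1.0 - tau))
var_a9 = sigma ** 2 * (1.0 - 2.0 * tau + 2.0 * tau ** 2) / (tau ** 2 * (1.0 - tau) ** 2)
print(left_mass, mass, mean, mean_a9, var, var_a9)
```

For these values (A.9) gives a mean of 6 and a variance of 40, and the mass to the left of \(\mu \) should be close to \(\tau =0.25\).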

However, the ALD density is not differentiable at \(\mu \), which makes its likelihood function difficult to maximize directly. Fortunately, as shown by Kotz et al. (2001) and Kozumi and Kobayashi (2011), the ALD has various mixture representations. To develop Bayesian sampling algorithms for the QR model, we utilize a hierarchical mixture of exponential and normal distributions (Kotz et al. 2001; Kozumi and Kobayashi 2011). If \(Y \sim \) ALD\((\mu ,\sigma ,\tau )\), then Y admits the following mixture representation:

$$\begin{aligned} Y=\mu +\vartheta _1 {X}_1+\sqrt{\vartheta _2\sigma X_1} X_2, \end{aligned}$$
(A.10)

where \(X_1\) and \(X_2\) are mutually independent, \(X_1 \sim Exp(\frac{1}{\sigma })\) with mean \(\sigma \) and \(X_2\sim N(0,1)\), \(\vartheta _1=(1-2\tau )/[\tau (1-\tau )]\) and \(\vartheta _2=2/[\tau (1-\tau )]\). This representation expresses the ALD as a conditionally normal distribution, which is smooth, and has been extensively used in recent studies (Kobayashi and Kozumi 2012; Kozumi and Kobayashi 2011; Reich et al. 2012). Thus, a two-level hierarchical representation of (A.10) is given by

$$\begin{aligned} Y|X_1 \sim N(\mu + \vartheta _1 X_1, \vartheta _2\sigma X_1),\; X_1 \sim Exp\left( \frac{1}{\sigma }\right) . \end{aligned}$$
(A.11)
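A minimal sampling sketch based on (A.10) and (A.11) is given below. It is an illustration only: the function name `sample_ald`, the seed, the sample size and the parameter values are assumptions. The draws are used to check that roughly a fraction \(\tau \) of them fall at or below \(\mu \), and that the sample mean and variance are close to (A.9).

```python
import math
import random

def sample_ald(mu, sigma, tau, rng):
    """Draw from ALD(mu, sigma, tau) via the exponential-normal mixture (A.11)."""
    theta1 = (1.0 - 2.0 * tau) / (tau * (1.0 - tau))
    theta2 = 2.0 / (tau * (1.0 - tau))
    # expovariate takes the rate, so rate 1/sigma gives mean sigma.
    x1 = rng.expovariate(1.0 / sigma)
    # Y | X1 ~ N(mu + theta1 * X1, theta2 * sigma * X1).
    return rng.gauss(mu + theta1 * x1, math.sqrt(theta2 * sigma * x1))

rng = random.Random(2020)                 # fixed seed for reproducibility
mu, sigma, tau = 2.0, 1.5, 0.25           # illustrative (assumed) values
n = 100000
ys = [sample_ald(mu, sigma, tau, rng) for _ in range(n)]

frac_below = sum(1 for y in ys if y <= mu) / n        # estimates Pr(Y <= mu)
mean_hat = sum(ys) / n
var_hat = sum((y - mean_hat) ** 2 for y in ys) / (n - 1)

# Theoretical values from Eq. (A.9).
mean_a9 = mu + sigma * (1.0 - 2.0 * tau) / (tau * (1.0 - tau))
var_a9 = sigma ** 2 * (1.0 - 2.0 * tau + 2.0 * tau ** 2) / (tau ** 2 * (1.0 - tau) ** 2)
print(frac_below, mean_hat, mean_a9, var_hat, var_a9)
```

In an MCMC scheme the latent \(X_1\) would be sampled alongside the model parameters rather than integrated out; the sketch above only simulates forward from the mixture.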

A.3: Relationship between nonlinear quantile regression and ALD

Let \(y_i\) and \(\varvec{x}_i\) denote the outcome of interest and the corresponding covariate vector for subject i (\(i=1,\ldots ,n\)), where the \(y_i\) are independent scalar observations of a continuous random variable with common cumulative distribution function (cdf) \(F_{y_i}(\cdot )\). The \(\tau \)th nonlinear QR model for the response \(y_i\) given \(\varvec{x}_i\) takes the form

$$\begin{aligned} Q_{y_i}(\tau |\varvec{x}_i) =g(\varvec{x}_i, \varvec{\beta }), \end{aligned}$$
(A.12)

where \(Q_{y_i}(\cdot )\equiv F^{-1}_{y_i}(\cdot ) \) is the inverse of the cdf of \(y_i\) given \(\varvec{x}_i\) evaluated at \(\tau \) with \(0<\tau <1\), and \(g(\cdot )\) is a known nonlinear function. The nonlinear regression coefficient vector \(\varvec{\beta }\) is estimated by minimizing

$$\begin{aligned} \sum _{i=1}^n \rho _{\tau }\left( y_i-g(\varvec{x}_i, \varvec{\beta })\right) , \end{aligned}$$
(A.13)

where \(\rho _{\tau }(\cdot )\) is the check function defined by \(\rho _{\tau }(u)=u({\tau -I(u<0)})\) and \(I(\cdot )\) denotes the indicator function. To highlight the dependence on \(\tau \), the parameter vector \(\varvec{\beta }\) should be indexed by \(\tau \) (i.e., \(\varvec{\beta }(\tau )\)). For the sake of simplicity, however, we omit this notation in the remainder of the paper. The check function is closely related to the ALD; see Koenker and Machado (1999), Yu and Moyeed (2001) and Yu and Stander (2007) for details. The density function of an ALD, denoted by ALD(\(\mu , \sigma , \tau \)), is briefly discussed in Appendix A.2. Treating \(\sigma \) as a nuisance parameter, it can be easily shown that minimizing Eq. (A.13) with respect to \(\varvec{\beta }\) is exactly equivalent to maximizing the likelihood function obtained by assuming that each \(y_i\) follows an ALD(\(\mu _i, \sigma ,\tau \)) with \(\mu _i=g(\varvec{x}_i, \varvec{\beta })\). Indeed, because \(\rho _{\tau }(u/\sigma )=\rho _{\tau }(u)/\sigma \) for \(\sigma >0\), the ALD log-likelihood equals \(n\log \{\tau (1-\tau )/\sigma \}-\sigma ^{-1}\sum _{i=1}^n \rho _{\tau }(y_i-g(\varvec{x}_i, \varvec{\beta }))\), which for fixed \(\sigma \) is a decreasing function of (A.13).
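The equivalence between minimizing (A.13) and maximizing an ALD likelihood can be illustrated on simulated data. In the sketch below, the curve \(g(x,\beta )=\exp (\beta x)\), the normal noise shifted so that its \(\tau \)th quantile is zero, the fixed value of \(\sigma \) and the grid search are all hypothetical choices made for illustration; they are not the models or algorithms of the paper.

```python
import math
import random
from statistics import NormalDist

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def g(x, beta):
    """Hypothetical nonlinear quantile curve, used only for illustration."""
    return math.exp(beta * x)

def total_check_loss(beta, xs, ys, tau):
    """Objective of Eq. (A.13)."""
    return sum(check_loss(y - g(x, beta), tau) for x, y in zip(xs, ys))

def ald_neg_loglik(beta, sigma, xs, ys, tau):
    """Negative log-likelihood under y_i ~ ALD(g(x_i, beta), sigma, tau)."""
    n = len(ys)
    return (n * math.log(sigma / (tau * (1.0 - tau)))
            + total_check_loss(beta, xs, ys, tau) / sigma)

rng = random.Random(2020)                      # fixed seed for reproducibility
tau, beta_true, noise_sd = 0.25, 0.8, 0.2      # illustrative (assumed) values
z_tau = NormalDist().inv_cdf(tau)              # tau-th standard normal quantile
xs = [rng.uniform(0.0, 2.0) for _ in range(300)]
# Shift the noise so that its tau-th quantile is zero; then Q(tau | x) = g(x, beta_true).
ys = [g(x, beta_true) + noise_sd * (rng.gauss(0.0, 1.0) - z_tau) for x in xs]

step = 0.002
grid = [j * step for j in range(1001)]         # candidate beta values in [0, 2]
beta_check = min(grid, key=lambda b: total_check_loss(b, xs, ys, tau))
beta_ald = min(grid, key=lambda b: ald_neg_loglik(b, 1.7, xs, ys, tau))
print(beta_check, beta_ald)
```

Because the negative log-likelihood is an increasing affine function of the check loss for any fixed \(\sigma \) (here arbitrarily 1.7), both criteria select the same grid point, which should lie near the assumed true value 0.8.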

The relationship between the check function and the ALD can be used to reformulate the QR method in the likelihood framework. By utilizing this property, a large number of QR-based statistical models and associated analysis methods have been investigated in the literature for independent data. For example, Koenker and Machado (1999) proposed a likelihood-based goodness-of-fit test for QR, Yu and Moyeed (2001) developed Bayesian QR, and Yu and Stander (2007) and Kozumi and Kobayashi (2011) studied Bayesian estimation procedures for the Tobit QR model with censored data. More recently, QR-based linear mixed-effects models have been considered via different methods for longitudinal data (Farcomeni 2012; Geraci and Bottai 2007; Kim and Yang 2012; Koenker 2004; Lipsitz et al. 1997; Liu and Bottai 2009; Wang and Fygenson 2009; Yuan and Yin 2010).

Appendix B: R and WinBUGS program codes for Model SN

figure a
figure b

Cite this article

Zhang, H., Huang, Y. Quantile regression-based Bayesian joint modeling analysis of longitudinal–survival data, with application to an AIDS cohort study. Lifetime Data Anal 26, 339–368 (2020). https://doi.org/10.1007/s10985-019-09478-w

  • DOI: https://doi.org/10.1007/s10985-019-09478-w
