Permutation tests for general dependent truncation

https://doi.org/10.1016/j.csda.2018.07.012

Abstract

Truncated survival data arise when the event time is observed only if it falls within a subject-specific region, known as the truncation set. Left-truncated data arise when there is delayed entry into a study, such that subjects are included only if their event time exceeds some other time. Quasi-independence of truncation and failure refers to factorization of their joint density in the observable region. Under quasi-independence, standard methods for survival data such as the Kaplan–Meier estimator and Cox regression can be applied after simple adjustments to the risk sets. Unlike the requisite assumption of independent censoring, quasi-independence can be tested, e.g., using a conditional Kendall’s tau test. Current methods for testing for quasi-independence are powerful for monotone alternatives. Nonetheless, it is essential to detect any kind of deviation from quasi-independence so as not to report a biased Kaplan–Meier estimator or regression effect, which would arise from applying the simple risk set adjustment when dependence holds. Nonparametric, minimum p-value tests that are powerful against non-monotone alternatives are developed to offer protection against erroneous assumptions of quasi-independence. The use of conditional and unconditional methods of permutation for evaluation of the proposed tests is investigated in simulation studies. The proposed tests are applied to a study on the cognitive and functional decline in aging.

Introduction

Truncated survival data arise when the event time is observed only if it falls within a subject-specific region, known as the truncation set. They are frequently encountered in observational studies conducted in many fields, including biomedical science, astronomy, and social science. One such example is an integrated observational aging study in which a large number of cognitively normal older individuals were recruited across three independent observational cohort studies (Mormino et al., 2014). These individuals had a global Clinical Dementia Rating (CDR) of 0 at the baseline testing session. In one sub-study, researchers were interested in the association between the accumulation of abnormally folded beta-amyloid protein in the brain and the risk of cognitive decline. Participants were only included in the amyloid sub-study if they had beta-amyloid measured at or after the baseline testing session by a Positron Emission Tomography (PET) scan and remained at CDR of 0 at that time. Cognitive decline was measured by progression to a global CDR of 0.5, which is left-truncated by the time from baseline to the PET scan. Additionally, there was right censoring for subjects who did not progress by the end of follow-up.

Most existing methods for the analysis of truncated data assume quasi-independence between the truncation and event times (e.g., Turnbull (1976), Woodroofe (1985), Tsai et al. (1987), Lagakos et al. (1988), Wang (1991), Klein and Moeschberger (2003)). Quasi-independence refers to the factorization of the joint density of truncation and event within the observable region. This condition may not hold in many realistic settings. For example, PET scans for amyloid may be prioritized according to some assessment of the cognitive trajectory of the subject, which could lead to many possible monotone and non-monotone dependencies between time to PET scan and time to cognitive decline.
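In symbols (a standard formulation, as used in, e.g., Tsai, 1990), quasi-independence requires that the joint density of the truncation time T and event time X factor on the observable region {t ≤ x}:

```latex
k(t, x) \;=\; c \, f(t)\, g(x), \qquad t \le x,
```

where f and g play the role of density functions for T and X and c is a normalizing constant over the observable region; no factorization is required outside that region, which is what distinguishes quasi-independence from full independence.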

Quasi-independence of truncation and event times, unlike independence of censoring and event times, is a condition that can be tested with observed data. Tsai (1990) proposed a test of quasi-independence in the setting of left-truncation and right censoring based on a conditional Kendall’s tau. Martin and Betensky (2005) extended the method to the settings of double-truncation and interval censoring. Austin and Betensky (2014) proposed inverse probability weighted versions of the conditional Kendall’s tau estimators to remove the effects of censoring. Emura and Wang (2010) constructed a weighted log-rank type statistic, with optimal weights constructed from the odds ratio function considered in Chaieb et al. (2006). Chen et al. (1996) suggested a conditional version of Pearson’s product-moment correlation coefficient, but this does not accommodate right censoring. Jones and Crowley (1992) proposed a class of nonparametric tests under a proportional hazards assumption. These tests are powerful for monotone alternatives, but are not generally powerful for detecting non-monotone alternatives that are commonly encountered in real applications.
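To fix ideas, the conditional Kendall's tau underlying these tests can be sketched for uncensored left-truncated pairs; the censored versions restrict comparable pairs further. The function name and the brute-force O(n²) loop below are our own illustration, not the cited implementations:

```python
import numpy as np

def conditional_kendall_tau(T, Y):
    """Conditional Kendall's tau for left-truncated, uncensored pairs (T_i, Y_i).

    A pair (i, j) is 'comparable' when max(T_i, T_j) <= min(Y_i, Y_j), i.e.
    both orderings of the pair could have been observed under truncation.
    Returns the average concordance sign over comparable pairs.
    """
    T, Y = np.asarray(T, dtype=float), np.asarray(Y, dtype=float)
    n = len(T)
    num, n_comp = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            if max(T[i], T[j]) <= min(Y[i], Y[j]):  # comparable pair
                num += np.sign((T[i] - T[j]) * (Y[i] - Y[j]))
                n_comp += 1
    return num / n_comp if n_comp else 0.0
```

Concordant comparable pairs push the statistic toward +1 and discordant pairs toward -1, so values near 0 are consistent with quasi-independence; this is also why such tests have little power when positive and negative dependence cancel over the time axis, i.e., under non-monotone alternatives.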

Some recent proposals have attempted to accommodate non-monotone alternatives through local versions of Kendall’s tau tests. de Uña-Álvarez (2012) and Rodríguez-Girondo and de Uña-Álvarez (2012) proposed tests for the Markov assumption embedded in the illness-death model based on the maximum of a sequence of Kendall’s tau statistics over predetermined time grids. These tests are inefficient under heavy censoring or with small sample sizes. Rodríguez-Girondo and de Uña-Álvarez (2016) proposed a weighted version of the local Kendall’s tau to handle heavy censoring. All of these methods are computationally intensive, as they rely on one bootstrap to approximate the distribution of the local test statistics and a second bootstrap to evaluate the critical value of the maximum local test statistic. Furthermore, the performance of the tests depends on the pre-selected time grids. Nonetheless, this notion of optimizing over a sequence of local tests seems promising for achieving high power against general dependence alternatives, and is similar in flavor to the ideas suggested by Heller et al. (2012), Kaufman et al. (2013) and Heller et al. (2016).

Evaluation of tests of quasi-independence is challenging given the restriction associated with the observed data. Although analytical evaluation is possible for simple tests in the setting of one-sided truncation, it is not possible for more complex sampling restrictions or for maximized sequences of test statistics, such as nonparametric tests that have high power for non-monotone alternatives. Previous authors developed permutation algorithms that yield only pairs satisfying the truncation restriction (Tsai, 1990; Efron and Petrosian, 1992, 1994). This permutation method is valid, in that each permutation is exchangeable, though it is limited in that it holds fixed the size of each risk set, and thus is not useful for evaluating any statistic that is largely a function of risk-set size, such as the Kaplan–Meier estimator. Because of this property, we refer to this method as conditional permutation. An alternative method would simply permute all of the truncation times and retain only those pairs that satisfy the truncation restriction. Strictly speaking, this is not a valid procedure, in that the permutations have variable sample sizes and are not exchangeable, though in large enough samples and for standardized test statistics this may not be problematic. We refer to this method as unconditional permutation.
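The two schemes can be sketched as follows, assuming uncensored data. This is a simplified illustration: the random pairwise-swap loop stands in for the exact risk-set-preserving algorithms of Tsai (1990) and Efron and Petrosian (1992), and all names are our own:

```python
import numpy as np

def unconditional_permutation(T, Y, rng):
    """Unconditional scheme: shuffle all truncation times across subjects,
    then keep only the pairs that satisfy the truncation restriction T <= Y.
    The retained sample size varies from draw to draw."""
    T_perm = rng.permutation(np.asarray(T, dtype=float))
    Y = np.asarray(Y, dtype=float)
    keep = T_perm <= Y
    return T_perm[keep], Y[keep]

def conditional_permutation(T, Y, rng, n_swaps=500):
    """Conditional scheme (sketch): repeatedly propose swapping two subjects'
    truncation times, accepting a swap only if both new pairs remain in the
    observable region. Sample size and risk-set sizes are preserved."""
    T = np.array(T, dtype=float)  # working copy; original left untouched
    Y = np.asarray(Y, dtype=float)
    n = len(T)
    for _ in range(n_swaps):
        i, j = rng.integers(0, n, size=2)
        if T[i] <= Y[j] and T[j] <= Y[i]:  # swap keeps both pairs observable
            T[i], T[j] = T[j], T[i]
    return T, Y
```

The contrast is visible in the invariants: the conditional scheme returns the same multiset of truncation times and the same sample size every time, while the unconditional scheme returns a random subset.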

In this paper, we propose two minimum p-value tests of quasi-independence that have high power for non-monotone alternatives and are less computationally intensive than the double bootstrap approach used by de Uña-Álvarez (2012), Rodríguez-Girondo and de Uña-Álvarez (2012) and Rodríguez-Girondo and de Uña-Álvarez (2016). For evaluation of these tests, we investigate and compare the use of conditional and unconditional permutations. We conduct extensive simulation studies to examine the properties of these tests and apply the tests to assess quasi-independence in the cognitive and functional decline in an aging study. We developed a publicly available R package, permDep (Chiou, 2017a), which implements the proposed methods efficiently by embedding C/C++ subroutines for the computationally intensive components and by executing permutations in parallel.

Section snippets

Test statistics

In the setting of left-truncated and right-censored data, we let X denote the time to event (e.g., time from baseline testing to CDR progression), C denote the time to censoring (e.g., time from baseline testing to end of follow-up), and T denote the time to study entry (e.g., time from baseline testing to PET scan). Let Y = min(X, C) and Δ denote the observation time and censoring indicator, respectively, where Δ = I(X ≤ C) and I(·) is the indicator function. Due to truncation, Y is observable only if T ≤ Y
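In this notation, left-truncated right-censored data under quasi-independence can be simulated by rejection: draw latent (X, C, T) and keep a subject only when the truncation restriction holds. The exponential rates below are arbitrary choices for illustration:

```python
import numpy as np

def simulate_ltrc(n_target, rng):
    """Simulate left-truncated, right-censored data under quasi-independence.

    X: latent event time, C: latent censoring time, T: latent entry
    (truncation) time; the exponential scales are arbitrary. A draw is kept
    only when T <= Y = min(X, C), mirroring the truncation restriction.
    """
    T_obs, Y_obs, D_obs = [], [], []
    while len(Y_obs) < n_target:
        X = rng.exponential(1.0)   # time to event
        C = rng.exponential(2.0)   # time to censoring
        T = rng.exponential(0.5)   # time to study entry
        Y, delta = min(X, C), int(X <= C)
        if T <= Y:                 # subject enters the observed sample
            T_obs.append(T); Y_obs.append(Y); D_obs.append(delta)
    return np.array(T_obs), np.array(Y_obs), np.array(D_obs)
```

Because rejected draws are discarded, the observed (T, Y) pairs are a biased sample of the latent distribution; this is precisely the selection effect that risk-set adjustments correct for under quasi-independence.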

Permutation for evaluation of minimum p-value tests

Much of the literature on optimally selected statistics is focused on theoretical derivations of a correct p-value that accounts for the optimization. In many cases, this supremum statistic was shown to follow a Brownian Bridge process after transformation to a time scale indexed by the thresholding variable (e.g., Betensky and Rabinowitz (1999), Rabinowitz and Betensky (2000), Miller and Siegmund (1982)). This finding then enabled calculation of the corrected p-value through established
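When such analytical corrections are unavailable, the minimum p-value can instead be calibrated against its own permutation distribution. The following is a generic single-layer sketch in the spirit of min-p adjustment; the placeholder statistics and the plain permutation of T are our own simplifications, not the tests or permutation schemes proposed in the paper:

```python
import numpy as np

def min_p_permutation_test(T, Y, stats_fns, n_perm, rng):
    """Minimum p-value test calibrated by a single layer of permutation.

    `stats_fns` is a list of candidate statistics (e.g. local association
    measures at several thresholds -- hypothetical choices). Each permutation's
    statistics are reused twice: once to form marginal p-values, and once to
    build the null distribution of the minimum p-value itself, so no second
    (nested) resampling layer is needed.
    """
    T = np.asarray(T, dtype=float)
    Y = np.asarray(Y, dtype=float)
    obs = np.array([f(T, Y) for f in stats_fns])
    perm = np.empty((n_perm, len(stats_fns)))
    for b in range(n_perm):
        Tp = rng.permutation(T)  # placeholder permutation of truncation times
        perm[b] = [f(Tp, Y) for f in stats_fns]
    # observed minimum marginal p-value (two-sided, via absolute values)
    p_obs = np.mean(np.abs(perm) >= np.abs(obs), axis=0).min()
    # null distribution of the minimum p-value from the same permutations
    p_null = np.array([
        np.mean(np.abs(perm) >= np.abs(perm[b]), axis=0).min()
        for b in range(n_perm)
    ])
    return np.mean(p_null <= p_obs)
```

Reusing one set of permutations for both steps is what makes this cheaper than the double bootstrap discussed in the Introduction, at the cost of some correlation between the two uses.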

Simulation studies

We conducted two simulation studies under monotone dependence, and two under non-monotone dependence. Our objectives were to investigate the performance of the minimum p-value tests in both scenarios and to compare the conditional and unconditional methods of permutation. We implemented these tests in a publicly available R package permDep (Chiou, 2017a).

Cognitive and functional decline in aging study

In a study conducted by Mormino et al. (2014), cognitively normal older individuals were aggregated from three independent observational cohort studies: Alzheimer’s Disease Neuroimaging Initiative (ADNI), Australian Imaging Biomarkers and Lifestyle Study of Aging (AIBL), and Harvard Aging Brain Study (HABS). The AIBL study methodology has been reported previously (Ellis et al., 2009).

These individuals had a global Clinical Dementia Rating (CDR) of 0 at the baseline testing session. In one

Discussion

The goal of this paper was to propose and investigate tests of quasi-independence that are sensitive to non-monotone alternatives. These tests are needed given that non-monotone dependence structures exist in real studies and that it is necessary to establish quasi-independence before applying the straightforward risk-set adjusted analyses that assume it. We have proposed two such tests that are based on an optimal selection after examination of several thresholded versions of the data. These

Acknowledgments

This research was supported in part by the Harvard NeuroDiscovery Center, the Harvard Clinical and Translational Science Center NIH UL1 TR001102, NIH CA075971, NIH NS094610, and NIH NS048005, NIH P50AG005134, NIH P01AG036694 and K01 AG051718.

The ADNI was launched in 2003 as a public–private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET),

References (58)

  • Chaieb, L.L., et al. Estimating survival under a dependent truncation. Biometrika (2006).
  • Chen, C.H., et al. The product-moment correlation coefficient and linear regression for truncated data. J. Amer. Statist. Assoc. (1996).
  • Chen, Y., et al. Sequential Monte Carlo methods for permutation tests on truncated data. Statist. Sinica (2007).
  • Chiou, S.H., 2017a. permDep: Permutation tests for general dependent truncation. R package version 1.0-0....
  • Chiou, S.H., 2017b. tranSurv: Estimating a survival distribution in the presence of dependent left truncation and right...
  • Chiou, S.H., Austin, M., Qian, J., Betensky, R.A., 2018. Transformation model estimation of survival under dependent...
  • Dobler, D., et al. Non-strange weird resampling for complex survival data. Biometrika (2017).
  • Efron, B., et al. A simple test of independence for truncated data with applications to redshift surveys. Astrophys. J. (1992).
  • Efron, B., et al. Survival analysis of the gamma-ray burst data. J. Amer. Statist. Assoc. (1994).
  • Emura, T., 2018. depend.truncation: Statistical methods for the analysis of dependently truncated data. URL...
  • Emura, T., et al. An algorithm for estimating survival under a copula-based dependent truncation model. TEST (2015).
  • Emura, T., et al. Parametric likelihood inference and goodness-of-fit for dependently left-truncated data, a copula-based approach. Statist. Papers (2017).
  • Emura, T., et al. Semiparametric inference for an accelerated failure time model with dependent truncation. Ann. Inst. Statist. Math. (2016).
  • Emura, T., et al. Semi-parametric inference for copula models for truncated data. Statist. Sinica (2011).
  • Good, P. Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses (2013).
  • Gross, S.T., et al. Bootstrap methods for truncated and censored data. Statist. Sinica (1996).
  • Halpern, A.L. Minimally selected p and other tests for a single abrupt changepoint in a binary sequence. Biometrics (1999).
  • Heller, R., et al. A consistent multivariate test of association based on ranks of distances. Biometrika (2012).
  • Heller, R., et al. Consistent distribution-free K-sample and independence tests for univariate random variables. J. Mach. Learn. Res. (2016).
1 Present address: Department of Mathematical Sciences, The University of Texas at Dallas; 800 W. Campbell Road, Richardson, TX 75080, USA.

2 Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

3 Data used in the preparation of this article were obtained from the Australian Imaging Biomarkers and Lifestyle flagship study of aging (AIBL), funded by the Commonwealth Scientific and Industrial Research Organisation (CSIRO), which was made available at the ADNI database (www.loni.usc.edu/ADNI). The AIBL researchers contributed data but did not participate in analysis or writing of this report. AIBL researchers are listed at www.aibl.csiro.au.
