Publicly available. Published by De Gruyter, September 13, 2017

Erratum to: A Conditional Randomization Test for Covariate Imbalance

  • Jonathan Hennessy, Tirthankar Dasgupta, Luke Miratrix, Cassandra Pattanayak and Pradipta Sarkar

Erratum to: Hennessy J, Dasgupta T, Miratrix L, Pattanayak C, Sarkar P. A conditional randomization test to account for covariate imbalance in randomized experiments. J Causal Inference 2016;4(1):61–80 (https://doi.org/10.1515/jci-2015-0018).

There was an error in [1] and we are very grateful to Peng Ding for pointing it out.

Proposition 1, restated below, is incorrect, our proof being a mis-application of a result from [2].

Proposition 1. Let $X$ denote a categorical covariate with $J$ levels, observed after a two-armed randomized experiment is conducted with $N$ units. Let $N_j$ denote the observed number of units that belong to stratum $j$, and let $N_{Tj}$ and $N_{Cj}$ denote the number of units assigned to treatment and control, respectively, in stratum $j$, such that $N_{Tj} + N_{Cj} = N_j$ and $\sum_{j=1}^{J} N_j = N$. Then the conditional randomization test using the simple difference test statistic $\hat{\tau}_{sd} = \bar{Y}_T^{obs} - \bar{Y}_C^{obs}$ and the balance function $(N_{T1}, \ldots, N_{TJ})$ is equivalent to the conditional randomization test using the composite test statistic $\hat{\tau}_{ps} = \sum_{j=1}^{J} \frac{N_j}{N} \hat{\tau}_{sd,j}$, where $\hat{\tau}_{sd,j}$ denotes the simple difference test statistic for the $j$th stratum.

We can show that the proposition is not true by a simple counterexample. For the two conditional tests to be equivalent, they must yield the same p-values. Consider the situation where $N = 5$, $X = (1, 1, 1, 2, 2)$, $w = (1, 0, 0, 1, 0)$, and $y^{obs} = (1.13, 0.49, -0.31, 0.98, 1.68)$. In this case, $\hat{\tau}_{sd} = 0.435$ and $\hat{\tau}_{ps} = 0.344$. To find the p-values for the conditional test, we consider the values of the test statistics across all 6 alternative randomizations where $N_{T1} = 1$ and $N_{T2} = 1$ (Table 1).

To calculate the p-values, we find the proportion of test statistics as or more extreme than the observed. For $\hat{\tau}_{sd} = 0.435$, there are three test statistics as large or larger (0.435, 0.485, 1.018), so the 2-sided p-value is $2 \cdot 3/6 = 1$. For $\hat{\tau}_{ps} = 0.344$, there are two test statistics as large or larger (0.344, 0.904), so the 2-sided p-value is $2 \cdot 2/6 = 2/3$. Since the p-values do not agree, the tests are not equivalent.
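The counterexample can be reproduced by brute-force enumeration. The sketch below is illustrative only: the function names are hypothetical, the third outcome is taken as -0.31 (the value that reproduces the reported 0.435 and 0.344), and the 2-sided p-value follows the convention used above, namely twice the proportion of statistics as large or larger than the observed one, capped at 1.

```python
from fractions import Fraction
from itertools import combinations, product

# Data from the counterexample (third outcome assumed to be -0.31).
x = [1, 1, 1, 2, 2]
y = [1.13, 0.49, -0.31, 0.98, 1.68]
w_obs = [1, 0, 0, 1, 0]


def mean(values):
    return sum(values) / len(values)


def tau_sd(w, y):
    """Simple difference in means between treated and control units."""
    treated = [yi for wi, yi in zip(w, y) if wi == 1]
    control = [yi for wi, yi in zip(w, y) if wi == 0]
    return mean(treated) - mean(control)


def tau_ps(w, y, x):
    """Stratum-size-weighted sum of within-stratum simple differences."""
    n = len(y)
    total = 0.0
    for level in sorted(set(x)):
        idx = [i for i, xi in enumerate(x) if xi == level]
        total += len(idx) / n * tau_sd([w[i] for i in idx], [y[i] for i in idx])
    return total


def assignments_given_balance(x, w_obs):
    """All assignments with the observed number of treated units per stratum."""
    per_stratum = []
    for level in sorted(set(x)):
        idx = [i for i, xi in enumerate(x) if xi == level]
        n_treated = sum(w_obs[i] for i in idx)
        per_stratum.append(list(combinations(idx, n_treated)))
    for choice in product(*per_stratum):
        treated = {i for group in choice for i in group}
        yield [1 if i in treated else 0 for i in range(len(x))]


def two_sided_p(values, observed):
    """Twice the proportion of values >= observed, capped at 1."""
    count = sum(1 for v in values if v >= observed - 1e-12)
    return min(Fraction(1), Fraction(2 * count, len(values)))


assignments = list(assignments_given_balance(x, w_obs))
sd_values = [tau_sd(w, y) for w in assignments]
ps_values = [tau_ps(w, y, x) for w in assignments]

p_sd = two_sided_p(sd_values, tau_sd(w_obs, y))
p_ps = two_sided_p(ps_values, tau_ps(w_obs, y, x))
print(p_sd, p_ps)
```

Because the outcomes are held fixed across assignments, this enumeration corresponds to testing the sharp null hypothesis of no treatment effect.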

The incorrect proof in Appendix A mis-applied a result from [2], which showed that, in a linear regression of the response on the treatment indicator and covariates, a conditional randomization test based on the treatment indicator coefficient is equivalent to the conditional randomization test based on the simple difference test statistic. This proof rests on the fact that the columns corresponding to the covariates are fixed across randomizations. While our $\hat{\tau}_{ps}$ does equal a regression coefficient, that regression includes interactions between the treatment indicator and the covariates. These interaction terms are not fixed across the different randomizations, and we ignored this fact in the proof. In the proof in Appendix A, we incorrectly assumed that $k_1 = w^T F (F^T F)^{-1} F^T y^{obs}$ is a constant. While $w^T F (F^T F)^{-1}$ is a constant, $F^T y^{obs}$ is not.[1]

Note that the main results and conclusions regarding conditional randomization tests from [1] do not depend on the proposition. The proposition was only used in the simulation study to reduce the number of tests to be compared. Rather than reporting the conditional tests using both $\hat{\tau}_{sd}$ and $\hat{\tau}_{ps}$, we only reported results using $\hat{\tau}_{sd}$. However, in the specific simulation setting we explored, $\hat{\tau}_{ps}$ is, in fact, a monotonic function of $\hat{\tau}_{sd}$ when conditioning on the observed balance, because there are two strata of equal size and the treated and control groups are of equal size. In this situation, the conditional tests using $\hat{\tau}_{sd}$ and $\hat{\tau}_{ps}$ are equivalent. We verify this fact below. For this situation, our test statistics can be expanded as

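Table 1

Alternative randomizations: for each alternative randomization where $N_{T1} = 1$ and $N_{T2} = 1$, we calculate both test statistics.

| Randomization | $\hat{\tau}_{sd}$ | $\hat{\tau}_{ps}$ |
|---|---|---|
| (1, 0, 0, 1, 0) | 0.435 | 0.344 |
| (0, 1, 0, 1, 0) | -0.098 | -0.232 |
| (0, 0, 1, 1, 0) | -0.765 | -0.952 |
| (1, 0, 0, 0, 1) | 1.018 | 0.904 |
| (0, 1, 0, 0, 1) | 0.485 | 0.328 |
| (0, 0, 1, 0, 1) | -0.182 | -0.392 |

$$\hat{\tau}_{sd} = \left( \frac{N_{T1}}{N_T} \bar{Y}_{T1}^{obs} + \frac{N_{T2}}{N_T} \bar{Y}_{T2}^{obs} \right) - \left( \frac{N_{C1}}{N_C} \bar{Y}_{C1}^{obs} + \frac{N_{C2}}{N_C} \bar{Y}_{C2}^{obs} \right) = \frac{2}{N} \left( N_{T1} \bar{Y}_{T1}^{obs} + N_{T2} \bar{Y}_{T2}^{obs} - N_{C1} \bar{Y}_{C1}^{obs} - N_{C2} \bar{Y}_{C2}^{obs} \right)$$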

and

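$$\hat{\tau}_{ps} = \frac{N_1}{N} \left( \bar{Y}_{T1}^{obs} - \bar{Y}_{C1}^{obs} \right) + \frac{N_2}{N} \left( \bar{Y}_{T2}^{obs} - \bar{Y}_{C2}^{obs} \right) = \frac{1}{2} \left( \bar{Y}_{T1}^{obs} + \bar{Y}_{T2}^{obs} - \bar{Y}_{C1}^{obs} - \bar{Y}_{C2}^{obs} \right).$$

We then show that $\hat{\tau}_{ps}$ is a monotonic function of $\hat{\tau}_{sd}$ by showing that

$$\hat{\tau}_{ps} = \frac{N}{4 N_{T1} N_{C1}} \left( \frac{N}{4} \left( \hat{\tau}_{sd} + 2 \bar{Y}^{obs} \right) - N_{T1} \bar{Y}_{1}^{obs} - N_{T2} \bar{Y}_{2}^{obs} \right),$$

where $\bar{Y}^{obs}$ is the mean observed outcome over all units and $\bar{Y}_{j}^{obs}$ is the mean observed outcome in stratum $j$. Given the observed balance, every quantity on the right-hand side other than $\hat{\tau}_{sd}$ is fixed across randomizations.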

The proof is available upon request.
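For readers who want to check the identity, one possible derivation is sketched here. It is not necessarily the authors' argument, and it assumes only the stated setting: two strata with $N_1 = N_2 = N/2$, $N_T = N_C = N/2$, and $N_{T1}, N_{C1} > 0$.

```latex
\begin{align*}
% Equal arm sizes give \bar{Y}^{obs} = (\bar{Y}_T^{obs} + \bar{Y}_C^{obs})/2, hence
\tfrac{N}{4}\left(\hat{\tau}_{sd} + 2\bar{Y}^{obs}\right)
  &= \tfrac{N}{2}\,\bar{Y}_T^{obs}
   = N_{T1}\bar{Y}_{T1}^{obs} + N_{T2}\bar{Y}_{T2}^{obs}, \\
% Within stratum j, \bar{Y}_j^{obs} is a weighted mean of the two arm means, so
\bar{Y}_{Tj}^{obs} - \bar{Y}_{j}^{obs}
  &= \frac{N_{Cj}}{N_j}\left(\bar{Y}_{Tj}^{obs} - \bar{Y}_{Cj}^{obs}\right)
   = \frac{N_{Cj}}{N_j}\,\hat{\tau}_{sd,j}, \\
% Subtracting N_{T1}\bar{Y}_1^{obs} + N_{T2}\bar{Y}_2^{obs} from the first line:
\tfrac{N}{4}\left(\hat{\tau}_{sd} + 2\bar{Y}^{obs}\right)
  - \sum_{j=1}^{2} N_{Tj}\bar{Y}_{j}^{obs}
  &= \sum_{j=1}^{2} \frac{N_{Tj} N_{Cj}}{N_j}\,\hat{\tau}_{sd,j}, \\
% N_{T1} + N_{T2} = N/2 = N_{T1} + N_{C1} implies N_{T2} = N_{C1} and N_{C2} = N_{T1}, so
\sum_{j=1}^{2} \frac{N_{Tj} N_{Cj}}{N_j}\,\hat{\tau}_{sd,j}
  &= \frac{2 N_{T1} N_{C1}}{N}\left(\hat{\tau}_{sd,1} + \hat{\tau}_{sd,2}\right)
   = \frac{4 N_{T1} N_{C1}}{N}\,\hat{\tau}_{ps}.
\end{align*}
```

Dividing by $4 N_{T1} N_{C1} / N$ gives the stated expression. Since the coefficient on $\hat{\tau}_{sd}$ is positive and the remaining terms are fixed given the balance, $\hat{\tau}_{ps}$ is an increasing affine function of $\hat{\tau}_{sd}$ in this setting.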

If we were to include the conditional randomization test using $\hat{\tau}_{ps}$ in the simulation study, the results would be the same as those reported for the conditional randomization test using $\hat{\tau}_{sd}$. We leave a formal comparison of conditional randomization tests using $\hat{\tau}_{sd}$ and $\hat{\tau}_{ps}$ for future work.
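The equivalence in the balanced setting can also be checked numerically. The sketch below uses made-up outcomes (not the simulation data from [1]) with two strata of four units each and four treated units in total. It verifies the monotonic relation between the two statistics for every assignment with the observed balance, and compares the resulting p-values. All names and data values are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical balanced example: N = 8, N_1 = N_2 = 4, N_T = N_C = 4.
x = [1, 1, 1, 1, 2, 2, 2, 2]
y = [0.3, -1.2, 0.8, 2.1, 1.5, -0.4, 0.95, 0.25]
w_obs = [0, 0, 0, 1, 1, 1, 0, 1]  # N_T1 = 1, N_T2 = 3


def mean(values):
    return sum(values) / len(values)


def tau_sd(w, y):
    treated = [yi for wi, yi in zip(w, y) if wi == 1]
    control = [yi for wi, yi in zip(w, y) if wi == 0]
    return mean(treated) - mean(control)


def tau_ps(w, y, x):
    n = len(y)
    total = 0.0
    for level in sorted(set(x)):
        idx = [i for i, xi in enumerate(x) if xi == level]
        total += len(idx) / n * tau_sd([w[i] for i in idx], [y[i] for i in idx])
    return total


def tau_ps_from_sd(w, y, x):
    """Right-hand side of the monotonic relation for two equal strata."""
    n = len(y)
    s1 = [i for i in range(n) if x[i] == 1]
    s2 = [i for i in range(n) if x[i] == 2]
    n_t1 = sum(w[i] for i in s1)
    n_c1 = len(s1) - n_t1
    n_t2 = sum(w[i] for i in s2)
    inner = (n / 4) * (tau_sd(w, y) + 2 * mean(y))
    inner -= n_t1 * mean([y[i] for i in s1]) + n_t2 * mean([y[i] for i in s2])
    return n / (4 * n_t1 * n_c1) * inner


def assignments_given_balance(x, w_obs):
    per_stratum = []
    for level in sorted(set(x)):
        idx = [i for i, xi in enumerate(x) if xi == level]
        n_treated = sum(w_obs[i] for i in idx)
        per_stratum.append(list(combinations(idx, n_treated)))
    for choice in product(*per_stratum):
        treated = {i for group in choice for i in group}
        yield [1 if i in treated else 0 for i in range(len(x))]


def two_sided_p(values, observed):
    """Twice the proportion of values >= observed, capped at 1."""
    count = sum(1 for v in values if v >= observed - 1e-12)
    return min(Fraction(1), Fraction(2 * count, len(values)))


assignments = list(assignments_given_balance(x, w_obs))
max_gap = max(abs(tau_ps(w, y, x) - tau_ps_from_sd(w, y, x)) for w in assignments)
p_sd = two_sided_p([tau_sd(w, y) for w in assignments], tau_sd(w_obs, y))
p_ps = two_sided_p([tau_ps(w, y, x) for w in assignments], tau_ps(w_obs, y, x))
print(max_gap, p_sd, p_ps)
```

In this example the two p-values coincide, as expected when one statistic is an increasing function of the other over the conditional randomization set.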

References

1. Hennessy J, Dasgupta T, Miratrix L, Pattanayak C, Sarkar P. A conditional randomization test to account for covariate imbalance in randomized experiments. J Causal Inference 2016;4(1):61–80. https://doi.org/10.1515/jci-2015-0018

2. Rosenbaum PR. Conditional permutation tests and the propensity score in observational studies. J Am Stat Assoc 1984;79(387):565–74. https://doi.org/10.1080/01621459.1984.10478082

Published Online: 2017-9-13

© 2017 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 26.4.2024 from https://www.degruyter.com/document/doi/10.1515/jci-2017-0023/html