On the power of some sequential multiple testing procedures

Annals of the Institute of Statistical Mathematics

Abstract

We study an online multiple testing problem where the hypotheses arrive sequentially in a stream. The test statistics are independent and assumed to have the same distribution under their respective null hypotheses. We investigate two recently proposed procedures, LORD and LOND, which have been proved to control the FDR in an online manner. In a certain (static) model, we show that LORD is optimal in some asymptotic sense; in particular, it is as powerful as the (static) Benjamini–Hochberg procedure to first asymptotic order. We also quantify the performance of LOND. Some numerical experiments complement our theory.

Notes

  1. In this paper, the survival function of a random variable Y is defined as \(y \mapsto {\mathbb{P}} (Y \ge y)\).

  2. We note that this definition is different from that of Genovese and Wasserman (2002). According to our definition, the FNR is the expected fraction of non-nulls that are not correctly rejected out of all non-nulls, while according to the definition of Genovese and Wasserman (2002), the FNR is the fraction of non-nulls that are not rejected out of all non-rejections. (Both definitions are displayed in symbols after these notes.) We find our definition to be more appropriate in the asymptotic setting that we consider.

  3. By “static” we mean a setting where all the null hypotheses of interest are considered together. This is the more common setting considered in the multiple testing literature.
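
For reference, the two definitions in note 2 can be written in symbols as follows (a sketch in our own notation, which does not appear verbatim in the paper: \(F\) denotes the set of non-nulls, \(R\) the rejection set, and n the total number of hypotheses):

\[
\mathrm{FNR} = {\mathbb{E}}\left[\frac{|F \setminus R|}{|F| \vee 1}\right],
\qquad
\mathrm{FNR}_{\mathrm{GW}} = {\mathbb{E}}\left[\frac{|F \setminus R|}{(n - |R|) \vee 1}\right].
\]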

References

  • Aharoni, E., Rosset, S. (2014). Generalized \(\alpha \)-investing: Definitions, optimality results and application to public databases. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(4), 771–794.

  • Arias-Castro, E., Chen, S. (2017). Distribution-free multiple testing. Electronic Journal of Statistics, 11(1), 1983–2001.

  • Bartroff, J. (2014). Multiple hypothesis tests controlling generalized error rates for sequential data. arXiv preprint arXiv:1406.5933.

  • Bartroff, J., Song, J. (2013). Sequential tests of multiple hypotheses controlling false discovery and nondiscovery rates. arXiv preprint arXiv:1311.3350.

  • Bartroff, J., Song, J. (2014). Sequential tests of multiple hypotheses controlling type I and II familywise error rates. Journal of Statistical Planning and Inference, 153, 100–114.

  • Benjamini, Y., Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57(1), 289–300.

  • Bogdan, M., Chakrabarti, A., Frommlet, F., Ghosh, J. K. (2011). Asymptotic Bayes-optimality under sparsity of some multiple testing procedures. The Annals of Statistics, 39, 1551–1579.

  • Butucea, C., Ndaoud, M., Stepanova, N. A., Tsybakov, A. B. (2018). Variable selection with Hamming loss. The Annals of Statistics, 46(5), 1837–1875.

  • Dickhaus, T. (2014). Simultaneous statistical inference. Germany: Springer.

  • Donoho, D., Jin, J. (2004). Higher criticism for detecting sparse heterogeneous mixtures. The Annals of Statistics, 32(3), 962–994.

  • Dudoit, S., van der Laan, M. J. (2007). Multiple testing procedures with applications to genomics. New York: Springer.

  • Fithian, W., Taylor, J., Tibshirani, R., Tibshirani, R. (2015). Selective sequential model selection. arXiv preprint arXiv:1512.02565.

  • Foster, D. P., Stine, R. A. (2008). \(\alpha \)-investing: A procedure for sequential control of expected false discoveries. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(2), 429–444.

  • Foygel-Barber, R., Candès, E. J. (2015). Controlling the false discovery rate via knockoffs. The Annals of Statistics, 43(5), 2055–2085.

  • Genovese, C., Wasserman, L. (2002). Operating characteristics and extensions of the false discovery rate procedure. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(3), 499–517.

  • G’Sell, M. G., Wager, S., Chouldechova, A., Tibshirani, R. (2016). Sequential selection procedures and false discovery rate control. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78(2), 423–444.

  • Ingster, Y. I. (1997). Some problems of hypothesis testing leading to infinitely divisible distributions. Mathematical Methods of Statistics, 6(1), 47–69.

  • Ingster, Y. I., Suslina, I. A. (2003). Nonparametric goodness-of-fit testing under Gaussian models. Lecture notes in statistics, Vol. 169. New York: Springer.

  • Javanmard, A., Montanari, A. (2015). On online control of false discovery rate. arXiv preprint arXiv:1502.06197.

  • Javanmard, A., Montanari, A. (2018). Online rules for control of false discovery rate and false discovery exceedance. The Annals of Statistics, 46(2), 526–554.

  • Ji, P., Jin, J. (2012). UPS delivers optimal phase diagram in high-dimensional variable selection. The Annals of Statistics, 40(1), 73–103.

  • Jin, J., Ke, Z. T. (2016). Rare and weak effects in large-scale inference: Methods and phase diagrams. Statistica Sinica, 26, 1–34.

  • Lei, L., Fithian, W. (2016). Power of ordered hypothesis testing. In International conference on machine learning (pp. 2924–2932).

  • Li, A., Barber, R. F. (2017). Accumulation tests for FDR control in ordered hypothesis testing. Journal of the American Statistical Association, 112(518), 837–849.

  • Meinshausen, N., Maathuis, M. H., Bühlmann, P. (2011). Asymptotic optimality of the Westfall–Young permutation procedure for multiple testing under dependence. The Annals of Statistics, 39(6), 3369–3391.

  • Neuvial, P., Roquain, E. (2012). On false discovery rate thresholding for classification under sparsity. The Annals of Statistics, 40(5), 2572–2600.

  • Ramdas, A., Yang, F., Wainwright, M. J., Jordan, M. I. (2017). Online control of the false discovery rate with decaying memory. In Advances in neural information processing systems (pp. 5650–5659).

  • Ramdas, A., Zrnic, T., Wainwright, M., Jordan, M. (2018). SAFFRON: An adaptive algorithm for online control of the false discovery rate. In International conference on machine learning (pp. 4283–4291).

  • Roquain, E. (2011). Type I error rate control in multiple testing: A survey with proofs. Journal de la Société Française de Statistique, 152(2), 3–38.

  • Storey, J. D. (2002). A direct approach to false discovery rates. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(3), 479–498.

  • Storey, J. D. (2007). The optimal discovery procedure: A new approach to simultaneous significance testing. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(3), 347–368.

  • Sun, W., Cai, T. T. (2007). Oracle and adaptive compound decision rules for false discovery rate control. Journal of the American Statistical Association, 102(479), 901–912.

Acknowledgements

We would like to thank Jiaqi Gu from the Department of Computer Science, University of California, Los Angeles, for his help with the numerical experiments in Sect. 5. This work was partially supported by grants from the US National Science Foundation (DMS 1223137).

Author information

Corresponding author: Ery Arias-Castro.

Appendix: Simulations with varying number of hypotheses

In this second set of experiments, we examine the performance of the same methods as the number of hypotheses, n, varies.

1.1 FNR of LORD with a fixed level

In this subsection, we present numerical experiments meant to illustrate the theoretical results we derived about the asymptotic FNR of LORD. We fix \(q = 0.1\) and choose a few values of the parameter \(\beta \) so as to exhibit different sparsity levels, while the parameter r takes values in a grid spanning [0, 1.5]. We plot the average FNP of the LORD procedure for different \(n \in \{10^6, 10^7, 10^8, 10^9\}\). Each situation is repeated 200 times, and the simulation results are reported in Figs. 9 and 10. We observe that in the normal model, when \(r>\beta \), the FNP decreases as n gets larger. In the double-exponential model, as n increases, the FNP transition lines move closer to the theoretical threshold \(r = \beta \), especially when \(\beta = 0.7\).
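
For concreteness, the following is a minimal sketch of how one such experiment can be set up (in Python; names such as simulate_fnp are hypothetical, and this is not the code used for the figures). It assumes the usual sparse-mixture parameterization, with non-null fraction \(n^{-\beta }\) and signal strength \(\sqrt{2r\log n}\) in the normal model, and it uses the LOND level schedule \(\alpha _t = q\,\gamma _t\,(D(t-1)+1)\) of Javanmard and Montanari (2018) because it is simpler to state than LORD's; the choice of the \(\gamma \) sequence below is also an assumption.

```python
import numpy as np
from scipy.stats import norm


def lond(pvals, q, gamma):
    """Online testing with the LOND schedule: alpha_t = q * gamma_t * (D(t-1) + 1)."""
    rejections = np.zeros(len(pvals), dtype=bool)
    discoveries = 0
    for t, p in enumerate(pvals):
        alpha_t = q * gamma[t] * (discoveries + 1)
        if p <= alpha_t:
            rejections[t] = True
            discoveries += 1
    return rejections


def simulate_fnp(n, beta, r, q, reps=20, seed=0):
    """Average FNP over `reps` replicates of the (assumed) sparse normal mixture:
    non-null fraction eps = n**(-beta), signal strength mu = sqrt(2 * r * log n)."""
    rng = np.random.default_rng(seed)
    eps, mu = n ** (-beta), np.sqrt(2 * r * np.log(n))
    t = np.arange(1, n + 1)
    gamma = 1.0 / (t * np.log(t + 1) ** 2)  # a generic summable sequence (assumption)
    gamma /= gamma.sum()
    fnps = []
    for _ in range(reps):
        is_alt = rng.random(n) < eps              # which hypotheses are non-null
        z = rng.standard_normal(n) + mu * is_alt  # test statistics in the stream
        rej = lond(norm.sf(z), q, gamma)          # one-sided p-values fed to LOND
        fnps.append((is_alt & ~rej).sum() / max(is_alt.sum(), 1))
    return float(np.mean(fnps))


# One point of the FNP curve at q = 0.1 (smaller n than in the figures, for runtime).
print(simulate_fnp(n=10**5, beta=0.4, r=0.9, q=0.1))
```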

Fig. 9

Simulation results showing the FNP for LORD under the normal model in three distinct sparsity regimes with different test sizes. The black vertical line delineates the theoretical threshold (\(r=\beta \))

Fig. 10

Simulation results showing the FNP for LORD under the double-exponential model in three distinct sparsity regimes with different test sizes. The black vertical line delineates the theoretical threshold (\(r=\beta \))

1.2 Varying level

Here, we explore the effect of letting the desired FDR control level q tend to 0 as n increases, in accordance with (11). Specifically, we set \(q = q_n = 1/\log n\) and choose n on a logarithmic scale, \(n \in \{10^6, 10^7, 10^8, 10^9\}\). Each time, we fix a value of \((\beta , r)\) such that \(r > \beta \).
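
As a rough illustration of this scaling, the snippet below reuses the hypothetical simulate_fnp helper sketched in the previous subsection, with the first parameter setting described next and smaller values of n than in the figures, to keep the runtime modest.

```python
import numpy as np

# Assumes simulate_fnp from the sketch in Sect. 1.1 of this appendix.
for n in [10**4, 10**5, 10**6]:
    q_n = 1 / np.log(n)  # desired FDR control level q_n = 1/log n
    fnp = simulate_fnp(n, beta=0.4, r=0.9, q=q_n)
    print(f"n = {n:>8d}, q_n = {q_n:.3f}, average FNP = {fnp:.3f}")
```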

In the first setting, we set \((\beta , r) = (0.4, 0.9)\) for the normal model and \((\beta , r) = (0.4, 0.7)\) for the double-exponential model. The simulation results are reported in Figs. 11 and 12. We see that, in both models, the risks of the two procedures decrease to zero as the test size gets larger. LORD clearly dominates LOND in terms of FNP. Both methods have an FDP much lower than the level \(q_n\); in particular, LOND is very conservative.

Fig. 11

FDP and FNP for the LORD and LOND methods under the normal model with \((\beta , r) = (0.4, 0.9)\) and varying test size n. The black line delineates the desired FDR control level (\(q = q_n \))

Fig. 12

FDP and FNP for the LORD and LOND methods under the double-exponential model with \((\beta , r) = (0.4, 0.7)\) and varying test size n. The black line delineates the desired FDR control level (\(q = q_n \))

Fig. 13

FDP and FNP for the LORD and LOND methods under the normal model with \((\beta , r) = (0.7, 1.5)\) and varying test size n. The black line delineates the desired FDR control level (\(q = q_n \))

Fig. 14

FDP and FNP for the LORD and LOND methods under the double-exponential model with \((\beta , r) = (0.7, 0.9)\) and varying test size n

In the second setting, we set \((\beta , r) = (0.7, 1.5)\) for the normal model and \((\beta , r) = (0.7, 0.9)\) for the double-exponential model. The simulation results are reported in Figs. 13 and 14. In this sparser regime, we see that although LORD still dominates, the difference in FNP between the two methods is much smaller than in the denser regime, especially in the double-exponential model. Both methods have an FDP lower than the level \(q_n\); in particular, LOND is very conservative.

About this article

Cite this article

Chen, S., Arias-Castro, E. On the power of some sequential multiple testing procedures. Ann Inst Stat Math 73, 311–336 (2021). https://doi.org/10.1007/s10463-020-00752-5
