Abstract
We study an online multiple testing problem where the hypotheses arrive sequentially in a stream. The test statistics are independent and assumed to have the same distribution under their respective null hypotheses. We investigate two recently proposed procedures, LORD and LOND, which have been shown to control the FDR in an online manner. In a certain (static) model, we show that LORD is optimal in an asymptotic sense; in particular, it is as powerful as the (static) Benjamini–Hochberg procedure to first asymptotic order. We also quantify the performance of LOND. Numerical experiments complement our theory.
Notes
In this paper, the survival function of a random variable Y is defined as \(y \mapsto {\mathbb{P}} (Y \ge y)\).
We note that this definition is different from that of Genovese and Wasserman (2002). According to our definition, the FNR is the expected fraction of non-nulls that are not correctly rejected out of all non-nulls, while according to the definition of Genovese and Wasserman (2002), the FNR is the fraction of non-nulls that are not rejected out of all non-rejections. We find our definition to be more appropriate in the asymptotic setting that we consider.
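In symbols, writing \(\mathcal{R}\) for the set of rejected hypotheses and \(\mathcal{H}_1\) for the set of non-nulls (notation introduced here only for illustration), the definition used in this paper reads

```latex
\mathrm{FNR}
  = \mathbb{E}\!\left[ \frac{|\mathcal{H}_1 \setminus \mathcal{R}|}
                            {|\mathcal{H}_1| \vee 1} \right],
\qquad\text{whereas Genovese and Wasserman (2002) normalize by the
number of non-rejections, } |\mathcal{H}_1 \setminus \mathcal{R}| \,/\,
\bigl((n - |\mathcal{R}|) \vee 1\bigr) \text{ inside the expectation.}
```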
By “static” we mean a setting where all the null hypotheses of interest are considered together. This is the more common setting considered in the multiple testing literature.
References
Aharoni, E., Rosset, S. (2014). Generalized \(\alpha \)-investing: Definitions, optimality results and application to public databases. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(4), 771–794.
Arias-Castro, E., Chen, S. (2017). Distribution-free multiple testing. Electronic Journal of Statistics, 11(1), 1983–2001.
Bartroff, J. (2014). Multiple hypothesis tests controlling generalized error rates for sequential data. arXiv preprint arXiv:1406.5933.
Bartroff, J., Song, J. (2013). Sequential tests of multiple hypotheses controlling false discovery and nondiscovery rates. arXiv preprint arXiv:1311.3350.
Bartroff, J., Song, J. (2014). Sequential tests of multiple hypotheses controlling type I and II familywise error rates. Journal of Statistical Planning and Inference, 153, 100–114.
Benjamini, Y., Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B (Methodological), 57(1), 289–300.
Bogdan, M., Chakrabarti, A., Frommlet, F., Ghosh, J. K. (2011). Asymptotic Bayes-optimality under sparsity of some multiple testing procedures. The Annals of Statistics, 39, 1551–1579.
Butucea, C., Ndaoud, M., Stepanova, N. A., Tsybakov, A. B. (2018). Variable selection with Hamming loss. The Annals of Statistics, 46(5), 1837–1875.
Dickhaus, T. (2014). Simultaneous statistical inference. Germany: Springer.
Donoho, D., Jin, J. (2004). Higher criticism for detecting sparse heterogeneous mixtures. The Annals of Statistics, 32(3), 962–994.
Dudoit, S., van der Laan, M. J. (2007). Multiple testing procedures with applications to genomics. New York: Springer.
Fithian, W., Taylor, J., Tibshirani, R., Tibshirani, R. (2015). Selective sequential model selection. arXiv preprint arXiv:1512.02565.
Foster, D. P., Stine, R. A. (2008). \(\alpha \)-investing: A procedure for sequential control of expected false discoveries. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(2), 429–444.
Foygel-Barber, R., Candès, E. J. (2015). Controlling the false discovery rate via knockoffs. The Annals of Statistics, 43(5), 2055–2085.
Genovese, C., Wasserman, L. (2002). Operating characteristics and extensions of the false discovery rate procedure. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(3), 499–517.
G’Sell, M. G., Wager, S., Chouldechova, A., Tibshirani, R. (2016). Sequential selection procedures and false discovery rate control. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78(2), 423–444.
Ingster, Y. I. (1997). Some problems of hypothesis testing leading to infinitely divisible distributions. Mathematical Methods of Statistics, 6(1), 47–69.
Ingster, Y. I., Suslina, I. A. (2003). Nonparametric goodness-of-fit testing under Gaussian models. Lecture notes in statistics, Vol. 169. New York: Springer.
Javanmard, A., Montanari, A. (2015). On online control of false discovery rate. arXiv preprint arXiv:1502.06197.
Javanmard, A., Montanari, A. (2018). Online rules for control of false discovery rate and false discovery exceedance. The Annals of Statistics, 46(2), 526–554.
Ji, P., Jin, J. (2012). UPS delivers optimal phase diagram in high-dimensional variable selection. The Annals of Statistics, 40(1), 73–103.
Jin, J., Ke, Z. T. (2016). Rare and weak effects in large-scale inference: Methods and phase diagrams. Statistica Sinica, 26, 1–34.
Lei, L., Fithian, W. (2016). Power of ordered hypothesis testing. In International conference on machine learning (pp. 2924–2932).
Li, A., Barber, R. F. (2017). Accumulation tests for FDR control in ordered hypothesis testing. Journal of the American Statistical Association, 112(518), 837–849.
Meinshausen, N., Maathuis, M. H., Bühlmann, P. (2011). Asymptotic optimality of the Westfall–Young permutation procedure for multiple testing under dependence. The Annals of Statistics, 39(6), 3369–3391.
Neuvial, P., Roquain, E. (2012). On false discovery rate thresholding for classification under sparsity. The Annals of Statistics, 40(5), 2572–2600.
Ramdas, A., Yang, F., Wainwright, M. J., Jordan, M. I. (2017). Online control of the false discovery rate with decaying memory. In Advances in neural information processing systems (pp. 5650–5659).
Ramdas, A., Zrnic, T., Wainwright, M., Jordan, M. (2018). SAFFRON: An adaptive algorithm for online control of the false discovery rate. In International conference on machine learning (pp. 4283–4291).
Roquain, E. (2011). Type I error rate control in multiple testing: A survey with proofs. Journal de la Société Française de Statistique, 152(2), 3–38.
Storey, J. D. (2002). A direct approach to false discovery rates. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(3), 479–498.
Storey, J. D. (2007). The optimal discovery procedure: A new approach to simultaneous significance testing. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(3), 347–368.
Sun, W., Cai, T. T. (2007). Oracle and adaptive compound decision rules for false discovery rate control. Journal of the American Statistical Association, 102(479), 901–912.
Acknowledgements
We would like to thank Jiaqi Gu from the Department of Computer Science, University of California, Los Angeles, for his help with the numerical experiments in Sect. 5. This work was partially supported by grants from the US National Science Foundation (DMS 1223137).
Appendix: Simulations with varying number of hypotheses
In this second set of experiments, we examine the performance of the same methods as the number of hypotheses, n, varies.
1.1 FNR of LORD with a fixed level
In this subsection, we present numerical experiments that illustrate our theoretical results on the asymptotic FNR of LORD. We fix \(q = 0.1\) and choose a few values of the parameter \(\beta \) so as to exhibit different sparsity levels, while the parameter r ranges over a grid spanning [0, 1.5]. We plot the average FNP of the LORD procedure for \(n \in \{10^6, 10^7, 10^8, 10^9\}\). The simulation results are reported in Figs. 9 and 10. Each situation is repeated 200 times. We observe that, in the normal model, when \(r>\beta \) the FNP decreases as n grows. In the double-exponential model, as n increases, the FNP transition lines approach the theoretical threshold \(r = \beta \), especially when \(\beta = 0.7\).
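To make the setup concrete, the following is a minimal sketch (not the exact code used for Figs. 9 and 10) of one run of a LORD-type procedure on the sparse normal mixture. We implement one common LORD variant, with threshold \(\alpha_t = \gamma_t w_0 + q \sum_j \gamma_{t - \tau_j}\) over past rejection times \(\tau_j\); the discount sequence \(\gamma\) is a simple normalized power law rather than the exact choice of Javanmard and Montanari, and n is kept far smaller than in the paper for speed.

```python
import math
import numpy as np

def gamma_seq(n):
    # A nonnegative discount sequence summing to 1. (Any sequence with
    # sum(gamma) <= 1 is admissible; this power law is a simple stand-in.)
    g = 1.0 / np.arange(1, n + 1) ** 1.1
    return g / g.sum()

def lord(pvals, q, w0=None):
    # One common LORD variant: at time t, test p_t at level
    # alpha_t = gamma_t * w0 + q * sum_j gamma_{t - tau_j},
    # where tau_1 < tau_2 < ... are the times of past rejections.
    n = len(pvals)
    g = gamma_seq(n)
    w0 = q / 2 if w0 is None else w0
    rejected = np.zeros(n, dtype=bool)
    taus = []  # 0-indexed past rejection times
    for t in range(n):
        alpha_t = g[t] * w0 + q * sum(g[t - tau - 1] for tau in taus)
        if pvals[t] <= alpha_t:
            rejected[t] = True
            taus.append(t)
    return rejected

# Sparse normal mixture: n^(1-beta) non-nulls with mean sqrt(2 r log n).
rng = np.random.default_rng(0)
n, beta, r, q = 20000, 0.5, 1.0, 0.1
mu = math.sqrt(2 * r * math.log(n))
nonnull = np.zeros(n, dtype=bool)
nonnull[rng.choice(n, size=int(round(n ** (1 - beta))), replace=False)] = True
x = rng.standard_normal(n) + mu * nonnull
p = np.array([0.5 * math.erfc(v / math.sqrt(2)) for v in x])  # one-sided p-values

rej = lord(p, q)
fnp = float(np.mean(~rej[nonnull]))                  # missed non-nulls / non-nulls
fdp = float((rej & ~nonnull).sum() / max(rej.sum(), 1))
```

Averaging `fnp` over repeated draws gives the FNP curves plotted in the figures; since \(r > \beta\) here, the average FNP should shrink as n grows.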
1.2 Varying level
Here, we explore the effect of letting the desired FDR control level q tend to 0 as n increases, in accordance with (11). Specifically, we set \(q = q_n = 1/\log n\). We choose n on a log scale, specifically \(n \in \{10^6, 10^7, 10^8, 10^9\}\). Each time, we fix a value of \((\beta , r)\) such that \(r > \beta \).
In the first setting, we set \((\beta , r) = (0.4, 0.9)\) for the normal model and \((\beta , r) = (0.4, 0.7)\) for the double-exponential model. The simulation results are reported in Figs. 11 and 12. We see that, in both models, the risks of the two procedures decrease to zero as the test size grows. LORD clearly dominates LOND in terms of FNP. Both methods have FDP much lower than the level \(q_n\); in particular, LOND is very conservative.
In the second setting, we set \((\beta , r) = (0.7, 1.5)\) for the normal model and \((\beta , r) = (0.7, 0.9)\) for the double-exponential model. The simulation results are reported in Figs. 13 and 14. In this sparser regime, although LORD still dominates, the difference in FNP between the two methods is much smaller than in the dense regime, especially in the double-exponential model. Again, both methods have FDP lower than the level \(q_n\), and LOND in particular is very conservative.
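As a small self-contained illustration of this varying-level setup (not the exact code behind Figs. 11–14), the sketch below runs LOND, whose threshold is \(\alpha_t = q\,\gamma_t\,(D(t-1)+1)\) with \(D(t-1)\) the number of discoveries so far, on the sparse normal mixture with the shrinking level \(q_n = 1/\log n\). The sample sizes are kept small for speed, and \(\gamma\) is a simple normalized power law rather than the exact sequence in the cited papers.

```python
import math
import numpy as np

def lond(pvals, q):
    # LOND: test p_t at level alpha_t = q * gamma_t * (D(t-1) + 1),
    # where D(t-1) counts discoveries so far and sum(gamma) <= 1.
    n = len(pvals)
    g = 1.0 / np.arange(1, n + 1) ** 1.1
    g /= g.sum()
    rejected = np.zeros(n, dtype=bool)
    d = 0
    for t in range(n):
        if pvals[t] <= q * g[t] * (d + 1):
            rejected[t] = True
            d += 1
    return rejected

def run(n, beta, r, rng):
    q = 1.0 / math.log(n)  # shrinking level q_n = 1/log n, as in (11)
    mu = math.sqrt(2 * r * math.log(n))
    nonnull = np.zeros(n, dtype=bool)
    nonnull[rng.choice(n, int(round(n ** (1 - beta))), replace=False)] = True
    x = rng.standard_normal(n) + mu * nonnull
    p = np.array([0.5 * math.erfc(v / math.sqrt(2)) for v in x])
    rej = lond(p, q)
    fnp = float(np.mean(~rej[nonnull]))
    fdp = float((rej & ~nonnull).sum() / max(rej.sum(), 1))
    return fnp, fdp

rng = np.random.default_rng(1)
results = {n: run(n, beta=0.4, r=0.9, rng=rng) for n in (2000, 20000)}
```

Since \(r > \beta\) here, averaging over repetitions should show the FNP drifting toward 0 as n grows, though at these small sample sizes only a downward trend on average is to be expected.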
Chen, S., Arias-Castro, E. On the power of some sequential multiple testing procedures. Ann Inst Stat Math 73, 311–336 (2021). https://doi.org/10.1007/s10463-020-00752-5