
False discovery rate for functional data

  • Original Paper, published in TEST

Abstract

Since Benjamini and Hochberg introduced the false discovery rate (FDR) in their seminal paper, it has become a very popular approach to the multiple comparisons problem. An increasingly popular topic within functional data analysis is local inference, i.e. the continuous statistical testing of a null hypothesis along the domain. The principal issue in this setting is the infinite number of tested hypotheses, which can be seen as an extreme case of the multiple comparisons problem. In this paper, we define and discuss the notion of FDR in a very general functional data setting. Moreover, a continuous version of the Benjamini–Hochberg procedure is introduced, along with a definition of the adjusted p value function. Some general conditions are stated under which the functional Benjamini–Hochberg procedure provides control of the functional FDR. Two different simulation studies are presented: the first has a one-dimensional domain and includes a comparison with another state-of-the-art method; the second has a planar two-dimensional domain. Finally, the proposed method is applied to satellite measurements of Earth temperature, with the aim of identifying the regions of the planet where temperature has significantly increased over the last decades. After adjustment, large areas remain significant.


Notes

  1. http://eosweb.larc.nasa.gov.

References

  • Abramowicz K, Häger CK, Pini A, Schelin L, Sjöstedt de Luna S, Vantini S (2018) Nonparametric inference for functional-on-scalar linear models applied to knee kinematic hop data after injury of the anterior cruciate ligament. Scand J Stat 45(4):1036–1061

  • Benjamini Y, Heller R (2007) False discovery rates for spatial signals. J Am Stat Assoc 102(480):1272–1281

  • Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B (Methodol) 57:289–300

  • Benjamini Y, Hochberg Y (1997) Multiple hypotheses testing with weights. Scand J Stat 24(3):407–418

  • Benjamini Y, Yekutieli D (2001) The control of the false discovery rate in multiple testing under dependency. Ann Stat 29:1165–1188

  • Berlinet A, Thomas-Agnan C (2011) Reproducing kernel Hilbert spaces in probability and statistics. Springer, Berlin

  • Blanchard G, Delattre S, Roquain E (2014) Testing over a continuum of null hypotheses with false discovery rate control. Bernoulli 20(1):304–333

  • Cheng D, Schwartzman A (2017) Multiple testing of local maxima for detection of peaks in random fields. Ann Stat 45(2):529–556

  • Efron B, Tibshirani R, Storey JD, Tusher V (2001) Empirical Bayes analysis of a microarray experiment. J Am Stat Assoc 96(456):1151–1160

  • Freedman D, Lane D (1983) A nonstochastic interpretation of reported significance levels. J Bus Econ Stat 1(4):292–298

  • Heesen P, Janssen A (2015) Inequalities for the false discovery rate (FDR) under dependence. Electron J Stat 9(1):679–716

  • Holmes AP, Blair R, Watson J, Ford I (1996) Nonparametric analysis of statistic images from functional mapping experiments. J Cerebral Blood Flow Metab 16(1):7–22

  • Horváth L, Kokoszka P (2012) Inference for functional data with applications. Springer, Berlin

  • Perone Pacifico M, Genovese C, Verdinelli I, Wasserman L (2004) False discovery control for random fields. J Am Stat Assoc 99(468):1002–1014

  • Pini A, Vantini S (2017) Interval-wise testing for functional data. J Nonparametric Stat 29(2):407–424

  • Ramsay JO, Silverman BW (2005) Functional data analysis, 2nd edn. Springer, Berlin

  • Schwartzman A, Gavrilov Y, Adler RJ (2011) Multiple testing of local maxima for detection of peaks in 1D. Ann Stat 39(6):3290

  • Storey JD (2003) The positive false discovery rate: a Bayesian interpretation and the q-value. Ann Stat 31(6):2013–2035

  • Sun W, Reich BJ, Tony Cai T, Guindani M, Schwartzman A (2015) False discovery control in large-scale spatial multiple testing. J R Stat Soc Ser B (Stat Methodol) 77(1):59–83

  • White H, Domowitz I (1984) Nonlinear regression with dependent observations. Econometrica 52(1):143–161

  • Winkler AM, Ridgway GR, Webster MA, Smith SM, Nichols TE (2014) Permutation inference for the general linear model. Neuroimage 92:381–397

  • Zeileis A (2004) Econometric computing with HC and HAC covariance matrix estimators. Research Report Series 10, Department of Statistics and Mathematics, WU Vienna University of Economics and Business, Vienna

Corresponding author

Correspondence to Niels Lundtorp Olsen.

Appendices

Appendix

A Proofs

Proof of Proposition 3.4 and Proposition 3.6

1.1 A.1 Assumptions

We begin by repeating the assumptions of Propositions 3.4 and 3.6.

Definition

(PRDS) Let ‘\(\le \)’ be the usual ordering on \({\mathbb {R}}^l\). An increasing set \(D \subseteq {\mathbb {R}}^l\) is a set satisfying \(x \in D \wedge y \ge x \Rightarrow y \in D\).

A random variable \({\mathbf {X}}\) on \({\mathbb {R}}^l\) is said to be PRDS on \(I_0 \subseteq \{1, \dots , l\}\) if, for any increasing set D and any \(i \in I_0\), it holds that

$$\begin{aligned} x \le y \Rightarrow P({\mathbf {X}} \in D | X_i = x) \le P({\mathbf {X}} \in D | X_i = y) \end{aligned}$$

Let \({\mathbf {Z}}\) be an infinite-dimensional random variable, where instances of \({\mathbf {Z}}\) are functions \(T \rightarrow {\mathbb {R}}\). We say that \({\mathbf {Z}}\) is PRDS on \(U \subseteq T\) if all finite-dimensional distributions of \({\mathbf {Z}}\) are PRDS. That is, for all finite subsets \(I = \{i_1, \dots , i_l \} \subseteq T\), the vector \((Z(i_1), \dots , Z(i_l))\) is PRDS on the index set \(\{j : i_j \in U\}\).
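As a concrete illustration of the PRDS property (our sketch, not from the paper): for a bivariate Gaussian with nonnegative correlation and an increasing set such as \(D = \{y : y_2 \ge c\}\), the conditional probability of landing in D is nondecreasing in the conditioning value. The values of rho and c below are arbitrary choices for the illustration.

```python
import math

def norm_sf(z):
    """Survival function of the standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def cond_prob_in_D(x, rho=0.5, c=1.0):
    """P(X in D | X1 = x) for the increasing set D = {y : y2 >= c},
    where (X1, X2) is bivariate standard normal with correlation rho.
    Conditionally, X2 | X1 = x ~ N(rho * x, 1 - rho^2)."""
    return norm_sf((c - rho * x) / math.sqrt(1 - rho ** 2))

# The conditional probability is nondecreasing in x, consistent with PRDS
probs = [cond_prob_in_D(x) for x in (-1.0, 0.0, 1.0, 2.0)]
```

For negative rho the same probabilities would be decreasing in x, and PRDS on that coordinate fails; positively correlated Gaussian test statistics are a standard example satisfying the PRDS condition.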

Let \(\{S_k\}_{k=1}^\infty \), \(S_1 \subset S_2 \subset \dots \), be a dense, uniform sequence of grids in \(\mathrm {{\mathbf {T}}}\), in the sense that \(S_k\) uniformly approximates all level sets of p and \(p|_U\) with probability one.

For Proposition 3.4, this amounts to

$$\begin{aligned} P \left[ \lim _{k \rightarrow \infty } \sup _r \left| \frac{\#( S_k \cap \{s: p(s) \le r \})}{\# S_k} - \mu \{s: p(s) \le r \} \right| = 0 \right] = 1 \end{aligned}$$
(9)

and

$$\begin{aligned} P \left[ \lim _{k \rightarrow \infty } \sup _r \left| \frac{\#( S_k \cap \{s: p(s) \le r \} \cap U)}{\# S_k} - \mu (\{s: p(s) \le r \} \cap U) \right| = 0 \right] = 1 \end{aligned}$$
(10)

whereas for Proposition 3.6, we need the density function f:

$$\begin{aligned} P \left[ \lim _{k \rightarrow \infty } \sup _r \left| \frac{\sum _{i \in S_k \cap \{s: p(s) \le r \}} f(i) }{\# S_k} - \int _{\{s: p(s) \le r \}} f(x) \, \mathrm {d}x \right| = 0 \right] = 1 \end{aligned}$$

and

$$\begin{aligned} P \left[ \lim _{k \rightarrow \infty } \sup _r \left| \frac{\sum _{i \in S_k \cap \{s: p(s) \le r \} \cap U} f(i) }{\# S_k} - \int _{\{s: p(s) \le r \} \cap U} f(x) \, \mathrm {d}x \right| = 0 \right] = 1 \end{aligned}$$

Furthermore, assume that p is PRDS wrt the set of true null hypotheses and that the following assumptions about the p value function hold with probability one:

  1. (a1)

    All level sets of p have zero measure,

    $$\begin{aligned} \mu \{s: p(s) = t \} = 0 \quad \forall t \in [0,1] \end{aligned}$$
  2. (a2)

    \(\alpha ^* \in (0,\alpha ] \Rightarrow \) for any open neighbourhood O around \(\alpha ^*\) there exist \(s_1, s_2 \in O\) s.t. \( a(s_1) > \alpha ^{-1} s_1, a(s_2) < \alpha ^{-1} s_2\), where a is the cumulated p value function (Definition 3.2).

  3. (a3)

    \([\alpha ^* = 0] \Rightarrow \min p(t) > 0\).

1.2 A.2 Proof details

For ease of presentation, we will only consider Proposition 3.4 and furthermore assume that \(\mu (\mathrm {{\mathbf {T}}}) = 1\); the latter can be done without loss of generality. The proof of Proposition 3.6 is analogous but notationally tedious, as the counts are replaced by sums and the measures by integrals.

Let \(a_k\) be the cumulated p value function for the k’th iteration of the BH procedure, where \(N_k := 1/\# S_k\):

$$\begin{aligned} a_k(t) := N_k \# \{ s \in S_k : p(s) \le t\} \end{aligned}$$

and define the k’th step false discovery proportion \(Q_k\) by applying the (usual) BH procedure at level \(\alpha \) to p evaluated in \(S_k\):

$$\begin{aligned} Q_k = \frac{\#(\{t \in S_k : p(t) \le b_k\} \cap U)}{\#\{t \in S_k : p(t) \le b_k\}}, \quad b_k = \max \left\{ r : \frac{\# \{s \in S_k : p(s) \le r \}}{\#S_k} \ge \alpha ^{-1} r \right\} \end{aligned}$$

or equivalently \(b_k = \max \{t : a_k(t) \ge \alpha ^{-1} t\}\).
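To fix ideas, the discretised quantities \(a_k\) and \(b_k\) can be computed directly. The following Python sketch is our illustration, not the authors' code; it assumes \(N_k = 1/\#S_k\) and uses a hypothetical step-shaped p value function on \(T = [0,1]\):

```python
import numpy as np

def bh_threshold(p_values, alpha=0.05):
    """Discrete BH threshold b_k: the largest observed p-value r such that
    a_k(r) = #{s in S_k : p(s) <= r} / #S_k >= r / alpha (0 if none passes)."""
    p = np.sort(np.asarray(p_values, dtype=float))
    m = p.size
    a_k = np.arange(1, m + 1) / m        # a_k evaluated at each sorted p-value
    passing = p[a_k >= p / alpha]        # values satisfying the BH condition
    return passing[-1] if passing.size else 0.0

# Hypothetical p value function: small on [0, 0.3), large elsewhere
S_k = np.linspace(0.0, 1.0, 1001)             # the grid S_k
p_vals = np.where(S_k < 0.3, 0.001, 0.6)      # p evaluated on S_k
b_k = bh_threshold(p_vals, alpha=0.05)        # equals 0.001 in this configuration
rejection_region = S_k[p_vals <= b_k]         # points rejected at step k
```

Rejecting \(\{t \in S_k : p(t) \le b_k\}\) is exactly the classical BH procedure applied to the grid evaluations; the functional procedure arises in the limit of refining grids.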

Lemma 7.1

\(a_k\) converges to a uniformly as \(k \rightarrow \infty \).

Proof Follows from assumption (9) and the definitions of \(a_k\) and a. \(\square \)

Lemma 7.2

\(b_k\) converges to \(\alpha ^*\) as \(k \rightarrow \infty \).

Proof

By Lemma 7.1, \(a_k\) converges uniformly to a. There are two cases: \(\alpha ^* = 0\) and \(\alpha ^* \in (0, \alpha ]\).

Case 1, \(\alpha ^* = 0\)

Let O be any open neighbourhood around zero. \(O^{C}\) (where the complement is wrt [0, 1]) is a closed set on which \(a(t) < \alpha ^{-1}t \). By continuity of a, there exists an \(\epsilon > 0\) s.t. \(a(t) < \alpha ^{-1}t - \epsilon \) for all \(t \in O^C\). As \(a_k\) converges uniformly to a, for large enough k we have \(a_k(t) < \alpha ^{-1}t\) for all \(t \in O^C\), and thus \(b_k \in O\) eventually. This was true for any O, and we conclude that \(b_k \rightarrow 0\).

Case 2, \(\alpha ^* \in (0, \alpha ]\)

By assumption, for any open neighbourhood \(O \ni \alpha ^*\), there exist \(s_1, s_2 \in O\) s.t. \(a(s_1) > \alpha ^{-1}s_1\), \(a(s_2) < \alpha ^{-1}s_2\).

For \(t > \alpha ^*, t \notin O\), we have that \(\alpha ^{-1} t - a(t) > \epsilon \) for some \(\epsilon > 0\) by continuity of a. Hence by uniform convergence, it must hold that for k sufficiently large we have \(a_k(t) < \alpha ^{-1}t\) for \(t > \alpha ^*, t \notin O\). This was true for any O, and we conclude \(\limsup b_k \le \alpha ^*\).

Conversely, using the points \(s_1\) with \(a(s_1) > \alpha ^{-1}s_1\), we can show that \(\liminf b_k \ge \alpha ^*\), and thus \(\lim b_k = \alpha ^*\).

\(\square \)

Define \(A_k = \{t \in S_k : p(t) \le b_k \}\), the set of rejected points at step k, and define \(Q_k\) as the false discovery proportion for the k’th iteration:

$$\begin{aligned} Q_k := \frac{\# (A_k \cap U)}{\# A_k}1_{A_k \ne \emptyset } \end{aligned}$$
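In a simulation, where the set U of true nulls is known, \(Q_k\) can be computed explicitly. A self-contained Python sketch (our hypothetical setup, not the paper's simulation study: true nulls carry uniform p values, false nulls near-zero p values):

```python
import numpy as np

def bh_threshold(p_values, alpha=0.05):
    """Largest observed p-value r with #{p <= r} / m >= r / alpha (0 if none)."""
    p = np.sort(np.asarray(p_values, dtype=float))
    m = p.size
    passing = p[np.arange(1, m + 1) / m >= p / alpha]
    return passing[-1] if passing.size else 0.0

rng = np.random.default_rng(0)
m = 1000
null_mask = np.arange(m) >= m // 2                 # U: second half are true nulls
p = np.empty(m)
p[~null_mask] = rng.uniform(0.0, 0.001, m // 2)    # false nulls: tiny p values
p[null_mask] = rng.uniform(0.0, 1.0, m // 2)       # true nulls: uniform p values

b_k = bh_threshold(p, alpha=0.05)
rejected = p <= b_k                                # the set A_k as a boolean mask
n_rejected = int(rejected.sum())                   # #A_k
# False discovery proportion Q_k = #(A_k ∩ U) / #A_k, with 0 if A_k is empty
Q_k = (rejected & null_mask).sum() / n_rejected if n_rejected else 0.0
```

Averaging \(Q_k\) over many replications gives a Monte Carlo estimate of \(\mathrm {E}[Q_k]\), which here should not exceed \(\alpha \mu (U)\) in the limit.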

Rejection areas We now show that \(H_{t,k}\) is eventually constant. Note that p(t) does not depend on k and that \(H_{t,k} = (t \in S_k) \cap (p(t) \le b_k)\), i.e. the event that t lies in the grid and that the BH threshold at step k is at least p(t).

Proposition 7.3

For all t satisfying \(p(t) \ne \alpha ^*\), \(H_{t,k}\) is eventually constant.

Proof

First note that if \(t \notin S_k\) for all k, then \(H_{t,k}\) is trivially zero for all k. So assume \(t \in S_{k_0}\) for some \(k_0\); since the grids are nested, \(t \in S_k\) for all \(k \ge k_0\). As \(k \rightarrow \infty \), \(b_k \rightarrow \alpha ^*\), and by assumption \(p(t) \ne \alpha ^*\). Hence, for k large enough, p(t) is either strictly larger or strictly smaller than \(b_k\), proving the result. \(\square \)

Convergence of \(Q_k\) Finally, we need to show that \(Q_k \rightarrow Q\). We show this by proving convergence of the numerator and denominator, and by arguing that \(Q = 0\) implies \(Q_k = 0\) eventually.

Define \(H^0 = \{t : p(t) > \alpha ^*\}\), i.e. the acceptance region, and \(H^1 = T \backslash H^0 = \{t: p(t) \le \alpha ^*\}\), the rejection region. Note that \(\mu (H^1) = a( \alpha ^*) = \alpha ^{-1} \alpha ^* \), that \(H^1 = V \cup S\), and that \(H^1 \cap U = V\).

Proposition 7.4

\( N_k \# A_k \rightarrow \mu (H^1)\) and \(N_k \# (A_k \cap U) \rightarrow \mu (H^1 \cap U)\).

Proof

For k, define \(J_k = \{t : p(t) \le b_k \}\). Note that \(A_k = J_k \cap S_k\).

Observe that, by the assumption about uniform convergence on level sets (Eq. (9)):

$$\begin{aligned} N_k \# (J_k \cap S_k) - \mu (J_k) \rightarrow 0 \text { for }k \rightarrow \infty . \end{aligned}$$

Next observe that due to (1) the continuity of a, (2) \(b_k \rightarrow \alpha ^*\) and (3) the fact that we are considering sets of the form \(\{t: p(t) \le x \}\), we can conclude that

$$\begin{aligned} \mu (J_k \triangle H^1) \rightarrow 0 \text { for }k \rightarrow \infty . \end{aligned}$$

and we conclude \(N_k \# A_k \rightarrow \mu (H^1)\).

For the second part, observe that from Eq. (10) it follows that

$$\begin{aligned} N_k \# (U \cap J_k \cap S_k) - \mu (U \cap J_k) \rightarrow 0 \text { for }k \rightarrow \infty . \end{aligned}$$

We just argued that \(\mu (J_k \triangle H^1) \rightarrow 0\). This remains true when intersecting with a measurable set, in this case U:

$$\begin{aligned} \mu ((J_k \cap U) \triangle (H^1 \cap U)) \rightarrow 0 \text { for }k \rightarrow \infty . \end{aligned}$$

and we conclude \(N_k \# (A_k \cap U) \rightarrow \mu (H^1 \cap U)\). \(\square \)

For \(\alpha ^* = 0\) we have the following stronger result:

Lemma 7.5

If \(\alpha ^* = 0\), then \(\# A_k = 0\) eventually.

From this lemma, it follows that \( N_k \# A_k = 0\) (and thus \(Q_k = 0\) as well) eventually.

Proof

Since \(\alpha ^* = 0\), we have \(a(t) < \alpha ^{-1}t\) for all \(t> 0\). By assumption (a3), \(\min p(t) > 0\), and thus \(a_k(s) = 0\) for \(s < \min p(t)\) and all k.

By continuity of a, it follows that there exists \(\epsilon > 0\) s.t. \(\alpha ^{-1} t - a(t) > \epsilon \) on the interval \([\min p(t) , 1]\), and by uniform convergence of \(a_k\) we get that for large enough k, \(a_k(t) < \alpha ^{-1} t\) for all \(t \ge \min p(t)\).

Combining this with \(a_k(t) = 0\) for \(t < \min p(t)\), we get that eventually \(a_k(t) < \alpha ^{-1}t\) for every \(t > 0\) and thus \(b_k = 0\). From this (recall \(\min p(t)>0\)), we conclude that eventually no hypotheses are rejected, i.e. \(\# A_k = 0\) for k sufficiently large. \(\square \)

Theorem 7.6

\(Q_k\) converges to Q almost surely, and \(\limsup _{k \rightarrow \infty } \mathrm {E}[Q_k] \le \alpha \mu (U)\).

Proof

By Lemma 7.5, \(Q_k\) converges to Q when \(\alpha ^* = 0\), and by Proposition 7.4, \(Q_k\) converges to Q when \(\alpha ^* > 0\), since \(\mu (H^1) = \alpha ^*/\alpha > 0\).

Applying Benjamini and Yekutieli’s original result, Theorem 2.6 (this is where we use the PRDS assumption), we have \(\mathrm {E}[Q_k] \le \alpha N_k \#(S_k \cap U)\) for all k.

By setting \(r = 1\) it follows from (10) that \(\lim _{k \rightarrow \infty } N_k \#(S_k \cap U) = \mu (U)\) and hence \(\limsup _{k \rightarrow \infty } \mathrm {E}[Q_k] \le \alpha \mu (U)\).

\(\square \)

Cite this article

Olsen, N.L., Pini, A. & Vantini, S. False discovery rate for functional data. TEST 30, 784–809 (2021). https://doi.org/10.1007/s11749-020-00751-x
