Uniform joint screening for ultra-high dimensional graphical models

doi:10.1016/j.jmva.2020.104645

Journal of Multivariate Analysis

Volume 179, September 2020, 104645

https://doi.org/10.1016/j.jmva.2020.104645

Abstract

Identifying large-scale conditional dependence structures through graphical models is a challenging yet practical problem. Under ultra-high dimensional settings, a screening procedure is generally suggested before variable selection to reduce computational costs. However, most existing screening methods examine marginal correlations and are thus not suitable for discovering the conditional dependence in graphical models. To overcome this issue, we propose a new procedure called graphical uniform joint screening (GUS) for edge identification in graphical models. Instead of screening out edges node by node, GUS applies a uniform threshold to all statistics indicating the significance of different edges, so that it adapts to various kinds of graphical structures. We demonstrate that GUS enjoys the sure screening property and even screening consistency by preserving the rankings of the significant edges. Furthermore, a scalable implementation of GUS is developed for big data applications. Simulation and real data studies are provided to illustrate the effectiveness of the proposed method.

Introduction

Rapidly developing technologies have brought large amounts of data with numerous variables to many research areas, including genomics, social networks, and computer vision. It is of great practical interest to explore how these variables are related to each other. Under such circumstances, graphical models provide a general framework for characterizing the conditional dependence structures among these variables (nodes) and are widely used in many applications owing to their descriptive simplicity [17], [21], [30]. Therefore, recovering the graphical structure in a scalable way is urgently needed in the big data era.

A large body of literature has been devoted to recovering the conditional dependence structures in graphical models. When conditional independence can be characterized by zero entries in the precision matrix, such as in Gaussian graphical models, there are mainly two categories of methods for estimating the graphical structures. The first class optimizes the penalized likelihood or the penalized empirical risk with an $ℓ_{1}$-penalty or some non-convex penalty to estimate the precision matrix. See, for instance, [8], [13], [14], [25], [34], [35], among many others. The second class of methods decomposes the estimation of graphical structures into a series of nodewise regressions, that is, it fits a sparse regression for each node by treating the others as predictors [2], [3], [23], [24], [27], [28], [33]. Although these methods provide accurate estimation of graphical structures, they can be infeasible when the dimensionality is ultra-high because of the tremendous computation and memory costs.

To alleviate the computational costs caused by high dimensionality, conducting a screening procedure before the second-stage analysis is an efficient and straightforward strategy. Under ultra-high dimensional settings, screening methods were shown to enjoy great advantages in linear models [9], [31] and have been generalized to different models owing to their computational convenience [7], [11], [12], [18], [19], [22], [36]. Specifically, for ultra-high dimensional graphical models, [20] proposed a screening approach called graphical sure screening (GRASS) to recover the graphical structures. Despite its significant computational advantages, GRASS is not well suited to identifying the conditional dependence structure in graphical models: its statistics are constructed from marginal correlations, so two variables that are jointly correlated but not marginally correlated would not be connected. In this paper, we suggest a new method called graphical uniform joint screening (GUS) to overcome such issues for edge identification in ultra-high dimensional graphical models.
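The gap between marginal and joint correlation can be seen in a three-variable toy example. The sketch below is illustrative only: the covariance matrix and the value `rho = 0.5` are hypothetical choices, not taken from the paper. It builds a Gaussian covariance in which the first two variables are marginally uncorrelated, yet the corresponding precision matrix entry is nonzero, so the edge exists in the graph but would be invisible to a screen based purely on marginal correlation.

```python
import numpy as np

# Hypothetical toy covariance (an assumption for illustration):
# x1 and x2 are marginally uncorrelated, both are correlated with x3.
rho = 0.5
sigma = np.array([[1.0, 0.0, rho],
                  [0.0, 1.0, rho],
                  [rho, rho, 1.0]])

# Precision matrix: zero entries encode conditional independence
# in a Gaussian graphical model.
omega = np.linalg.inv(sigma)

# Marginal correlation between x1 and x2 (sigma has unit diagonal).
marginal_12 = sigma[0, 1]

# Partial correlation between x1 and x2 given x3.
partial_12 = -omega[0, 1] / np.sqrt(omega[0, 0] * omega[1, 1])

print("marginal correlation (x1, x2):", marginal_12)
print("precision entry (1, 2):       ", omega[0, 1])
print("partial correlation (x1, x2): ", partial_12)
```

Here the marginal correlation is exactly zero while the precision entry and the partial correlation are not, which is the situation a joint screening statistic is meant to handle.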

The major contributions of this paper are threefold. First, a uniform joint screening procedure is developed to identify ultra-high dimensional graphical structures. The statistics of GUS are constructed from joint correlations instead of marginal ones to evaluate the significance of edges. To the best of our knowledge, this is the first joint screening procedure proposed for graphical models. Moreover, a uniform threshold is suggested to screen all significant edges in the graph at one time, so that GUS adapts to various kinds of graphical structures. Second, we provide theoretical guarantees for the proposed procedure by establishing the sure screening property as well as screening consistency without marginal correlation assumptions. Last but not least, a scalable implementation of GUS is proposed so that all statistics can be generated via a single estimation of the ultra-high dimensional precision matrix, making it as fast as GRASS in terms of computational complexity. The numerical studies show that the proposed procedure enjoys both appealing statistical accuracy and computational efficiency.

The rest of this paper is organized as follows. Section 2 presents the problem setup and the proposed method. Theoretical properties, including the sure screening property and screening consistency, are established in Section 3. We provide simulation and real data studies in Section 4. Section 5 concludes with a discussion and possible future work. All technical details are relegated to Section 6.

Section snippets

Model setting

To illustrate the idea of uniform joint screening, we adopt the Gaussian graphical model for simplicity. In such a model, a graph $G = (V, E)$ can be characterized by a $p$-variate random vector that follows a multivariate normal distribution, $x = {(x_{1}, \dots, x_{p})}^{⊤} \sim N (μ, Σ),$ where $μ$ is the mean vector, $Σ = {(σ_{j k})}_{p \times p}$ is the covariance matrix and $G$ is an undirected graph with the vertex (or node) set $V ≔ \{x_{1}, \dots, x_{p}\}$ and the edge set $E ≔ \{(j, k)\}$ between vertices. Specifically, the edge between $x_{j}$ and $x_{k}$ is absent if and only if $x$

Theoretical properties

In this section, we study the theoretical properties of GUS. To begin with, we introduce the following propositions concerning $z$ and $Z$, which are defined as $z = Σ^{- 1 ∕ 2} x$ and $Z = X Σ^{- 1 ∕ 2}$, respectively.

Proposition 1

The random vector $z$ has a spherically symmetric distribution, that is, $G z$ has the same probability distribution as $z$ for every orthonormal matrix $G$.
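As a small numerical illustration of the whitening step $z = Σ^{- 1 ∕ 2} x$ behind Proposition 1, the sketch below checks only the second-moment part of the statement, assuming a centered Gaussian vector. The covariance matrix, the helper name `inv_sqrt`, and the dimension are hypothetical choices made for the example. After whitening, the covariance is the identity, and it stays the identity under any orthonormal transformation $G$.

```python
import numpy as np

def inv_sqrt(sigma):
    """Symmetric inverse square root of a positive definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(sigma)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T

rng = np.random.default_rng(1)
p = 5

# Hypothetical positive definite covariance matrix (illustration only).
A = rng.standard_normal((p, p))
sigma = A @ A.T + p * np.eye(p)

# Covariance of z = Sigma^{-1/2} x is Sigma^{-1/2} Sigma Sigma^{-1/2}.
sigma_inv_half = inv_sqrt(sigma)
cov_z = sigma_inv_half @ sigma @ sigma_inv_half.T

# A random orthonormal matrix G from a QR decomposition.
G, _ = np.linalg.qr(rng.standard_normal((p, p)))

# Covariance of G z is G cov_z G^T, which should again be the identity.
cov_Gz = G @ cov_z @ G.T

print(np.round(cov_z, 6))
print(np.round(cov_Gz, 6))
```

For a centered Gaussian vector, equality of covariances is enough to conclude equality in distribution, which is why this check matches the proposition in that special case.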

Proposition 2

For the random matrix $Z$, there exist constants $c, c_{1} > 1$ and $C_{1} > 0$ such that the inequality $\Pr (λ_{\max} ({\tilde{p}}^{- 1} \tilde{Z} {\tilde{Z}}^{⊤}) > c_{1} \text{ and } λ_{\min} ({\tilde{p}}^{- 1} \tilde{Z} {\tilde{Z}}^{⊤}) < 1 ∕ c_{1}) \leq e^{- C_{1} n}$ holds for any $n \times \tilde{p}$ submatrix
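The flavor of the eigenvalue bound in Proposition 2 can be checked numerically. The following is a rough sketch under assumptions that are not stated in the excerpt: the entries of $\tilde{Z}$ are taken to be independent standard normal, and the sizes `n = 20` and `p_tilde = 5000` are arbitrary. In that regime the eigenvalues of ${\tilde{p}}^{- 1} \tilde{Z} {\tilde{Z}}^{⊤}$ are expected to cluster around one, so neither extreme event in the proposition occurs for a moderate $c_{1}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes for illustration: many more columns than rows.
n, p_tilde = 20, 5000

# Assumption: i.i.d. standard normal entries, as for whitened Gaussian data.
Z_tilde = rng.standard_normal((n, p_tilde))

# Eigenvalues of the scaled n x n Gram matrix, in ascending order.
eigvals = np.linalg.eigvalsh(Z_tilde @ Z_tilde.T / p_tilde)
lam_min, lam_max = eigvals[0], eigvals[-1]

print("smallest eigenvalue:", lam_min)
print("largest eigenvalue: ", lam_max)
```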

Numerical studies

In this section, we use simulated and real data to investigate the finite sample performance of the proposed GUS procedure, compared with another approach to recovering graphical models, graphical sure screening (GRASS) [20]. We also report the performance of innovated scalable efficient estimation (ISEE) [10] as a reference, since ISEE enjoys appealing estimation accuracy and computational efficiency. In two simulation examples, the performance measure we consider to study screening accuracy is positive

Conclusion

In this paper, we have proposed a new procedure called graphical uniform joint screening (GUS) to discover the conditional dependence in graphical models. Instead of screening out edges node by node, a uniform threshold is applied to all statistics indicating the significance of different edges, so that the procedure adapts to various kinds of graphical structures. Moreover, a scalable approach is proposed to simplify the calculations for large-scale applications. Both the established theoretical

Proofs

Proof of Theorem 1

We complete the proof by studying the uniform properties of ${\hat{θ}}_{a j}$ associated with both significant and insignificant edges. Recall the statistic ${\hat{θ}}_{a j}$ defined in (6), ${\hat{θ}}_{a j} = X_{j}^{⊤} {(X_{- a} X_{- a}^{⊤})}^{- 1} X_{a}.$ Based on the regression model of $X_{a}$ on $X_{- a}$, $X_{a} = X_{- a} θ_{a} + E_{a},$ and since $X_{j} = X_{- a} e_{j}$, the statistic ${\hat{θ}}_{a j}$ can be rewritten as ${\hat{θ}}_{a j} = e_{j}^{⊤} X_{- a}^{⊤} {(X_{- a} X_{- a}^{⊤})}^{- 1} X_{- a} θ_{a} + e_{j}^{⊤} X_{- a}^{⊤} {(X_{- a} X_{- a}^{⊤})}^{- 1} E_{a},$ where $e_{j} = {(0, \dots, 1, 0, \dots, 0)}^{⊤}$ is the $j$th natural basis vector in the $(p - 1)$-dimensional space. For notational convenience, we write ${\hat{θ}}_{a j} ≔ e_{j}^{⊤} \cdot ξ_{a} + e_{j}^{⊤} \cdot η_{a} = ξ$
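The statistic in (6) and its two-term decomposition can be evaluated directly on simulated data. The sketch below is a naive illustration under stated assumptions, not the scalable implementation mentioned in the paper: the function name `gus_statistics`, the sample sizes, the sparse coefficient vector, and the noise level are all hypothetical, and $ξ_{a}$ and $η_{a}$ are read as the two vectors multiplied by $e_{j}^{⊤}$ in the display above. It requires $p - 1 \geq n$ so that the $n \times n$ matrix $X_{- a} X_{- a}^{⊤}$ is invertible.

```python
import numpy as np

def gus_statistics(X, a):
    """Return theta_hat_{aj} for all j != a, computed as
    X_{-a}^T (X_{-a} X_{-a}^T)^{-1} X_a (direct evaluation of (6))."""
    X_a = X[:, a]
    X_minus_a = np.delete(X, a, axis=1)
    gram = X_minus_a @ X_minus_a.T          # n x n, invertible if p - 1 >= n
    return X_minus_a.T @ np.linalg.solve(gram, X_a)

rng = np.random.default_rng(0)
n, p, a = 20, 200, 0                        # assumed sizes with p - 1 > n

# Simulate the regression X_a = X_{-a} theta_a + E_a (hypothetical values).
X_minus_a = rng.standard_normal((n, p - 1))
theta_a = np.zeros(p - 1)
theta_a[:3] = [2.0, -1.5, 1.0]              # three nonzero coefficients
E_a = 0.1 * rng.standard_normal(n)
X_a = X_minus_a @ theta_a + E_a
X = np.column_stack([X_a, X_minus_a])       # column a = 0 holds X_a

theta_hat = gus_statistics(X, a)

# The two terms of the decomposition used in the proof.
gram_inv = np.linalg.inv(X_minus_a @ X_minus_a.T)
xi_a = X_minus_a.T @ gram_inv @ X_minus_a @ theta_a
eta_a = X_minus_a.T @ gram_inv @ E_a

print("max |theta_hat - (xi_a + eta_a)|:",
      np.max(np.abs(theta_hat - (xi_a + eta_a))))
```

Entry `theta_hat[j]` corresponds to $e_{j}^{⊤} ξ_{a} + e_{j}^{⊤} η_{a}$, so the printed discrepancy should be at the level of floating-point error.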

CRediT authorship contribution statement

Zemin Zheng: Conceptualization, Methodology, Writing - original draft, Writing - review & editing. Haiyu Shi: Methodology, Writing - original draft, Writing - review & editing, Software. Yang Li: Methodology, Writing - original draft, Writing - review & editing, Project administration. Hui Yuan: Methodology, Writing - original draft, Software.

Acknowledgments

This work was supported by National Natural Science Foundation of China Grants 11601501, 11671374, 71731010, and 71921001, Anhui Provincial Natural Science Foundation, China Grant 1708085QA02, and Fundamental Research Funds for the Central Universities, China Grant WK2040160028. The authors also sincerely thank the Editor, Associate Editor and the referees for their helpful comments and suggestions that led to substantial improvement of the paper.

References (36)

Fan J. et al., Network exploration via the adaptive LASSO and SCAD penalties, Ann. Appl. Stat. (2009)

Zhao S.D. et al., Principled sure independence screening for Cox models with ultra-high-dimensional covariates, J. Multivariate Anal. (2012)

Arbeitman M.N. et al., Gene expression during the life cycle of Drosophila melanogaster, Science (2002)

Cai T. et al., A constrained $ℓ_{1}$ minimization approach to sparse precision matrix estimation, J. Amer. Statist. Assoc. (2011)

Cai T. et al., Estimating sparse precision matrix: Optimal rates of convergence and adaptive estimation, Ann. Statist. (2016)

Chasan R. et al., Activation of the easter zymogen is regulated by five other genes to define dorsal-ventral polarity in the Drosophila embryo, Development (1992)

Coyle-Thompson C.A. et al., The strawberry notch gene functions with Notch in common developmental pathways, Development (1993)

Efron B. et al., An Introduction to the Bootstrap (1994)

Fan J. et al., Nonparametric independence screening in sparse ultra-high-dimensional additive models, J. Amer. Statist. Assoc. (2011)

Fan J. et al., Sure independence screening for ultrahigh dimensional feature space, J. R. Stat. Soc. Ser. B. Stat. Methodol. (2008)

Fan Y. et al., Innovated scalable efficient estimation in ultra-large Gaussian graphical models, Ann. Statist. (2016)

Fan J. et al., Sure independence screening in generalized linear models with NP-dimensionality, Ann. Statist. (2010)

Fang Y. et al., Joint variable screening in the censored accelerated failure time model, Statist. Sinica (2019)

Friedman J. et al., Sparse inverse covariance estimation with the graphical lasso, Biostatistics (2008)

Guo J. et al., Joint estimation of multiple graphical models, Biometrika (2011)

Kim L.K. et al., Down-regulation of NF-$κ$B target genes by the AP-1 and STAT complex during the innate immune response in Drosophila, PLoS Biol. (2007)

Kolar M. et al., Estimating time-varying networks, Ann. Appl. Stat. (2010)

Lauritzen S.L., Graphical Models (1996)

