Abstract
It is now common to record from dozens to hundreds of neurons or more simultaneously, and to ask how the network activity changes across experimental conditions. A natural framework for addressing questions of functional connectivity is to apply Gaussian graphical modeling to neural data, where each edge in the graph corresponds to a non-zero partial correlation between neurons. Because the number of possible edges is large, one strategy for estimating the graph has been to apply methods that aim to identify large sparse effects using an \(L_{1}\) penalty. However, the partial correlations found in neural spike count data are neither large nor sparse, so techniques that perform well in sparse settings will typically perform poorly in the context of neural spike count data. Fortunately, the correlated firing for any pair of cortical neurons depends strongly on both their distance apart and the features for which they are tuned. We introduce a method that takes advantage of these known, strong effects by allowing the penalty to depend on them: thus, for example, the connection between a pair of neurons that are close together will be penalized less than that between a pair that are far apart. We show through simulations that this physiologically motivated procedure performs substantially better than off-the-shelf generic tools, and we illustrate the approach by applying it to populations of neurons recorded with multielectrode arrays implanted in macaque visual cortex areas V1 and V4.
References
Ahrens, M.B., Orger, M.B., Robson, D.N., Li, J.M., Keller, P.J. (2013). Whole-brain functional imaging at cellular resolution using light-sheet microscopy. Nature Methods, 10(5), 413–420.
Alivisatos, A.P., Andrews, A.M., Boyden, E.S., Chun, M., Church, G.M., Deisseroth, K., et al. (2013). Nanotools for neuroscience and brain activity mapping. ACS Nano, 7(3), 1850–1866.
Andrews, D.F., & Mallows, C.L. (1974). Scale mixtures of normal distributions. Journal of the Royal Statistical Society. Series B (Methodological), 36(1), 99–102.
Banerjee, O., Ghaoui, L.E., d’Aspremont, A. (2008). Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. Journal of Machine Learning Research, 9, 485–516.
Bassett, D.S., & Sporns, O. (2017). Network neuroscience. Nature Neuroscience, 20(3), 353–364.
Behseta, S., Berdyyeva, T., Olson, C.R., Kass, R.E. (2009). Bayesian correction for attenuation of correlation in multi-trial spike count data. Journal of Neurophysiology, 101(4), 2186–2193.
Brown, E.N., Kass, R.E., Mitra, P.P. (2004). Multiple neural spike train data analysis: state-of-the-art and future challenges. Nature Neuroscience, 7(5), 456–461.
Butts, D.A., & Goldman, M.S. (2006). Tuning curves, neuronal variability, and sensory coding. PLoS Biology, 4(4), e92.
Chandrasekaran, V., Parrilo, P.A., Willsky, A.S. (2012). Latent variable graphical model selection via convex optimization. Annals of Statistics, 40(4), 1935–1967.
Cohen, M.R., & Kohn, A. (2011). Measuring and interpreting neuronal correlations. Nature Neuroscience, 14(7), 811–819.
Cohen, M.R., & Maunsell, J.H. (2009). Attention improves performance primarily by reducing interneuronal correlations. Nature Neuroscience, 12(12), 1594–1600.
Cover, T.M., & Thomas, J.A. (2006). Elements of information theory, 2nd edition. Hoboken, NJ: Wiley-Interscience.
Cowley, B.R., Smith, M.A., Kohn, A., Yu, B.M. (2016). Stimulus-driven population activity patterns in macaque primary visual cortex. PLOS Computational Biology, 12(12), e1005185.
Cunningham, J.P., & Yu, B.M. (2014). Dimensionality reduction for large-scale neural recordings. Nature Neuroscience, 17(11), 1500–1509.
d’Aspremont, A., Banerjee, O., El Ghaoui, L. (2008). First-order methods for sparse covariance selection. SIAM Journal on Matrix Analysis and Applications, 30(1), 56–66.
Dempster, A.P., Laird, N.M., Rubin, D.B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1), 1–38.
Dobson, A.J., & Barnett, A. (2008). An introduction to generalized linear models. Boca Raton: CRC Press, Chapman & Hall.
Ecker, A.S., Berens, P., Cotton, R.J., Subramaniyan, M., Denfield, G.H., Cadwell, C.R., et al. (2014). State dependence of noise correlations in macaque primary visual cortex. Neuron, 82(1), 235–248.
Efron, B., Tibshirani, R., Storey, J.D., Tusher, V. (2001). Empirical Bayes analysis of a microarray experiment. Journal of the American Statistical Association, 96(456), 1151–1160.
Efron, B., & Tibshirani, R. (2002). Empirical Bayes methods and false discovery rates for microarrays. Genetic Epidemiology, 23(1), 70–86.
Efron, B. (2007). Size, power and false discovery rates. The Annals of Statistics, 35(4), 1351–1377.
Fan, J., Feng, Y., Wu, Y. (2009). Network exploration via the adaptive LASSO and SCAD penalties. The Annals of Applied Statistics, 3(2), 521.
Foygel, R., & Drton, M. (2010). Extended Bayesian information criteria for Gaussian graphical models. In Advances in neural information processing systems (pp. 604–612).
Friedman, J., Hastie, T., Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3), 432–441.
Gelman, A., Carlin, J., Stern, H.S., Rubin, D.B. (2004). Bayesian data analysis. New York: CRC Press.
Georgopoulos, A.P., & Ashe, J. (2000). One motor cortex, two different views. Nature Neuroscience, 3(10), 963.
Giraud, C., & Tsybakov, A. (2012). Discussion: Latent variable graphical model selection via convex optimization. Annals of Statistics, 40(4), 1984–1988.
Goris, R.L., Movshon, J.A., Simoncelli, E.P. (2014). Partitioning neuronal variability. Nature Neuroscience, 17(6), 858–865.
Guerrero, J.L. (1994). Multivariate mutual information: sampling distribution with applications. Communications in Statistics-Theory and Methods, 23(5), 1319–1339.
Gutnisky, D.A., & Dragoi, V. (2008). Adaptive coding of visual information in neural populations. Nature, 452(7184), 220–224.
Hastie, T.J., & Pregibon, D. (1992). Generalized linear models. In Chambers, J.M., & Hastie, T.J. (Eds.), Statistical models in S. Pacific Grove: Wadsworth & Brooks/Cole.
Hastie, T.J., & Tibshirani, R.J. (1990). Generalized additive models, Vol. 43. Boca Raton: Chapman & Hall/CRC Press.
Hsieh, C.J., Dhillon, I.S., Ravikumar, P.K., Sustik, M.A. (2011). Sparse inverse covariance matrix estimation using quadratic approximation. In Advances in neural information processing systems (pp. 2330–2338).
Kass, R.E., Eden, U.T., Brown, E.N. (2014). Analysis of neural data. New York: Springer.
Kelly, R.C., Smith, M.A., Samonds, J.M., Kohn, A., Bonds, A.B., Movshon, J.A., Lee, T.S. (2007). Comparison of recordings from microelectrode arrays and single electrodes in the visual cortex. Journal of Neuroscience, 27(2), 261–264.
Kelly, R.C., Smith, M.A., Kass, R.E., Lee, T.S. (2010). Local field potentials indicate network state and account for neuronal response variability. Journal of Computational Neuroscience, 29(3), 567–579.
Kelly, R.C., & Kass, R.E. (2012). A framework for evaluating pairwise and multiway synchrony among stimulus-driven neurons. Neural Computation, 24(8), 2007–2032.
Kerr, J.N., & Denk, W. (2008). Imaging in vivo: watching the brain in action. Nature Reviews Neuroscience, 9(3), 195–205.
Kipke, D.R., Shain, W., Buzsáki, G., Fetz, E., Henderson, J.M., Hetke, J.F., Schalk, G. (2008). Advanced neurotechnologies for chronic neural interfaces: new horizons and clinical opportunities. Journal of Neuroscience, 28(46), 11830–11838.
Liu, H., Roeder, K., Wasserman, L. (2010). Stability approach to regularization selection (stars) for high dimensional graphical models. In Advances in neural information processing systems (pp. 1432–1440).
Markram, H., Toledo-Rodriguez, M., Wang, Y., Gupta, A., Silberberg, G., Wu, C. (2004). Interneurons of the neocortical inhibitory system. Nature Reviews Neuroscience, 5(10), 793.
Mazumder, R., & Hastie, T. (2012). The graphical lasso: new insights and alternatives. Electronic Journal of Statistics, 6, 2125–2149.
McCullagh, P., & Nelder, J.A. (1989). Generalized linear models, 2nd edition. London: Chapman & Hall.
Mitchell, J.F., Sundberg, K.A., Reynolds, J.H. (2009). Spatial attention decorrelates intrinsic activity fluctuations in macaque area V4. Neuron, 63(6), 879–888.
Murphy, K.P. (2012). Machine learning: a probabilistic perspective. Cambridge: MIT press.
Poort, J., & Roelfsema, P.R. (2009). Noise correlations have little influence on the coding of selective attention in area V1. Cerebral Cortex, 19(3), 543–553.
Rasch, M.J., Schuch, K., Logothetis, N.K., Maass, W. (2011). Statistical comparison of spike responses to natural stimuli in monkey area V1 with simulated responses of a detailed laminar network model for a patch of V1. Journal of Neurophysiology, 105(2), 757–778.
Rothman, A.J., Bickel, P.J., Levina, E., Zhu, J. (2008). Sparse permutation invariant covariance estimation. Electronic Journal of Statistics, 2, 494–515.
Samonds, J.M., Potetz, B.R., Lee, T.S. (2009). Cooperative and competitive interactions facilitate stereo computations in macaque primary visual cortex. Journal of Neuroscience, 29(50), 15780–15795.
Scott, J.G., Kelly, R.C., Smith, M.A., Zhou, P., Kass, R.E. (2015). False discovery rate regression: an application to neural synchrony detection in primary visual cortex. Journal of the American Statistical Association, 110(510), 459–471.
Shadlen, M.N., & Newsome, W.T. (1998). The variable discharge of cortical neurons: implications for connectivity, computation, and information coding. Journal of Neuroscience, 18(10), 3870–3896.
Shannon, C.E. (1964). The mathematical theory of communication. Urbana: University of Illinois Press.
Sincich, L.C., & Blasdel, G.G. (2001). Oriented axon projections in primary visual cortex of the monkey. Journal of Neuroscience, 21(12), 4416–4426.
Smith, M.A., & Kohn, A. (2008). Spatial and temporal scales of neuronal correlation in primary visual cortex. Journal of Neuroscience, 28(48), 12591–12603.
Smith, M.A., & Sommer, M.A. (2013). Spatial and temporal scales of neuronal correlation in visual area V4. Journal of Neuroscience, 33(12), 5422–5432.
Smith, M.A., Jia, X., Zandvakili, A., Kohn, A. (2013). Laminar dependence of neuronal correlations in visual cortex. Journal of Neurophysiology, 109(4), 940–947.
Song, D., Wang, H., Tu, C.Y., Marmarelis, V.Z., Hampson, R.E., Deadwyler, S.A., Berger, T.W. (2013). Identification of sparse neural functional connectivity using penalized likelihood estimation and basis functions. Journal of Computational Neuroscience, 35(3), 335–357.
Stevenson, I.H., & Kording, K.P. (2011). How advances in neural recording affect data analysis. Nature Neuroscience, 14(2), 139–142.
Van Den Heuvel, M.P., & Sporns, O. (2011). Rich-club organization of the human connectome. Journal of Neuroscience, 31(44), 15775–15786.
Venables, W.N., & Ripley, B.D. (2002). Modern applied statistics with S. New York: Springer.
Vinci, G., Ventura, V., Smith, M.A., Kass, R.E. (2016). Separating spike count correlation from firing rate correlation. Neural Computation, 28(5), 849–881.
Wang, H. (2012). Bayesian graphical lasso models and efficient posterior computation. Bayesian Analysis, 7(4), 867–886.
West, M. (1987). On scale mixtures of normal distributions. Biometrika, 74(3), 646–648.
Wood, S.N. (2011). Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(1), 3–36.
Yatsenko, D., Josić, K., Ecker, A.S., Froudarakis, E., Cotton, R.J., Tolias, A.S. (2015). Improved estimation and interpretation of correlations in neural circuits. PLoS Computational Biology, 11(3), e1004083.
Yu, B.M., Cunningham, J., Santhanam, G., Ryu, S.I., Shenoy, K.V., Sahani, M. (2009). Gaussian-process factor analysis for low-dimensional single-trial analysis of neural population activity. In Advances in neural information processing systems (pp. 1881–1888).
Yuan, M., & Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(1), 49–67.
Yuan, M., & Lin, Y. (2007). Model selection and estimation in the Gaussian graphical model. Biometrika, 94(1), 19–35.
Yuan, M. (2012). Discussion: latent variable graphical model selection via convex optimization. Annals of Statistics, 40(4), 1968–1972.
Acknowledgments
Data from visual area V1 were collected by Matthew A. Smith, Adam Kohn, and Ryan Kelly in the Kohn laboratory at Albert Einstein College of Medicine, and are available from CRCNS at http://crcns.org/data-sets/vc/pvc-11. We are grateful to Adam Kohn and Tai Sing Lee for research support. Data from visual area V4 were collected in the Smith laboratory at the University of Pittsburgh. We are grateful to Samantha Schmitt for assistance with data collection. Giuseppe Vinci was supported by the National Institutes of Health (NIH R90DA023426) and by the Rice Academy Postdoctoral Fellowship. Robert E. Kass and Valérie Ventura were supported by the National Institute of Mental Health (NIH R01MH064537). Matthew A. Smith was supported by the National Institutes of Health (NIH R01EY022928 and P30EY008098), Research to Prevent Blindness, and the Eye and Ear Foundation of Pittsburgh. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Ethics declarations
Conflict of interests
The authors declare that they have no conflict of interest.
Additional information
Action Editor: Uri Eden
Appendix
1.1 SAGlasso algorithm
There are several ways to build the weight matrix Q of SAGlasso. We used Gamma regression, as described in Algorithm 1, which can be implemented efficiently with standard statistical software, e.g. the R function glm (Dobson and Barnett 2008; Hastie and Pregibon 1992; McCullagh and Nelder 1989; Venables and Ripley 2002) and the R packages mgcv (Wood 2011) and gam (Hastie and Tibshirani 1990). Note that in Eq. (3), Q is typically estimated by the square roots of the absolute entries of the inverse sample covariance matrix. In SAGlasso, we observed slightly better performance without this transformation.
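The idea can be sketched in a few lines. The following is an illustrative Python sketch, not the paper's Algorithm 1: the spike counts, electrode coordinates, and the log-linear least-squares fit (a crude stand-in for a Gamma regression with log link) are all assumptions made for the example.

```python
import numpy as np

# Illustrative sketch (not the paper's Algorithm 1): build a weight matrix Q
# from the inverse sample covariance, then model how edge weights depend on
# inter-electrode distance. The Gamma regression with log link is replaced
# here by a least-squares fit on log weights for self-containedness.
rng = np.random.default_rng(1)
n, d = 500, 20
X = rng.poisson(5.0, size=(n, d)).astype(float)   # stand-in spike counts
coords = rng.uniform(0, 4, size=(d, 2))           # hypothetical electrode positions (mm)

S = np.cov(X.T)
Omega_hat = np.linalg.inv(S)
Q = np.abs(Omega_hat)        # untransformed absolute entries, as in SAGlasso

# Pairwise distances and off-diagonal weights
iu = np.triu_indices(d, k=1)
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)[iu]
w = Q[iu]

# Log-linear fit: E[w] ~ exp(b0 + b1 * dist), a crude Gamma-GLM surrogate
A = np.column_stack([np.ones_like(dist), dist])
b = np.linalg.lstsq(A, np.log(w + 1e-12), rcond=None)[0]
fitted = np.exp(A @ b)       # smoothed, distance-dependent weights
```

In the paper's actual procedure the regression is a proper Gamma GLM (e.g. R's glm with a Gamma family), and the covariates include tuning similarity as well as distance.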
1.2 GAR algorithms
GAR Algorithms 2–4 are derived in Section 2.2, and implemented in our R package “GARggm” available in ModelDB.
In Algorithm 2, \(U\sim \text {InvGaussian}(a,b)\) has p.d.f. \(p(u) = \left (\frac {b}{2\pi u^{3}}\right )^{1/2}\exp \left \{-b(u-a)^{2}/(2a^{2}u)\right \}\). Moreover, given a matrix M, \(M_{ij}\) is the entry in the i-th row and j-th column of M; \(M_{-ij}\) is the j-th column of M without the i-th entry; \(M_{i-j}\) is the i-th row of M without the j-th entry; and \(M_{-i-j}\) is the submatrix obtained by removing the i-th row and the j-th column from M. Algorithms 3 and 4 both produce posterior samples of \({\Omega }\) whose average approximates the posterior mean of \({\Omega }\). The posterior mode of \({\Omega }\) can be obtained by solving Eq. (1) with \(\lambda \Vert {\Omega }\Vert _{1}\) replaced by \(\Vert \hat {\Lambda } \odot {\Omega }\Vert _{1}\), where \(\hat {\Lambda }\) is the estimated penalty matrix from either Algorithm 3 or 4, and \(\odot \) denotes entry-wise matrix multiplication. This optimization can be performed using R functions such as glasso (package glasso, Friedman et al. (2008)) with argument rho set equal to \(2\hat {\Lambda }/n\); see also the R package QUIC (Hsieh et al. 2011). We solve the SPL problem in Eq. (5) with the EM algorithm of Yuan (2012), which involves Glasso, and we impose the GAR penalty matrix \(\hat {\Lambda }\) on S in the maximization step to obtain the GAR-SPL estimate. For \(d\sim 100\), we suggest running the Gibbs samplers for at least \(B = 2000\) iterations, including a burn-in period of 300 iterations. The Gamma regression in step 2b of Algorithm 4 can be implemented either parametrically or nonparametrically using standard statistical software, e.g. the R function glm (Dobson and Barnett 2008; Hastie and Pregibon 1992; McCullagh and Nelder 1989; Venables and Ripley 2002) or the package mgcv (Wood 2011); in the data analyses we used splines (Kass et al. 2014).
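The InvGaussian(a, b) density above is the standard inverse Gaussian (Wald) distribution with mean a and shape b, so the draws needed by the Gibbs sampler are available directly in standard numerical libraries. A minimal Python sketch, checking the known moments \(E[U] = a\) and \(\text{Var}[U] = a^{3}/b\) (the parameter values are arbitrary):

```python
import numpy as np

# InvGaussian(a, b) as defined above is the inverse Gaussian (Wald)
# distribution with mean a and shape b; numpy exposes it as Generator.wald.
rng = np.random.default_rng(42)
a, b = 2.0, 8.0
u = rng.wald(mean=a, scale=b, size=200_000)

# Sanity checks against the known moments: E[U] = a, Var[U] = a^3 / b = 1
print(round(u.mean(), 1))   # close to 2.0
print(round(u.var(), 1))    # close to 1.0
```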
1.3 Computational efficiency of estimators
Table 1 reports the computation times of the graph estimators considered, for \(d = 50, 100\) and \(n = 200, 500\), using the R programming language on a quad-core 2.6 GHz Intel Core i7 CPU with 16 GB of 2133 MHz DDR4 RAM. These times could be improved substantially by using a lower-level language such as C++. Glasso, AGlasso, SPL, and SAGlasso are fitted with tuning-parameter optimization based on ten-fold cross-validation involving 500 random splits over a fine grid of 20 values of the tuning parameter about its optimal value. GAR full Bayes (Algorithm 3; \(K=\lfloor \sqrt {d} \rceil \)) involved \(B = 2000\) iterations, with the Gibbs sampler converging after about 300 iterations. GAR empirical Bayes (Algorithm 4; splines with 3 knots) involved 30 EM iterations, each including 500 Gibbs sampler iterations for the E-step; the efficiency of this method may be improved by replacing the Gibbs sampler with a faster approximation of Eq. (20).
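The cross-validation scheme used for tuning-parameter selection can be illustrated as follows. This is a simplified Python sketch, not the paper's implementation: the ridge-type precision estimate \((S + \rho I)^{-1}\) is a hypothetical stand-in for the penalized estimators, and held-out folds are scored by the Gaussian log-likelihood.

```python
import numpy as np

def gauss_loglik(S_test, Omega):
    # Gaussian log-likelihood (up to constants) of a held-out sample
    # covariance S_test under precision matrix Omega.
    _, logdet = np.linalg.slogdet(Omega)
    return logdet - np.trace(S_test @ Omega)

def cv_select_penalty(X, penalties, n_folds=10, seed=0):
    # K-fold CV over a grid of penalties: for each fold, estimate the
    # precision matrix on the training folds and score the held-out fold.
    n, d = X.shape
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    folds = np.array_split(idx, n_folds)
    scores = np.zeros(len(penalties))
    for test in folds:
        train = np.setdiff1d(idx, test)
        S_tr = np.cov(X[train].T)
        S_te = np.cov(X[test].T)
        for j, rho in enumerate(penalties):
            # Ridge-type stand-in for the penalized precision estimators
            Omega = np.linalg.inv(S_tr + rho * np.eye(d))
            scores[j] += gauss_loglik(S_te, Omega)
    return penalties[int(np.argmax(scores))]

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
grid = np.linspace(0.01, 1.0, 20)
best = cv_select_penalty(X, grid)
```

In the paper's procedure the inner estimate would instead be Glasso (or one of its variants) and the grid would be refined about the optimal value over repeated random splits.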
Cite this article
Vinci, G., Ventura, V., Smith, M.A. et al. Adjusted regularization of cortical covariance. J Comput Neurosci 45, 83–101 (2018). https://doi.org/10.1007/s10827-018-0692-x