Abstract
Over a decade ago, Lèbre (2009) proposed an inference method, G1DBN, to learn the structure of gene regulatory networks (GRNs) from high-dimensional, sparse time-series gene expression data. The approach is based on the concept of low-order conditional independence graphs, which Lèbre extends to dynamic Bayesian networks (DBNs). The results presented there indicate that the method yields better structural accuracy than the related Lasso and Shrinkage methods, particularly where the data is sparse, that is, where the number of time measurements n is much smaller than the number of genes p. This paper challenges these claims through a careful experimental analysis, showing that the GRNs reverse-engineered from time-series data using the G1DBN approach are less accurate than claimed by Lèbre (2009). We also show that the Lasso method yields higher structural accuracy than the G1DBN method for graphs learned from the simulated data, particularly when the data is sparse (
Acknowledgement
The G1DBN code for the 786-TF study was run on Kay, the primary supercomputer of the Irish Centre for High-End Computing (ICHEC) for academic researchers.
Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
Research funding: None declared.
Conflict of interest statement: The authors declare no conflicts of interest regarding this article.
References
Altman, N., and Krzywinski, M. (2018). The curse(s) of dimensionality. Nat. Methods 15: 399–400, https://doi.org/10.1038/s41592-018-0019-x.
Bernard, A., and Hartemink, A.J. (2005). Informative structure priors: joint learning of dynamic regulatory networks from multiple types of data. In: Biocomputing. World Scientific, Hawaii, USA, pp. 459–470, https://doi.org/10.1142/9789812702456_0044.
Campos, L.M.d. (2006). A scoring function for learning Bayesian networks based on mutual information and conditional independence tests. J. Mach. Learn. Res. 7: 2149–2187.
Chai, L.E., Loh, S.K., Low, S.T., Mohamad, M.S., Deris, S., and Zakaria, Z. (2014). A review on the computational approaches for gene regulatory network construction. Comput. Biol. Med. 48: 55–65, https://doi.org/10.1016/j.compbiomed.2014.02.011.
Charbonnier, C., Chiquet, J., and Ambroise, C. (2010). Weighted-LASSO for structured network inference from time course data. Stat. Appl. Genet. Mol. Biol. 9, https://doi.org/10.2202/1544-6115.1519.
Chaturvedi, I., and Rajapakse, J.C. (2010). Building gene networks with time-delayed regulations. Pattern Recogn. Lett. 31: 2133–2137, https://doi.org/10.1016/j.patrec.2010.03.002.
Cho, K.-H., Choo, S.-M., Jung, S., Kim, J.-R., Choi, H.-S., and Kim, J. (2007). Reverse engineering of gene regulatory networks. IET Syst. Biol. 1: 149–163, https://doi.org/10.1049/iet-syb:20060075.
Csala, A., Voorbraak, F.P., Zwinderman, A.H., and Hof, M.H. (2017). Sparse redundancy analysis of high-dimensional genetic and genomic data. Bioinformatics 33: 3228–3234, https://doi.org/10.1093/bioinformatics/btx374.
De Campos, C.P., and Ji, Q. (2011). Efficient structure learning of Bayesian networks using constraints. J. Mach. Learn. Res. 12: 663–689.
Delgado, F.M., and Gómez-Vela, F. (2019). Computational methods for gene regulatory networks reconstruction and analysis: a review. Artif. Intell. Med. 95: 133–145, https://doi.org/10.1016/j.artmed.2018.10.006.
D’haeseleer, P., Wen, X., Fuhrman, S., and Somogyi, R. (1999). Linear modeling of mRNA expression levels during CNS development and injury. In: Biocomputing’99. World Scientific, Hawaii, USA, pp. 41–52, https://doi.org/10.1142/9789814447300_0005.
Dojer, N., Gambin, A., Mizera, A., Wilczyński, B., and Tiuryn, J. (2006). Applying dynamic Bayesian networks to perturbed gene expression data. BMC Bioinf. 7, https://doi.org/10.1186/1471-2105-7-249.
Dondelinger, F., Lèbre, S., and Husmeier, D. (2010). Heterogeneous continuous dynamic Bayesian networks with flexible structure and inter-time segment information sharing. In: Proceedings of the 27th international conference on machine learning. Omnipress, Haifa, Israel, pp. 303–310.
Dondelinger, F., Lèbre, S., and Husmeier, D. (2013). Non-homogeneous dynamic Bayesian networks with Bayesian regularization for inferring gene regulatory networks with gradually time-varying structure. Mach. Learn. 90: 191–230, https://doi.org/10.1007/s10994-012-5311-x.
Dong, X., Yambartsev, A., Ramsey, S.A., Thomas, L.D., Shulzhenko, N., and Morgun, A. (2015). Reverse enGENEering of regulatory networks from big data: a roadmap for biologists. Bioinf. Biol. Insights 9, BBI–S12467, https://doi.org/10.4137/bbi.s12467.
Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). Least angle regression. Ann. Stat. 32: 407–499, https://doi.org/10.1214/009053604000000067.
Ekstrøm, C.T. (2020). MESS: miscellaneous esoteric statistical scripts, R package version 0.5.7.
Enright, C.G., and Madden, M.G. (2015). Modelling and monitoring the individual patient in real time. Springer, Cham, Switzerland, pp. 107–136, https://doi.org/10.1007/978-3-319-28007-3_7.
Friedman, N., Linial, M., Nachman, I., and Pe’er, D. (2004). Using Bayesian networks to analyze expression data. J. Comput. Biol. 7: 601–620, https://doi.org/10.1145/332306.332355.
Friedman, N., Murphy, K.P., and Russell, S.J. (1998). Learning the structure of dynamic probabilistic networks. In: Cooper, G. F. and Moral, S. (Eds.), UAI ’98: proceedings of the fourteenth conference on uncertainty in artificial intelligence. University of Wisconsin Business School, Madison, Wisconsin, USA, pp. 139–147. Morgan Kaufmann.
Gardner, T.S., Di Bernardo, D., Lorenz, D., and Collins, J.J. (2003). Inferring genetic networks and identifying compound mode of action via expression profiling. Science 301: 102–105, https://doi.org/10.1126/science.1081900.
Grzegorczyk, M., and Husmeier, D. (2009). Non-stationary continuous dynamic Bayesian networks. In: Advances in neural information processing systems. Curran Associates, Inc., Vancouver, Canada, pp. 682–690.
Grzegorczyk, M., and Husmeier, D. (2011). Non-homogeneous dynamic Bayesian networks for continuous data. Mach. Learn. 83: 355–419, https://doi.org/10.1007/s10994-010-5230-7.
Grzegorczyk, M., and Husmeier, D. (2012). A non-homogeneous dynamic Bayesian network with sequentially coupled interaction parameters for applications in systems and synthetic biology. Stat. Appl. Genet. Mol. Biol. 11, https://doi.org/10.1515/1544-6115.1761.
Halbersberg, D., and Lerner, B. (2020). Local to global learning of a latent dynamic Bayesian network. In: 24th European conference on artificial intelligence - ECAI 2020. IOS Press, Santiago de Compostela, Spain.
Hartemink, A.J. (2005). Reverse engineering gene regulatory networks. Nat. Biotechnol. 23: 554–555, https://doi.org/10.1038/nbt0505-554.
Hastie, T., and Efron, B. (2013). Least angle regression, Lasso and forward stagewise, R package version 1.2.
Heckerman, D., Geiger, D., and Chickering, D.M. (1995). Learning Bayesian networks: the combination of knowledge and statistical data. Mach. Learn. 20: 197–243, https://doi.org/10.1007/bf00994016.
Hill, S.M., Lu, Y., Molina, J., Heiser, L.M., Spellman, P.T., Speed, T.P., Gray, J.W., Mills, G.B., and Mukherjee, S. (2012). Bayesian inference of signaling network topology in a cancer cell line. Bioinformatics 28: 2804–2810, https://doi.org/10.1093/bioinformatics/bts514.
Hurd, P.J., and Nelson, C.J. (2009). Advantages of next-generation sequencing versus the microarray in epigenetic research. Briefings Funct. Genomics Proteomics 8: 174–183, https://doi.org/10.1093/bfgp/elp013.
Husmeier, D., Dondelinger, F., and Lèbre, S. (2010). Inter-time segment information sharing for non-homogeneous dynamic Bayesian networks. In: Advances in neural information processing systems. Curran Associates, Inc., Vancouver, Canada, pp. 901–909.
Iglesias-Martinez, L.F., Kolch, W., and Santra, T. (2016). BGRMI: a method for inferring gene regulatory networks from time-course gene expression data and its application in breast cancer research. Sci. Rep. 6: 37140, https://doi.org/10.1038/srep37140.
Imoto, S., Higuchi, T., Goto, T., Tashiro, K., Kuhara, S., and Miyano, S. (2004). Combining microarrays and biological knowledge for estimating gene networks via Bayesian networks. J. Bioinf. Comput. Biol. 2: 77–98, https://doi.org/10.1142/s021972000400048x.
Jassal, B., Matthews, L., Viteri, G., Gong, C., Lorente, P., Fabregat, A., Sidiropoulos, K., Cook, J., Gillespie, M., Haw, R., et al. (2019). The reactome pathway knowledgebase. Nucleic Acids Res. 48: D498–D503, https://doi.org/10.1093/nar/gkz1031.
Kanehisa, M., and Goto, S. (2000). KEGG: Kyoto Encyclopedia of genes and genomes. Nucleic Acids Res. 28: 27–30, https://doi.org/10.1093/nar/28.1.27.
Kim, S.Y., Imoto, S., and Miyano, S. (2003). Inferring gene networks from time series microarray data using dynamic Bayesian networks. Briefings Bioinf. 4: 228–235, https://doi.org/10.1093/bib/4.3.228.
Kim, S., Imoto, S., and Miyano, S. (2004). Dynamic Bayesian network and nonparametric regression for nonlinear modeling of gene networks from time series gene expression data. Biosystems 75: 57–65, https://doi.org/10.1016/j.biosystems.2004.03.004.
Koivisto, M., and Sood, K. (2004). Exact Bayesian structure discovery in Bayesian networks. J. Mach. Learn. Res. 5: 549–573.
Koranda, M., Schleiffer, A., Endler, L., and Ammerer, G. (2000). Forkhead-like transcription factors recruit ndd1 to the chromatin of g2/m-specific promoters. Nature 406: 94, https://doi.org/10.1038/35017589.
Lähdesmäki, H., and Shmulevich, I. (2008). Learning the structure of dynamic Bayesian networks from time series and steady state measurements. Mach. Learn. 71: 185–217, https://doi.org/10.1007/s10994-008-5053-y.
Lähdesmäki, H., Shmulevich, I., and Yli-Harja, O. (2003). On learning gene regulatory networks under the Boolean network model. Mach. Learn. 52: 147–167, https://doi.org/10.1023/a:1023905711304.
Lèbre, S. (2009). Inferring dynamic genetic networks with low order independencies. Stat. Appl. Genet. Mol. Biol. 8, https://doi.org/10.2202/1544-6115.1294.
Lèbre, S., Becq, J., Devaux, F., Stumpf, M.P., and Lelandais, G. (2010). Statistical inference of the time-varying structure of gene-regulation networks. BMC Syst. Biol. 4: 130, https://doi.org/10.1186/1752-0509-4-130.
Lèbre, S., and Chiquet, J. (2012). G1DBN: a package performing dynamic Bayesian network inference, R package version 3.1.1.
Lèbre, S., Dondelinger, F., and Husmeier, D. (2012). Nonhomogeneous dynamic Bayesian networks in systems biology. In: Next generation microarray bioinformatics. Springer, Clifton, USA, pp. 199–213, https://doi.org/10.1007/978-1-61779-400-1_13.
Li, Z., Li, P., Krishnan, A., and Liu, J. (2011). Large-scale dynamic gene regulatory network inference combining differential equation models with local dynamic Bayesian network analysis. Bioinformatics 27: 2686–2691, https://doi.org/10.1093/bioinformatics/btr454.
Li, Y., and Ngom, A. (2013). The max-min high-order dynamic Bayesian network learning for identifying gene regulatory networks from time-series microarray data. In: 2013 IEEE symposium on computational intelligence in bioinformatics and computational biology (CIBCB). IEEE, Singapore, pp. 83–90, https://doi.org/10.1109/CIBCB.2013.6595392.
Liu, C., Jiang, J., Gu, J., Yu, Z., Wang, T., and Lu, H. (2016). High-dimensional omics data analysis using a variable screening protocol with prior knowledge integration (SKI). BMC Syst. Biol. 10: 118, https://doi.org/10.1186/s12918-016-0358-0.
Ma, S., Kemmeren, P., Gresham, D., and Statnikov, A. (2014). De-novo learning of genome-scale regulatory networks in s. cerevisiae. PloS One 9: e106479, https://doi.org/10.1371/journal.pone.0106479.
Margaritis, D., and Thrun, S. (2000). Bayesian network induction via local neighborhoods. In: Advances in neural information processing systems. The MIT Press, Denver, USA, pp. 505–511.
Margolin, A.A., Nemenman, I., Basso, K., Wiggins, C., Stolovitzky, G., Dalla Favera, R., and Califano, A. (2006). ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinf. 7: S7, https://doi.org/10.1186/1471-2105-7-s1-s7.
Mihajlovic, V., and Petkovic, M. (2001). Dynamic Bayesian networks: a state of the art. University of Twente, Enschede, the Netherlands.
Murphy, K., and Mian, S. (1999). Modelling gene expression data using dynamic Bayesian networks, Technical report. Computer Science Division, University of California, Berkeley.
Opgen-Rhein, R., and Strimmer, K. (2007). Learning causal networks from systems biology time course data: an effective model selection procedure for the vector autoregressive process. BMC Bioinf. 8: S3, https://doi.org/10.1186/1471-2105-8-s2-s3.
Pearl, J. (2014). Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, Burlington, USA.
Peña, J.M., Björkegren, J., and Tegnér, J. (2005). Learning dynamic Bayesian network models via cross-validation. Pattern Recogn. Lett. 26: 2295–2308, https://doi.org/10.1016/j.patrec.2005.04.005.
Perrin, B.-E., Ralaivola, L., Mazurie, A., Bottani, S., Mallet, J., and d’Alche Buc, F. (2003). Gene networks inference using dynamic Bayesian networks. Bioinformatics 19: ii138-ii148, https://doi.org/10.1093/bioinformatics/btg1071.
Pirgazi, J., and Khanteymoori, A.R. (2018). A robust gene regulatory network inference method base on Kalman filter and linear regression. PloS One 13: e0200094, https://doi.org/10.1371/journal.pone.0200094.
Rajapakse, J.C., and Chaturvedi, I. (2010). Gene regulatory networks with variable-order dynamic Bayesian networks. In: The 2010 international joint conference on neural networks (IJCNN). IEEE, Barcelona, Spain, pp. 1–5, https://doi.org/10.1109/IJCNN.2010.5596380.
Robinson, J.W., and Hartemink, A.J. (2009). Non-stationary dynamic Bayesian networks. In: Advances in neural information processing systems. Curran Associates, Inc., Vancouver, Canada, pp. 1369–1376.
Robinson, J.W., Hartemink, A.J., and Ghahramani, Z. (2010). Learning non-stationary dynamic Bayesian networks. J. Mach. Learn. Res. 11.
Schrynemackers, M., Küffner, R., and Geurts, P. (2013). On protocols and measures for the validation of supervised methods for the inference of biological networks. Front. Genet. 4: 262, https://doi.org/10.3389/fgene.2013.00262.
Shafiee Kamalabad, M., and Grzegorczyk, M. (2019). Non-homogeneous dynamic Bayesian networks with edge-wise sequentially coupled parameters. Bioinformatics 36: 1198–1207, https://doi.org/10.1093/bioinformatics/btz690.
Shermin, A., and Orgun, M.A. (2009). Using dynamic Bayesian networks to infer gene regulatory networks from expression profiles. In: Proceedings of the 2009 ACM symposium on applied computing. Association for Computing Machinery, New York, NY, United States, Honolulu, Hawaii, USA, pp. 799–803, https://doi.org/10.1145/1529282.1529449.
Shmelkov, E., Tang, Z., Aifantis, I., and Statnikov, A. (2011). Assessing quality and completeness of human transcriptional regulatory pathways on a genome-wide scale. Biol. Direct 6: 15, https://doi.org/10.1186/1745-6150-6-15.
Song, L., Kolar, M., and Xing, E.P. (2009). Time-varying dynamic Bayesian networks. In: Advances in neural information processing systems. Curran Associates, Inc., Vancouver, Canada, pp. 1732–1740.
Spellman, P.T., Sherlock, G., Zhang, M.Q., Iyer, V.R., Anders, K., Eisen, M.B., Brown, P.O., Botstein, D., and Futcher, B. (1998). Comprehensive identification of cell cycle–regulated genes of the yeast saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell 9: 3273–3297, https://doi.org/10.1091/mbc.9.12.3273.
Tastan, O., Qi, Y., Carbonell, J.G., and Klein-Seetharaman, J. (2009). Prediction of interactions between hiv-1 and human proteins by information integration. In: Biocomputing 2009. World Scientific, Hawaii, USA, pp. 516–527.
Teixeira, M.C., Monteiro, P., Jain, P., Tenreiro, S., Fernandes, A.R., Mira, N.P., Alenquer, M., Freitas, A.T., Oliveira, A.L., and Sa-Correia, I. (2006). The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Res. 34: D446–D451, https://doi.org/10.1093/nar/gkj013.
Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. J. Roy. Stat. Soc. B 58: 267–288, https://doi.org/10.1111/j.2517-6161.1996.tb02080.x.
Tsai, M.-J., Wang, J.-R., Ho, S.-J., Shu, L.-S., Huang, W.-L., and Ho, S.-Y. (2020). GREMA: modelling of emulated gene regulatory networks with confidence levels based on evolutionary intelligence to cope with the underdetermined problem. Bioinformatics 36: 3833–3840, https://doi.org/10.1109/icce-taiwan49838.2020.9258351.
Tucker, A., Liu, X., and Ogden-Swift, A. (2001). Evolutionary learning of dynamic probabilistic models with large time lags. Int. J. Intell. Syst. 16: 621–645, https://doi.org/10.1002/int.1027.
Vinh, N.X., Chetty, M., Coppel, R., and Wangikar, P.P. (2011). GlobalMIT: learning globally optimal dynamic Bayesian network with the mutual information test criterion. Bioinformatics 27: 2765–2766, https://doi.org/10.1093/bioinformatics/btr457.
Vinh, N.X., Chetty, M., Coppel, R., and Wangikar, P.P. (2012a). Gene regulatory network modeling via global optimization of high-order dynamic Bayesian network. BMC Bioinf. 13: 131, https://doi.org/10.1186/1471-2105-13-131.
Vinh, N.X., Chetty, M., Coppel, R., and Wangikar, P.P. (2012b). Local and global algorithms for learning dynamic Bayesian networks. In: 2012 IEEE 12th international conference on data mining. IEEE, Brussels, pp. 685–694, https://doi.org/10.1109/ICDM.2012.18.
Vohradsky, J. (2001). Neural network model of gene expression. Faseb. J. 15: 846–854, https://doi.org/10.1096/fj.00-0361com.
Wexler, E.M., Rosen, E., Lu, D., Osborn, G.E., Martin, E., Raybould, H., and Geschwind, D.H. (2011). Genome-wide analysis of a Wnt1-regulated transcriptional network implicates neurodegenerative pathways. Sci. Signal. 4: ra65–ra65, https://doi.org/10.1126/scisignal.2002282.
Wille, A., and Bühlmann, P. (2006). Low-order conditional independence graphs for inferring genetic networks. Stat. Appl. Genet. Mol. Biol. 5, https://doi.org/10.2202/1544-6115.1170.
Xing, L., Guo, M., Liu, X., Wang, C., and Zhang, L. (2018). Gene regulatory networks reconstruction using the flooding-pruning hill-climbing algorithm. Genes 9: 342, https://doi.org/10.3390/genes9070342.
Xing, Z., and Wu, D. (2006). Modeling multiple time units delayed gene regulatory network using dynamic Bayesian network. In: Sixth IEEE international conference on data mining-workshops (ICDMW’06). IEEE, Hong Kong, pp. 190–195, https://doi.org/10.1109/ICDMW.2006.120.
Xu, C., and Jackson, S.A. (2019). Machine learning and complex biological data. Genome Biol. 20, https://doi.org/10.1186/s13059-019-1689-0.
Zhang, Y., Deng, Z., Jiang, H., and Jia, P. (2006). Dynamic Bayesian network (DBN) with structure expectation maximization (SEM) for modeling of gene network from time series gene expression data. In: BIOCOMP. CSREA Press, Las Vegas, USA, pp. 41–47.
Zhang, Y., Deng, Z., Jiang, H., and Jia, P. (2007). Inferring gene regulatory networks from multiple data sources via a dynamic Bayesian network with structural EM. In: Cohen-Boulakia, S. and Tannen, V. (Eds.), Data integration in the life sciences, pp. 204–214. Springer Berlin Heidelberg, Berlin, Heidelberg, https://doi.org/10.1007/978-3-540-73255-6_17.
Zou, M., and Conzen, S.D. (2004). A new dynamic Bayesian network (DBN) approach for identifying gene regulatory networks from time course microarray data. Bioinformatics 21: 71–79, https://doi.org/10.1093/bioinformatics/bth463.
Zuo, Y., Yu, G., Tadesse, M.G., and Ressom, H.W. (2014). Biological network inference using low order partial correlation. Methods 69: 266–273, https://doi.org/10.1016/j.ymeth.2014.06.010.
Supplementary Material
The online version of this article offers supplementary material (https://doi.org/10.1515/sagmb-2020-0051).
© 2020 Walter de Gruyter GmbH, Berlin/Boston