Using prior knowledge in the inference of gene association networks

Applied Intelligence

Abstract

Traditional computational techniques for gene expression data analysis have recently been improved by incorporating prior biological knowledge from open-access repositories. In this work, we propose using prior knowledge as a heuristic in a method for inferring gene-gene associations from gene expression profiles. As the source of prior knowledge we use Gene Ontology, an open-access ontology in which genes are annotated according to their biological function, together with a pairwise Gene-Ontology-based gene measure. We compared the performance of our proposal with benchmark methods for gene network inference: it outperforms them in some cases and obtains similar, competitive results in others, with the added advantage of producing simple and interpretable models, a feature that the European Union has identified as desirable for health-related Artificial Intelligence models.
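As a rough illustration of what a pairwise Gene-Ontology-based measure can look like, the sketch below scores two genes by the Jaccard overlap of their GO term sets and keeps only the pairs above a threshold as candidates for association. This is not the measure or the inference method used in the article, which the abstract does not specify. The annotation table, the GO identifiers used as placeholders, the function names `go_overlap` and `candidate_pairs`, and the threshold value are all assumptions made for the example.

```python
from itertools import combinations

# Hypothetical toy annotations (assumption): gene name -> set of GO term IDs.
# The IDs are placeholders, not real annotations of any gene.
annotations = {
    "geneA": {"GO:0000001", "GO:0000002", "GO:0000003"},
    "geneB": {"GO:0000002", "GO:0000003"},
    "geneC": {"GO:0000009"},
}


def go_overlap(gene_x, gene_y, annotations):
    """Jaccard overlap of the GO term sets of two genes.

    Returns 0.0 when neither gene has any annotation.
    """
    terms_x = annotations.get(gene_x, set())
    terms_y = annotations.get(gene_y, set())
    union = terms_x | terms_y
    if not union:
        return 0.0
    return len(terms_x & terms_y) / len(union)


def candidate_pairs(annotations, threshold):
    """Keep gene pairs whose GO overlap reaches the threshold.

    Used here as a heuristic pre-filter: only the surviving pairs would be
    passed on to an expression-based association test.
    """
    return [
        (x, y)
        for x, y in combinations(sorted(annotations), 2)
        if go_overlap(x, y, annotations) >= threshold
    ]


if __name__ == "__main__":
    print(candidate_pairs(annotations, threshold=0.5))
```

With the toy table above, only the pair sharing two of three terms passes a threshold of 0.5. A set-overlap score is one of the simplest options; measures based on information content or on the ontology graph are common alternatives.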

Notes

  1. RNA: ribonucleic acid

Acknowledgements

We would like to thank the Spanish Ministry of Science and Innovation for financial support under project TIN2017-88209-C2-2-R.

Author information

Corresponding author

Correspondence to Isabel A. Nepomuceno-Chamorro.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nepomuceno-Chamorro, I.A., Nepomuceno, J.A., Galván-Rojas, J.L. et al. Using prior knowledge in the inference of gene association networks. Appl Intell 50, 3882–3893 (2020). https://doi.org/10.1007/s10489-020-01705-4

  • DOI: https://doi.org/10.1007/s10489-020-01705-4
