Application of Supervised Machine Learning to Extract Brain Connectivity Information from Neuroscience Research Articles

Sharma, Ashika; Jayakumar, Jaikishan; Mitra, Partha P.; Chakraborti, Sutanu; Kumar, P. Sreenivasa

doi:10.1007/s12539-021-00443-6

Original research article
Published: 02 June 2021

Interdisciplinary Sciences: Computational Life Sciences, Volume 13, pages 731–750 (2021)

Ashika Sharma¹,², Jaikishan Jayakumar³, Partha P. Mitra⁴, Sutanu Chakraborti² & P. Sreenivasa Kumar²

Abstract

Understanding the complex connectivity structure of the brain is a major challenge in neuroscience. A vast and ever-expanding literature about neuronal connectivity between brain regions already exists in published research articles and databases. However, as the number of published articles and repositories grows, it becomes difficult for a neuroscientist to engage with the breadth and depth of any given field within neuroscience. Natural Language Processing (NLP) techniques can be used to mine ‘Brain Region Connectivity’ information from published articles and build a centralized connectivity resource that gives neuroscience researchers quick access to research findings. Manually curating and continuously updating such a resource involves significant time and effort. This paper presents an application of supervised machine learning algorithms that perform shallow and deep linguistic analysis of text to automatically extract connectivity between brain region mentions. The proposed algorithms are evaluated using benchmark datasets collated from PubMed and our own dataset of full-text articles annotated by a domain expert. We also present a comparison with state-of-the-art methods, including BioBERT. The proposed methods achieve the best recall and \(F_2\) scores without requiring any domain-specific predefined linguistic patterns. Our paper presents a novel effort towards automatically generating interpretable patterns of connectivity for extracting connected brain region mentions from text, and the approach can be extended to other domain-specific information.
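
For reference, the \(F_2\) score is conventionally the \(\beta = 2\) case of the general F-measure. Assuming the paper follows this standard definition, and writing \(P\) for precision and \(R\) for recall (symbols introduced here for illustration only):

\[
F_\beta = \frac{(1 + \beta^2)\, P R}{\beta^2 P + R},
\qquad
F_2 = \frac{5\, P R}{4 P + R}.
\]

Under this definition, \(F_2\) weights recall more heavily than precision, which suits an extraction task where a missed connectivity statement costs more than a false positive that a curator can later discard.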

Data Availability

Models generated on the benchmark corpus WhiteText and the supplementary material, including the datasets, are available at https://github.com/ashika55/BRConnExt.

Notes

http://brainarchitecture.org/text-mining

Funding

Not applicable.

Author information

Authors and Affiliations

Center for Artificial Intelligence and Robotics, DRDO Complex, Bangalore, 560093, India
Ashika Sharma
Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, 600036, India
Ashika Sharma, Sutanu Chakraborti & P. Sreenivasa Kumar
Center for Computational Brain Research, IIT Madras, Chennai, 600036, India
Jaikishan Jayakumar
Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 11724, USA
Partha P. Mitra

Corresponding author

Correspondence to Ashika Sharma.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Code availability

The software application ‘ConnExt’ can be accessed at http://brainarchitecture.org/text-mining.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (CSV 4 kb)

Supplementary file1 (PDF 103 kb)

Supplementary file1 (PDF 76 kb)

Supplementary file1 (PDF 97 kb)

Supplementary file1 (PDF 91 kb)

Supplementary file1 (XLSX 24 kb)

Supplementary file1 (TXT 688 kb)

Supplementary file1 (TXT 152 kb)

Supplementary file1 (TXT 304 kb)

About this article

Cite this article

Sharma, A., Jayakumar, J., Mitra, P.P. et al. Application of Supervised Machine Learning to Extract Brain Connectivity Information from Neuroscience Research Articles. Interdiscip Sci Comput Life Sci 13, 731–750 (2021). https://doi.org/10.1007/s12539-021-00443-6

Received: 14 September 2020
Revised: 15 May 2021
Accepted: 18 May 2021
Published: 02 June 2021
Issue Date: December 2021
DOI: https://doi.org/10.1007/s12539-021-00443-6
