Application of Supervised Machine Learning to Extract Brain Connectivity Information from Neuroscience Research Articles,Interdisciplinary Sciences: Computational Life Sciences

Application of Supervised Machine Learning to Extract Brain Connectivity Information from Neuroscience Research Articles
Interdisciplinary Sciences: Computational Life Sciences ( IF 3.9 ) Pub Date : 2021-06-02 , DOI: 10.1007/s12539-021-00443-6
Ashika Sharma ₁,₂ , Jaikishan Jayakumar ₃ , Partha P Mitra ₄ , Sutanu Chakraborti ₂ , P Sreenivasa Kumar ₂

Abstract

Understanding the complex connectivity structure of the brain is a major challenge in neuroscience. A vast and ever-expanding literature about neuronal connectivity between brain regions already exists in published research articles and databases. However, as the number of published articles and repositories keeps growing, it becomes difficult for a neuroscientist to engage with the breadth and depth of any given field within neuroscience. Natural Language Processing (NLP) techniques can be used to mine ‘Brain Region Connectivity’ information from published articles and build a centralized connectivity resource that gives neuroscience researchers quick access to research findings. Manually curating and continuously updating such a resource involves significant time and effort. This paper presents an application of supervised machine learning algorithms that perform shallow and deep linguistic analysis of text to automatically extract connectivity between brain region mentions. The proposed algorithms are evaluated using benchmark datasets collated from PubMed and our own dataset of full-text articles annotated by a domain expert. We also present a comparison with state-of-the-art methods, including BioBERT. The proposed methods achieve the best recall and \(F_2\) scores, negating the need for any domain-specific predefined linguistic patterns. Our paper presents a novel effort towards automatically generating interpretable patterns of connectivity for extracting connected brain region mentions from text, and the approach can be expanded to include any other domain-specific information.
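
For readers unfamiliar with the \(F_2\) score mentioned in the abstract: it is the standard \(F_\beta\) measure with \(\beta = 2\), which weights recall more heavily than precision. The sketch below is only an illustration of that standard definition; the function name `f_beta` and the example precision and recall values are assumptions chosen for demonstration and are not results from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F_beta score: a weighted harmonic mean of precision and recall.

    beta > 1 favours recall; beta = 2 gives the F_2 score.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero, so the score is defined as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Illustrative values only (not taken from the paper).
example_precision, example_recall = 0.5, 0.8
f2 = f_beta(example_precision, example_recall)            # beta = 2, recall-weighted
f1 = f_beta(example_precision, example_recall, beta=1.0)  # balanced F_1 for comparison
print(f"F2 = {f2:.3f}, F1 = {f1:.3f}")
```

With recall higher than precision, as in this illustrative case, \(F_2\) comes out above \(F_1\), which is why it suits extraction tasks where missing a true connection is costlier than reporting a spurious one.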

Updated: 2021-06-02
