Quantifying semantic similarity of clinical evidence in the biomedical literature to facilitate related evidence synthesis.
Journal of Biomedical Informatics (IF 4.5), Pub Date: 2019-10-30, DOI: 10.1016/j.jbi.2019.103321
Hamed Hassanzadeh 1 , Anthony Nguyen 1 , Karin Verspoor 2

Objective

Published clinical trials and high-quality peer-reviewed medical publications are considered the main sources of evidence for synthesizing systematic reviews and practicing Evidence Based Medicine (EBM). Finding all relevant published evidence for a particular medical case is a time- and labour-intensive task, given the breadth of the biomedical literature. Automatically quantifying the conceptual relationships between key clinical evidence within and across publications, despite variations in how clinically relevant concepts are expressed, can help to facilitate evidence synthesis. In this study, we aim to expedite evidence synthesis by quantifying the semantic similarity of key evidence expressed as individual sentences. Such semantic textual similarity can serve as a key approach for supporting the selection of related studies.

Material and methods

We propose a generalisable approach for quantifying semantic similarity of clinical evidence in the biomedical literature, specifically considering the similarity of sentences that correspond to a given type of evidence, such as clinical interventions, population information, or clinical findings. We develop three sets of similarity measures (generic, ontology-based, and vector-space) that use a variety of lexical, conceptual, and contextual information to quantify the similarity of full sentences containing clinical evidence. To understand the impact of the different similarity measures on overall evidence similarity quantification, we provide a comparative analysis of these measures when used as input to an unsupervised linear interpolation and to a supervised regression ensemble. To provide a reliable test-bed for this experiment, we generate a dataset of 1000 sentence pairs from biomedical publications, annotated by ten human experts. We also extend the experiments to an external dataset to further test generalisability.
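The unsupervised linear interpolation mentioned above can be illustrated with a minimal sketch. The abstract does not say which individual measures or weights the authors used, so the two measures here (Jaccard token overlap as a lexical measure, bag-of-words cosine as a simple vector-space measure), the uniform default weights, and all function names are assumptions chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def jaccard_similarity(a, b):
    """Illustrative lexical measure: shared tokens divided by all distinct tokens."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 1.0  # two empty sentences are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def cosine_similarity(a, b):
    """Illustrative vector-space measure: cosine of bag-of-words count vectors."""
    vec_a, vec_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(count * vec_b[token] for token, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def interpolate(scores, weights=None):
    """Linear interpolation: weighted sum of scores with weights normalised to 1."""
    if weights is None:
        weights = [1.0] * len(scores)  # assumption: uniform weights by default
    if len(weights) != len(scores):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total


def sentence_similarity(a, b, weights=None):
    """Combine the individual measures into one similarity score in [0, 1]."""
    measures = [jaccard_similarity(a, b), cosine_similarity(a, b)]
    return interpolate(measures, weights)


if __name__ == "__main__":
    s1 = "Patients received 100 mg of aspirin daily."
    s2 = "Participants were given aspirin 100 mg once a day."
    print(round(sentence_similarity(s1, s2), 3))
```

A supervised variant would replace `interpolate` with a regression model fitted to the human-annotated scores, so that the weights are learned rather than fixed in advance.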

Results

The combination of all the diverse similarity measures correlated more strongly with the gold-standard similarity scores in the dataset than any individual kind of measure. With the devised similarity measures, our approach reached an average Pearson correlation of nearly 0.80 across the different clinical evidence types. Although the measures were more effective in combination, individual generic and vector-space measures also yielded strong similarity quantification in both the unsupervised and the supervised models. On the external dataset, our similarity measures were highly competitive with state-of-the-art approaches developed and trained specifically on that dataset for predicting semantic similarity.
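For readers unfamiliar with the evaluation metric, the following sketch shows how a Pearson correlation between predicted and gold-standard similarity scores can be computed and then averaged over evidence types. The function names, the evidence-type labels, and the numbers in the usage example are hypothetical and are not taken from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def mean_correlation_by_type(scores_by_type):
    """Average the per-type Pearson correlations.

    scores_by_type maps an evidence-type label to a pair
    (predicted_scores, gold_scores).
    """
    correlations = [pearson(pred, gold) for pred, gold in scores_by_type.values()]
    return sum(correlations) / len(correlations)


if __name__ == "__main__":
    # Made-up scores for two hypothetical evidence types.
    example = {
        "intervention": ([0.9, 0.4, 0.1, 0.7], [4.5, 2.0, 0.5, 3.0]),
        "population": ([0.2, 0.8, 0.5, 0.3], [1.0, 4.0, 3.0, 1.5]),
    }
    print(round(mean_correlation_by_type(example), 3))
```

Averaging per-type correlations, rather than pooling all sentence pairs, keeps one evidence type with many pairs from dominating the reported figure.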

Conclusion

Experimental results showed that the proposed semantic similarity quantification approach can effectively identify related clinical evidence that is reported in the literature. The comparison with a state-of-the-art method demonstrated the effectiveness of the approach, and experiments with an external dataset support its generalisability.




Updated: 2019-10-30