CIBS: A biomedical text summarizer using topic-based sentence clustering.
Journal of Biomedical Informatics (IF 4.0), Pub Date: 2018-11-18, DOI: 10.1016/j.jbi.2018.11.006
Milad Moradi

Automatic text summarizers can reduce the time required to read lengthy text documents by extracting the most important parts. Multi-document summarizers should produce a summary that covers the main topics of multiple related input texts to diminish the extent of redundant information. In this paper, we propose a novel summarization method named Clustering and Itemset mining based Biomedical Summarizer (CIBS). The summarizer extracts biomedical concepts from the input documents and employs an itemset mining algorithm to discover main topics. Then, it applies a clustering algorithm to put the sentences into clusters such that those in the same cluster share similar topics. Selecting sentences from all the clusters, the summarizer can produce a summary that covers a wide range of topics of the input text. Using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) toolkit, we evaluate the performance of the CIBS method against four summarizers including a state-of-the-art method. The results show that the CIBS method can improve the performance of single- and multi-document biomedical text summarization. It is shown that the topic-based sentence clustering approach can be effectively used to increase the informative content of summaries, as well as to decrease the redundant information.
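
The pipeline described in the abstract (concepts, then itemset-based topics, then sentence clusters, then one pick per cluster) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each sentence has already been mapped to a set of concept labels (the concept extraction step is skipped), uses a brute-force frequent-itemset search as a stand-in for a real itemset mining algorithm, and uses a simple greedy Jaccard-based clustering. All function names, the toy concept labels, and the `min_support` and `threshold` parameters are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force frequent itemset search (assumed stand-in for an
    itemset mining algorithm). Returns itemsets of up to max_size items
    that occur in at least min_support sentences."""
    items = sorted(set().union(*transactions))
    result = []
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(1 for t in transactions if cset <= t)
            if support >= min_support:
                result.append(cset)
    return result


def topic_signature(concepts, itemsets):
    """Indices of the frequent itemsets (topics) a sentence fully covers."""
    return frozenset(i for i, s in enumerate(itemsets) if s <= concepts)


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_sentences(signatures, threshold):
    """Greedy single-pass clustering (an assumption, not the paper's
    algorithm): a sentence joins the first cluster whose first member
    is similar enough, otherwise it starts a new cluster."""
    clusters = []
    for idx, sig in enumerate(signatures):
        for cluster in clusters:
            if jaccard(sig, signatures[cluster[0]]) >= threshold:
                cluster.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters


def summarize(sentence_concepts, min_support=2, threshold=0.5):
    """Return indices of one sentence per cluster: the one covering the
    most topics (ties go to the earliest sentence)."""
    itemsets = frequent_itemsets(sentence_concepts, min_support)
    sigs = [topic_signature(c, itemsets) for c in sentence_concepts]
    clusters = cluster_sentences(sigs, threshold)
    return sorted(max(c, key=lambda i: len(sigs[i])) for c in clusters)


# Toy input: hand-assigned concept sets, one per sentence (hypothetical).
sentence_concepts = [
    {"diabetes", "insulin"},
    {"diabetes", "insulin", "glucose"},
    {"hypertension", "stroke"},
    {"hypertension", "stroke", "aspirin"},
]

print(summarize(sentence_concepts))  # prints [0, 2]
```

In this toy input the first two sentences share the diabetes/insulin topics and the last two share the hypertension/stroke topics, so picking one sentence from each cluster covers both topics without repeating either.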

Updated: 2018-11-13