Update summarization: building from scratch for Portuguese and comparing to English
Journal of the Brazilian Computer Society, Pub Date: 2018-09-21, DOI: 10.1186/s13173-018-0075-1
Fernando Antônio Asevedo Nóbrega, Thiago Alexandre Salgueiro Pardo

Update summarization aims to automatically produce a summary of a collection of texts for a reader who has already read some previous texts on the subject of interest. It is a challenging task, since it not only inherits the demands of the summarization area (such as producing informative, coherent, and cohesive summaries) but also adds the problem of finding relevant new or updated content. In this paper, we report a comprehensive investigation of update summarization methods for the Portuguese language, for which there are few initiatives. We also propose new methods that combine some summarization strategies and enrich a traditional method with linguistic knowledge (subtopics), producing better results and advancing the state of the art. Moreover, we present a reference dataset for Portuguese, which did not previously exist, and establish an experimental setup for the area in order to foster future research. To confirm some of our summarization results, we run experiments on a well-known benchmark dataset for the English language and show that our methods still perform well.

Updated: 2018-09-21