WebKey: a graph-based method for event detection in web news
Journal of Intelligent Information Systems (IF 2.3), Pub Date: 2019-09-05, DOI: 10.1007/s10844-019-00576-7
Elham Rasouli, Sajjad Zarifzadeh, Amir Jahangard Rafsanjani

With the rapid, large-scale publication of news on the Internet, there is a surge of interest in detecting underlying hot events from online news streams. Event detection faces two main challenges: accuracy and scalability. In this paper, we propose a fast and efficient method for detecting events on news websites. First, we identify bursty terms, which suddenly appear in many news documents. Then, we construct a novel co-occurrence graph between terms, in which nodes and edges are weighted by important features such as click and document frequency within burst intervals. Finally, a weighted community detection algorithm clusters the terms to find events. We also propose a couple of techniques to reduce the size of the graph. Our evaluations show that the proposed method yields much higher precision and recall than past methods, improving their harmonic mean by at least 40%. Moreover, it reduces running time and memory usage by a factor of at least 2.
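
The three stages named in the abstract (bursty terms, co-occurrence graph, term clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, thresholds, and toy data are hypothetical, burstiness is approximated by a simple document-frequency ratio between time windows, edges are weighted by co-document counts only (click frequency is ignored), and connected components over strong edges stand in for the weighted community detection algorithm.

```python
from collections import Counter, defaultdict
from itertools import combinations


def bursty_terms(windows, ratio=2.0, min_df=2):
    """Terms whose document frequency in the last window is at least
    `ratio` times their mean document frequency in earlier windows.
    (Assumed burst criterion, not the one from the paper.)"""
    *history, current = windows
    cur_df = Counter(t for doc in current for t in set(doc))
    past_df = Counter(t for win in history for doc in win for t in set(doc))
    n_hist = max(len(history), 1)
    return {
        t for t, df in cur_df.items()
        if df >= min_df and df >= ratio * (past_df[t] / n_hist)
    }


def cooccurrence_graph(docs, terms):
    """Edge weight = number of documents containing both terms."""
    weights = Counter()
    for doc in docs:
        present = sorted(set(doc) & terms)
        for a, b in combinations(present, 2):
            weights[(a, b)] += 1
    return weights


def cluster_terms(weights, min_weight=2):
    """Stand-in for community detection: connected components over
    edges whose weight is at least `min_weight`."""
    adj = defaultdict(set)
    for (a, b), w in weights.items():
        if w >= min_weight:
            adj[a].add(b)
            adj[b].add(a)
    seen, clusters = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        clusters.append(comp)
    return clusters


# Toy data: each window is a list of tokenized news documents.
windows = [
    [["market", "report"], ["weather", "report"]],  # earlier window
    [  # current window
        ["quake", "rescue", "city"],
        ["quake", "rescue", "report"],
        ["election", "vote", "city"],
        ["election", "vote", "report"],
    ],
]
terms = bursty_terms(windows)
graph = cooccurrence_graph(windows[-1], terms)
events = cluster_terms(graph)
```

On this toy input, "report" is not bursty because it was already frequent in the earlier window, and "city" is bursty but only weakly linked to either group, so thresholding the edges prunes it from the clusters. This pruning is one simple way to shrink the graph and is not necessarily one of the reduction techniques the paper proposes.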

Updated: 2019-09-05