Dynamic Spatio-Temporal Tweet Mining for Event Detection: A Case Study of Hurricane Florence
International Journal of Disaster Risk Science (IF 4), Pub Date: 2020-05-26, DOI: 10.1007/s13753-020-00280-z
Mahdi Farnaghi, Zeinab Ghaemi, Ali Mansourian

Extracting information about emerging events in large study areas through spatiotemporal and textual analysis of geotagged tweets makes it possible to monitor the current state of a disaster. This study proposes dynamic spatio-temporal tweet mining as a method for dynamic event extraction from geotagged tweets in large study areas. It introduces a modified version of the ordering points to identify the clustering structure (OPTICS) algorithm to address the intrinsic heterogeneity of Twitter data. To calculate textual similarity precisely, three state-of-the-art text embedding methods (Word2vec, GloVe, and FastText) were used to capture both syntactic and semantic similarities. The impact of the selected embedding algorithm on the quality of the outputs was studied. Different combinations of spatial and temporal distances with the textual similarity measure were investigated to improve the event detection outcomes. The proposed method was applied to a case study of Hurricane Florence (2018). The method was able to precisely identify events of varied sizes and densities before, during, and after the hurricane. The feasibility of the proposed method was quantitatively evaluated using the Silhouette coefficient and qualitatively discussed. The proposed method was also compared to an implementation based on the standard density-based spatial clustering of applications with noise (DBSCAN) algorithm, where it showed more promising results.
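
The abstract does not state how the spatial, temporal, and textual measures are combined, so the following is only a minimal sketch of one plausible combination, not the authors' formulation. It assumes each tweet is a dict with hypothetical keys `lat`, `lon`, `t` (hours), and `vec` (a text embedding), normalises each component to [0, 1] using assumed caps (`max_km`, `max_hours`), and takes a weighted sum. It also includes a plain Silhouette coefficient over a precomputed distance matrix, the kind of score the abstract mentions for evaluation.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 1.0
    return 1.0 - dot / (nu * nv)

def combined_distance(a, b, max_km=50.0, max_hours=6.0, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of normalised spatial, temporal, and textual distances.

    max_km, max_hours, and weights are illustrative assumptions, not values
    taken from the paper. Each component is clipped to [0, 1].
    """
    d_space = min(haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]) / max_km, 1.0)
    d_time = min(abs(a["t"] - b["t"]) / max_hours, 1.0)
    d_text = min(max(cosine_distance(a["vec"], b["vec"]), 0.0), 1.0)
    w_space, w_time, w_text = weights
    return w_space * d_space + w_time * d_time + w_text * d_text

def distance_matrix(tweets, **kwargs):
    """Full pairwise matrix of combined distances."""
    return [[combined_distance(a, b, **kwargs) for b in tweets] for a in tweets]

def silhouette(dist, labels):
    """Mean Silhouette coefficient; points labelled -1 (noise) are skipped."""
    clusters = sorted(set(labels) - {-1})
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    scores = []
    for i, li in enumerate(labels):
        if li == -1:
            continue
        same = [dist[i][j] for j, lj in enumerate(labels) if lj == li and j != i]
        if not same:  # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = sum(same) / len(same)  # mean distance within own cluster
        b = min(                   # mean distance to the nearest other cluster
            sum(dist[i][j] for j, lj in enumerate(labels) if lj == c)
            / sum(1 for lj in labels if lj == c)
            for c in clusters if c != li
        )
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)

# Four made-up tweets: two near the coast early on, two inland a day later.
tweets = [
    {"lat": 34.20, "lon": -77.90, "t": 0.0, "vec": [1.0, 0.0]},
    {"lat": 34.21, "lon": -77.91, "t": 0.5, "vec": [0.9, 0.1]},
    {"lat": 35.80, "lon": -78.60, "t": 30.0, "vec": [0.0, 1.0]},
    {"lat": 35.79, "lon": -78.61, "t": 30.5, "vec": [0.1, 0.9]},
]
dist = distance_matrix(tweets)
score = silhouette(dist, [0, 0, 1, 1])
```

A matrix like `dist` could be passed to a density-based clusterer that accepts precomputed distances (for example scikit-learn's `OPTICS` with `metric="precomputed"`). The labels above are hand-assigned purely to exercise the Silhouette function.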

Updated: 2020-05-26