Comparison of different machine learning techniques on location extraction by utilizing geo-tagged tweets: A case study
Advanced Engineering Informatics (IF 8.0), Pub Date: 2020-09-10, DOI: 10.1016/j.aei.2020.101151
Nazmiye Eligüzel, Cihan Çetinkaya, Türkay Dereli

In emergencies, Twitter is an important platform for gaining situational awareness in real time. Information about Twitter users’ locations is therefore fundamental to understanding the effects of a disaster. Location extraction is challenging, however, because most Twitter users do not share their locations in their tweets. Various location extraction methods have been proposed, drawing on fields such as statistics and machine learning. This case study uses geo-tagged tweets to demonstrate the importance of location in disaster management by considering three cases. Tweets are collected with the keyword “earthquake” in order to determine the location of Twitter users. They are then evaluated with the Latent Dirichlet Allocation (LDA) topic model and with sentiment analysis performed by machine learning classification algorithms: Multinomial and Gaussian Naïve Bayes, Support Vector Machine (SVM), Decision Tree, Random Forest, Extra Trees, Neural Network, k-Nearest Neighbor (kNN), Stochastic Gradient Descent (SGD), and Adaptive Boosting (AdaBoost). In total, 10 machine learning algorithms are applied to sentiment analysis of location-specific, disaster-related tweets, with the aim of enabling a fast and correct response in a disaster situation. The effectiveness of each algorithm is evaluated in order to identify the most suitable one. Topic extraction via LDA is also provided to help comprehend the situation after a disaster. The results from the three cases indicate that the Multinomial Naïve Bayes and Extra Trees algorithms perform best, with an F-measure above 80%. The study aims to support a quick response to earthquakes by applying these techniques.
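
As a rough illustration of two ideas named in the abstract, the sketch below implements a small Multinomial Naïve Bayes text classifier from scratch and scores it with the F-measure (the harmonic mean of precision and recall). It is not the authors' implementation: the function names, the Laplace smoothing parameter `alpha`, the sentiment labels, and the example tweets are all assumptions invented here for demonstration, and a real study would more likely rely on an established machine learning library.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a word-count Naive Bayes model with add-alpha (Laplace) smoothing."""
    vocab = {word for doc in docs for word in doc.split()}
    word_counts = defaultdict(Counter)  # label -> word -> count
    class_counts = Counter(labels)      # label -> number of documents
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.split())

    model = {}
    for label, n_docs in class_counts.items():
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        log_prior = math.log(n_docs / len(docs))
        log_likelihood = {
            word: math.log((word_counts[label][word] + alpha) / total)
            for word in vocab
        }
        model[label] = (log_prior, log_likelihood)
    return model


def predict(model, doc):
    """Return the label with the highest log-posterior; unseen words are skipped."""
    def score(label):
        log_prior, log_likelihood = model[label]
        return log_prior + sum(
            log_likelihood[word] for word in doc.split() if word in log_likelihood
        )
    return max(model, key=score)


def f_measure(y_true, y_pred, positive):
    """F1 score for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy tweets and sentiment labels, made up for illustration only.
train_docs = [
    "earthquake destroyed homes people scared",
    "terrible earthquake many injured",
    "rescue teams arrived everyone safe",
    "thankful family safe after earthquake",
]
train_labels = ["negative", "negative", "positive", "positive"]
test_docs = ["people injured and scared", "family safe thankful"]
test_labels = ["negative", "positive"]

model = train_multinomial_nb(train_docs, train_labels)
predictions = [predict(model, doc) for doc in test_docs]
print(predictions, f_measure(test_labels, predictions, positive="positive"))
```

Comparing several classifiers, as the study does, would amount to training each one on the same labelled tweets and ranking them by an F-measure computed in this way.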




Updated: 2020-09-10