A Google Trends spatial clustering approach for a worldwide Twitter user geolocation
Information Processing & Management (IF 7.4), Pub Date: 2020-06-12, DOI: 10.1016/j.ipm.2020.102312
Paola Zola, Costantino Ragno, Paulo Cortez

User location data is valuable for diverse social media analytics. In this paper, we address the non-trivial task of estimating the worldwide city-level location of a Twitter user from historical tweets alone. We propose a purely unsupervised approach based on a synthetic geographic sampling of Google Trends (GT) city-level frequencies of tweet nouns, combined with three clustering algorithms. The approach was validated empirically on a recently collected dataset with 3,268 worldwide city-level locations of Twitter users, obtaining competitive results when compared with a state-of-the-art Word Distribution (WD) user location estimation method. The best overall results were achieved by the GT noun DBSCAN (GTN-DB) method, which is computationally fast and correctly predicts the ground truth locations of 15%, 23%, 39% and 58% of the users for tolerance distances of 250 km, 500 km, 1,000 km and 2,000 km, respectively.
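
The tolerance-distance figures above can be read as "accuracy within d km": the share of users whose predicted location falls within d km of the ground truth. The sketch below illustrates that metric only. It is not the authors' evaluation code. It assumes great-circle (haversine) distance and a mean Earth radius of 6371 km, neither of which is stated in the abstract, and the names `haversine_km` and `accuracy_at` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius, not taken from the paper


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def accuracy_at(predicted, truth, tolerances_km=(250, 500, 1000, 2000)):
    """Fraction of users whose prediction lies within each tolerance distance."""
    if not truth or len(predicted) != len(truth):
        raise ValueError("predicted and truth must be non-empty and equal in length")
    errors = [haversine_km(p, t) for p, t in zip(predicted, truth)]
    return {tol: sum(e <= tol for e in errors) / len(errors)
            for tol in tolerances_km}


# Illustrative coordinates only (not data from the paper).
truth = [(38.72, -9.14), (45.46, 9.19)]
predicted = [(38.72, -9.14), (-38.72, 170.86)]
scores = accuracy_at(predicted, truth)
```

Because the tolerances are nested, the resulting accuracies are non-decreasing in d, which matches the increasing sequence of percentages reported for GTN-DB.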




Updated: 2020-06-12