An integrated model for textual social media data with spatio-temporal dimensions
Information Processing & Management (IF 8.6), Pub Date: 2020-02-29, DOI: 10.1016/j.ipm.2020.102219
Juglar Diaz, Barbara Poblete, Felipe Bravo-Marquez

GPS-enabled devices and the popularity of social media have created an unprecedented opportunity for researchers to collect, explore, and analyze text data with fine-grained spatial and temporal metadata. Text, time, and space, however, are different domains, each with its own representation scales and methods. This makes it challenging to detect relevant patterns that may only arise from the combination of text with spatio-temporal elements. In particular, representations of spatio-temporal textual data have relied on feature embedding techniques, which can limit a model's ability to express certain patterns that come from the sequence structure of textual data. To address these problems, we propose an Acceptor recurrent neural network model that jointly models spatio-temporal textual data. Our goal is to represent the mutual influence and relationships that can exist between written language and the time and place where it was produced. We represent space, time, and text as tuples, and use pairs of elements to predict the third one. This results in three predictive tasks that are trained simultaneously. We conduct experiments on two social media datasets and on a crime dataset, using Mean Reciprocal Rank as the evaluation metric. Our experiments show that our model outperforms state-of-the-art methods, with improvements ranging from 5.5% to 24.7% for location and time prediction.
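
The tuple formulation and the evaluation metric described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Record` type, the `make_tasks` helper, and the use of discretized time bins and location cells as string labels are all assumptions made for illustration. Only the definition of Mean Reciprocal Rank (the average of 1/rank of the correct item over all queries) is standard.

```python
from typing import NamedTuple, Sequence


class Record(NamedTuple):
    """One hypothetical spatio-temporal text record (illustrative only)."""
    text: str
    time: str   # assumed: a discretized time bin such as "evening"
    place: str  # assumed: a discretized location cell such as "cell_12"


def make_tasks(record: Record) -> dict:
    """Build the three (inputs, target) pairs: two elements predict the third."""
    return {
        "predict_place": ((record.text, record.time), record.place),
        "predict_time": ((record.text, record.place), record.time),
        "predict_text": ((record.time, record.place), record.text),
    }


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[str]], targets: Sequence[str]
) -> float:
    """Average 1/rank of the correct item; a missing item contributes 0."""
    if len(rankings) != len(targets):
        raise ValueError("rankings and targets must have the same length")
    if not rankings:
        raise ValueError("at least one query is required")
    total = 0.0
    for ranked, target in zip(rankings, targets):
        if target in ranked:
            # Ranks are 1-based, so the top candidate scores 1.0.
            total += 1.0 / (list(ranked).index(target) + 1)
    return total / len(rankings)


# Hypothetical record and candidate rankings for a location-prediction task.
tasks = make_tasks(Record(text="concert tonight", time="evening", place="cell_12"))
rankings = [
    ["cell_12", "cell_03", "cell_07"],  # correct cell ranked 1st
    ["cell_03", "cell_12", "cell_07"],  # correct cell ranked 2nd
    ["cell_07", "cell_03", "cell_12"],  # correct cell ranked 3rd
]
mrr = mean_reciprocal_rank(rankings, ["cell_12"] * 3)
```

In this toy case the correct location is ranked first, second, and third, so the score is (1 + 1/2 + 1/3) / 3. A real evaluation would take the candidate rankings from a trained model rather than from hand-written lists.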




Updated: 2020-04-21