NeuroTPR: A neuro‐net toponym recognition model for extracting locations from social media messages


Transactions in GIS (IF 2.568), Pub Date: 2020-05-14, DOI: 10.1111/tgis.12627
Jimin Wang, Yingjie Hu, Kenneth Joseph


Social media messages, such as tweets, are frequently used by people during natural disasters to share real‐time information and to report incidents. Within these messages, geographic locations are often described. Accurate recognition and geolocation of these locations are critical for reaching those in need. This article focuses on the first part of this process, namely recognizing locations from social media messages. While general named entity recognition tools are often used to recognize locations, their performance is limited due to the various language irregularities associated with social media text, such as informal sentence structures, inconsistent letter cases, name abbreviations, and misspellings. We present NeuroTPR, which is a Neuro‐net ToPonym Recognition model designed specifically with these linguistic irregularities in mind. Our approach extends a general bidirectional recurrent neural network model with a number of features designed to address the task of location recognition in social media messages. We also propose an automatic workflow for generating annotated data sets from Wikipedia articles for training toponym recognition models. We demonstrate NeuroTPR by applying it to three test data sets, including a Twitter data set from Hurricane Harvey, and comparing its performance with those of six baseline models.
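To make the recognition step concrete: toponym recognition of this kind is commonly framed as token-level sequence labeling, where a model assigns a tag to every token and the tagged tokens are then grouped into place names. The sketch below assumes a BIO tagging scheme with a hypothetical `LOC` label and a hypothetical `extract_toponyms` helper. The abstract does not specify NeuroTPR's tag set or output format, so this only illustrates how tagged tokens could be turned into location strings and is not the authors' implementation. The example message is made up.

```python
from typing import List


def extract_toponyms(tokens: List[str], tags: List[str]) -> List[str]:
    """Group BIO-tagged tokens into location strings.

    Assumes tags are "B-LOC", "I-LOC", or "O" (a hypothetical scheme,
    not taken from the paper).
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    toponyms: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B-LOC":
            # A new location starts; flush any location in progress.
            if current:
                toponyms.append(" ".join(current))
            current = [token]
        elif tag == "I-LOC" and current:
            # Continue the location in progress.
            current.append(token)
        else:
            # "O", or a stray "I-LOC" with nothing to continue: close the span.
            if current:
                toponyms.append(" ".join(current))
            current = []
    if current:
        toponyms.append(" ".join(current))
    return toponyms


# Made-up message with inconsistent letter case (not from the paper's data).
tokens = ["flooding", "near", "buffalo", "bayou", "in", "HOUSTON"]
tags = ["O", "O", "B-LOC", "I-LOC", "O", "B-LOC"]
print(extract_toponyms(tokens, tags))  # ['buffalo bayou', 'HOUSTON']
```

In this framing the tags would come from the trained model; the grouping step stays the same whichever model produces them.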


Updated: 2020-05-14
