Adaptive Geoparsing Method for Toponym Recognition and Resolution in Unstructured Text
Remote Sensing ( IF 4.2 ) Pub Date : 2020-09-17 , DOI: 10.3390/rs12183041
Edwin Aldana-Bobadilla , Alejandro Molina-Villegas , Ivan Lopez-Arevalo , Shanel Reyes-Palacios , Victor Muñiz-Sanchez , Jean Arreola-Trapala

The automatic extraction of geospatial information is an important aspect of data mining. Computer systems that discover geographic information in natural language rely on a complex process called geoparsing, which comprises two main tasks: geographic entity recognition and toponym resolution. The first task can be addressed with machine learning, by training a model to recognize sequences of characters (words) that correspond to geographic entities. The second task consists of assigning such entities to their most likely coordinates, which frequently involves resolving referential ambiguities. In this paper, we propose an extensible geoparsing approach that combines geographic entity recognition based on a neural network model with disambiguation based on what we call dynamic context disambiguation. Once place names are recognized in an input text, they are resolved using a grammar whose rules specify how ambiguities can be resolved from the context, much as a person would. The result is an assignment of the most likely geographic properties to each recognized place. We propose an assessment measure based on a ranking of closeness between the predicted and actual locations of a place name. On this measure, our method outperforms OpenStreetMap Nominatim. We also include other measures that assess the recognition of place names and the prediction of what we call geographic levels (the administrative jurisdiction of places).
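The abstract does not spell out how the closeness ranking is computed, so the following is only a minimal sketch of one plausible reading, not the authors' measure. It assumes great-circle (haversine) distance as the notion of closeness and ranks a predicted location against a list of candidate locations for the same place name. The function names `haversine_km` and `closeness_rank`, and the approximate coordinates in the usage lines, are hypothetical and chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def closeness_rank(predicted, actual, candidates):
    """Rank of the predicted location among candidates (1 = closest to the actual location).

    The rank is one plus the number of candidates strictly closer to the
    actual location than the prediction is.
    """
    d_pred = haversine_km(*predicted, *actual)
    closer = sum(1 for c in candidates if haversine_km(*c, *actual) < d_pred)
    return closer + 1


# Hypothetical example: an ambiguous place name with two candidate locations
# (approximate coordinates, for illustration only).
guadalajara_mx = (20.66, -103.35)
guadalajara_es = (40.63, -3.17)
candidates = [guadalajara_mx, guadalajara_es]

rank_good = closeness_rank(guadalajara_mx, guadalajara_mx, candidates)
rank_bad = closeness_rank(guadalajara_es, guadalajara_mx, candidates)
```

Under these assumptions, a resolver that picks the correct candidate gets rank 1, and one that picks the homonym on another continent gets a worse rank. Averaging such ranks over a test set would give one way to compare two resolvers.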

Updated: 2020-09-18