

A pragmatic guide to geoparsing evaluation
Language Resources and Evaluation (IF 1.7) Pub Date: 2019-09-19, DOI: 10.1007/s10579-019-09475-3
Milan Gritta, Mohammad Taher Pilehvar, Nigel Collier


Empirical methods in geoparsing have thus far lacked a standard evaluation framework describing the task, metrics and data used to compare state-of-the-art systems. Evaluation is made further inconsistent, even unrepresentative of real-world usage, by the lack of distinction between the different types of toponyms, which necessitates new guidelines, a consolidation of metrics and a detailed toponym taxonomy with implications for Named Entity Recognition (NER) and beyond. To address these deficiencies, our manuscript introduces a new framework in three parts. (Part 1) Task Definition: clarified via corpus linguistic analysis proposing a fine-grained Pragmatic Taxonomy of Toponyms. (Part 2) Metrics: discussed and reviewed for a rigorous evaluation including recommendations for NER/Geoparsing practitioners. (Part 3) Evaluation data: shared via a new dataset called GeoWebNews to provide test/train examples and enable immediate use of our contributions. In addition to fine-grained Geotagging and Toponym Resolution (Geocoding), this dataset is also suitable for prototyping and evaluating machine learning NLP models.
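The abstract does not list the metrics it recommends, so the following is only an illustrative sketch of two measures commonly seen in geoparsing evaluation: exact-match precision, recall and F1 for Geotagging spans, and accuracy within a distance threshold for Toponym Resolution. The function names, the `(start, end)` span representation and the 161 km default threshold are assumptions made for this example, not details taken from the paper or from GeoWebNews.

```python
import math

def prf1(gold_spans, predicted_spans):
    """Exact-match precision, recall and F1 over (start, end) toponym spans."""
    gold, pred = set(gold_spans), set(predicted_spans)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

def accuracy_at_km(gold_coords, pred_coords, threshold_km=161.0):
    """Share of resolved toponyms whose error distance is within the threshold.

    The 161 km (about 100 miles) default is an assumed, conventional choice.
    """
    errors = [haversine_km(*g, *p) for g, p in zip(gold_coords, pred_coords)]
    if not errors:
        return 0.0
    return sum(e <= threshold_km for e in errors) / len(errors)

# Hypothetical toy data: two gold toponyms, one tagged and resolved correctly.
gold_spans = {(0, 6), (10, 15)}
pred_spans = {(0, 6), (20, 25)}
gold_coords = [(51.5074, -0.1278), (48.8566, 2.3522)]
pred_coords = [(51.5, -0.12), (40.7128, -74.0060)]

print(prf1(gold_spans, pred_spans))
print(accuracy_at_km(gold_coords, pred_coords))
```

Exact span matching is the strictest choice; a more lenient overlap-based match would change only the set intersection in `prf1`.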


Updated: 2019-09-19
