A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching

Ardanuy, Mariona Coll; Hosseini, Kasra; McDonough, Katherine; Krause, Amrey; van Strien, Daniel; Nanni, Federico

arXiv:2009.08114 (cs)

[Submitted on 17 Sep 2020 (v1), last revised 22 Sep 2020 (this version, v2)]

Abstract: Recognizing toponyms and resolving them to their real-world referents is required for providing advanced semantic access to textual data. This process is often hindered by the high degree of variation in toponyms. Candidate selection is the task of identifying the potential entities that a previously recognized toponym can refer to. While it has traditionally received little attention in the research community, candidate selection has been shown to have a significant impact on downstream tasks (i.e. entity resolution), especially in noisy or non-standard text. In this paper, we introduce a flexible deep learning method for candidate selection through toponym matching, using state-of-the-art neural network architectures. We perform an intrinsic toponym matching evaluation based on several new realistic datasets, which cover various challenging scenarios (cross-lingual and regional variations, as well as OCR errors). We also report the method's performance on candidate selection in the context of the downstream task of toponym resolution, both on existing datasets and on a new manually annotated resource of nineteenth-century English OCR'd text.
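
To make the candidate selection task concrete, the following is a minimal sketch that ranks gazetteer entries by surface similarity to a recognized toponym. It is not the deep learning method introduced in the paper: it swaps in a plain character-based similarity from Python's standard library as a stand-in for a learned toponym matcher. The gazetteer entries, the identifiers, the function names, and the threshold value are all hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical toy gazetteer: alternate names mapped to made-up entity IDs.
GAZETTEER = {
    "London": "g1",
    "Londres": "g1",
    "Londonderry": "g2",
    "Manchester": "g3",
    "Mancunium": "g3",
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive character similarity in [0, 1] (stand-in for a learned matcher)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def select_candidates(toponym, gazetteer=GAZETTEER, threshold=0.7, top_k=3):
    """Return up to top_k (name, entity_id, score) tuples scoring at least threshold."""
    scored = [
        (name, entity_id, similarity(toponym, name))
        for name, entity_id in gazetteer.items()
    ]
    kept = [item for item in scored if item[2] >= threshold]
    # Highest-scoring candidates first.
    kept.sort(key=lambda item: item[2], reverse=True)
    return kept[:top_k]


if __name__ == "__main__":
    # "Lonclon" imitates an OCR error in which "d" is read as "cl".
    for name, entity_id, score in select_candidates("Lonclon"):
        print(name, entity_id, round(score, 3))
```

In this sketch the selected candidates would then be passed to a downstream resolution step that picks one referent. Replacing `similarity` with a trained model's score keeps the rest of the pipeline unchanged.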

Comments:	10 pages, 1 figure
Subjects:	Computation and Language (cs.CL); Digital Libraries (cs.DL); Information Retrieval (cs.IR)
Cite as:	arXiv:2009.08114 [cs.CL]
	(or arXiv:2009.08114v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2009.08114

Submission history

From: Mariona Coll Ardanuy
[v1] Thu, 17 Sep 2020 07:24:56 UTC (768 KB)
[v2] Tue, 22 Sep 2020 14:24:12 UTC (761 KB)
