SVM ensembles for named entity disambiguation
Computing (IF 3.3), Pub Date: 2019-08-21, DOI: 10.1007/s00607-019-00748-x
Amal Alokaili, Mohamed El Bachir Menai

The enormous quantity of digital data necessitates automation, which, among other things, can help link unstructured data to structured data. Such a task requires a systematic approach to mapping entity mentions (e.g., person, location) to corresponding entries in a Knowledge Base. This area of research is evolving rapidly, which has led to the popularization of Named Entity Disambiguation (NED). NED, also known as Entity Linking, is the task of resolving the ambiguities that arise when processing unstructured data packed with Named Entities. The goal of this paper is to investigate ensemble learning using Support Vector Machines (SVM) for tackling the NED problem. Multiple ensemble learning algorithms were studied, including bagging, boosting, and voting, using different SVM kernel functions (Linear, RBF, and Polynomial). Our results on three benchmark corpora show that ensemble learning using SVM produces competitive performance compared to well-known entity annotation systems and ensemble models. Specifically, the proposed method was best at the disambiguation of AIDA/CONLL-TestB and AQUAINT, with F-measures of 78.5% and 71.5%, respectively.
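To make the voting variant and the F-measure mentioned in the abstract more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the three prediction lists are hypothetical stand-ins for the outputs of SVMs with Linear, RBF, and Polynomial kernels, the entity names and the `NIL` label are invented for the example, and the tie-breaking rule and scoring convention are assumptions that the abstract does not specify.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-mention predictions from several base classifiers (hard voting).

    predictions[k][i] is the entity chosen by classifier k for mention i.
    Assumption: ties go to the earliest-listed classifier among the top votes.
    """
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # Take the first vote, in classifier order, that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


def f_measure(predicted, gold, nil="NIL"):
    """Return (precision, recall, F1) over mentions.

    Assumption: a NIL prediction means the system returned no link.
    """
    correct = sum(p == g and p != nil for p, g in zip(predicted, gold))
    n_pred = sum(p != nil for p in predicted)
    n_gold = sum(g != nil for g in gold)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical gold links for four mentions (illustrative data only).
gold = ["Paris", "Michael_Jordan", "Washington,_D.C.", "Apple_Inc."]

# Hypothetical outputs of three base classifiers, one per kernel.
linear_svm = ["Paris", "Michael_Jordan", "George_Washington", "Apple"]
rbf_svm = ["Paris", "Jordan", "Washington,_D.C.", "NIL"]
poly_svm = ["Paris_Hilton", "Michael_Jordan", "Washington,_D.C.", "Apple_Inc."]

combined = majority_vote([linear_svm, rbf_svm, poly_svm])
precision, recall, f1 = f_measure(combined, gold)
print(combined)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this toy data the ensemble corrects the single errors each base classifier makes on the first three mentions, while the three-way disagreement on the last mention is settled by the assumed tie-breaking rule. Bagging and boosting differ in how the base SVMs are trained (resampled or reweighted training data) rather than in this final combination step.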

Updated: 2019-08-21
