Eras: Improving the quality control in the annotation process for Natural Language Processing tasks
Information Systems (IF 2.066), Pub Date: 2020-05-21, DOI: 10.1016/j.is.2020.101553
Jonatas S. Grosman; Pedro H.T. Furtado; Ariane M.B. Rodrigues; Guilherme G. Schardong; Simone D.J. Barbosa; Hélio C.V. Lopes

The growing volume of valuable, unstructured textual information makes it a major challenge to extract value from those texts. Doing so requires NLP (Natural Language Processing) techniques, most of which rely on a large, manually annotated corpus of text for their development and evaluation. Creating a large annotated corpus is laborious and requires suitable computational support. Many annotation tools are available, but their main weaknesses are the absence of data management features for quality control and the need for a commercial license. Because the quality of the data used to train an NLP model directly affects the quality of its results, quality control of the annotations is essential. In this paper, we introduce ERAS, a novel web-based text annotation tool developed to facilitate and manage the process of text annotation. ERAS includes not only the key features of current mainstream annotation systems but also other features needed to improve the curation process, such as inter-annotator agreement, self-agreement, and annotation log visualization, for annotation quality control. ERAS also implements a series of features for customizing the user’s annotation workflow, such as random document selection, re-annotation stages, and warm-up annotations. We conducted two empirical studies to evaluate the tool’s support for text annotation, and the results suggest that the tool not only meets the basic needs of the annotation task but also has some important advantages over the other tools evaluated in the studies. ERAS is freely available at https://github.com/grosmanjs/eras.
Updated: 2020-05-21

 
