Journalistic transparency using CRFs to identify the reporter of newspaper articles in Spanish
Applied Soft Computing (IF 8.7), Pub Date: 2020-06-23, DOI: 10.1016/j.asoc.2020.106496
Francisco Jurado

Journalistic transparency has become a key issue in countering the lack of credibility that journalists face, as well as media manipulators and fake news providers. Natural Language Processing (NLP) and Machine Learning (ML) make it possible to automate the extraction of information from newspaper articles in order to identify the sources of that information and verify their veracity. In this article, we present the application of Conditional Random Fields (CRFs) to a specific type of Entity Recognition (ER) task: identifying what we call the “reporter” in newspaper articles, i.e., who or what provides the information. To this end, we created a labelled corpus for the Spanish language and trained and analysed several CRF models with a set of specific features. The results provide a solid baseline for our goal.
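
As an illustration of how a "reporter" recognition task of this kind is typically framed for a CRF, the sketch below turns a tokenised Spanish sentence into per-token feature dictionaries and reads entity spans back from BIO labels. Everything in it is an assumption made for illustration: the feature names, the `REPORTER` label, the helper functions, and the hand-labelled example sentence are hypothetical and are not the feature set, tag scheme, or corpus described in the abstract.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Hypothetical features for token i (not the paper's feature set)."""
    word = tokens[i]
    feats: Dict[str, object] = {
        "bias": 1.0,
        "lower": word.lower(),
        "is_title": word.istitle(),   # capitalised words often start names
        "is_upper": word.isupper(),   # acronyms such as agency names
        "is_digit": word.isdigit(),
        "suffix3": word[-3:].lower(),
    }
    # Context features: the neighbouring tokens, or sentence-boundary flags.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True
    return feats


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """One feature dict per token, the usual input shape for a linear-chain CRF."""
    return [token_features(tokens, i) for i in range(len(tokens))]


def spans_from_bio(tokens: List[str], labels: List[str],
                   entity: str = "REPORTER") -> List[str]:
    """Collect the text of each B-/I- span for the given (assumed) entity type."""
    spans: List[str] = []
    current: List[str] = []
    for tok, lab in zip(tokens, labels):
        if lab == f"B-{entity}":
            if current:
                spans.append(" ".join(current))
            current = [tok]
        elif lab == f"I-{entity}" and current:
            current.append(tok)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


# Hand-made example sentence and labels, for illustration only.
tokens = ["Según", "el", "Ministerio", "de", "Sanidad", ",",
          "los", "casos", "bajan", "."]
labels = ["O", "O", "B-REPORTER", "I-REPORTER", "I-REPORTER", "O",
          "O", "O", "O", "O"]

features = sentence_features(tokens)
reporters = spans_from_bio(tokens, labels)
```

Lists of feature dictionaries paired with label sequences like these are the kind of input a CRF toolkit would be trained on; the choice of toolkit and of features is left open here because the abstract does not specify them.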




Updated: 2020-06-23