A hybrid lexicon-based and neural approach for explainable polarity detection, Information Processing & Management


Information Processing & Management (IF 7.4), Pub Date: 2022-08-19, DOI: 10.1016/j.ipm.2022.103058
Marco Polignano, Valerio Basile, Pierpaolo Basile, Giuliano Gabrieli, Marco Vassallo, Cristina Bosco

In this work, we propose BERT-WMAL, a hybrid model that combines information learned from data by a recent transformer deep learning model with information obtained from a polarized lexicon. The result is a sentence polarity model whose performance is comparable to the state of the art, with the added advantage of providing the end user with an explanation of the terms that contributed most to each prediction. The model has been evaluated on three Italian polarity detection datasets: SENTIPOLC, AGRITREND and ABSITA. The first contains 7,410 tweets released for training and 2,000 for testing; the second contains 1,000 tweets with no predefined split; the third contains 2,365 reviews for training and 1,171 for testing. Lexicon-based information proves effective in terms of the F1 measure, improving the score on all three datasets: from 0.664 to 0.669 (i.e., 0.772%) on AGRITREND, from 0.728 to 0.734 (i.e., 0.854%) on SENTIPOLC and from 0.904 to 0.921 (i.e., 1.873%) on ABSITA. The usefulness of the model depends not only on its effectiveness in terms of F1, but also on its ability to generate predictions that are more explainable and, above all, convincing for end users. We evaluated this aspect through a user study involving four native Italian speakers, each of whom evaluated 64 sentences with associated explanations. The results demonstrate the validity of the approach, which combines attention weights extracted from the deep learning model with the linguistic knowledge stored in the WMAL lexicon. These considerations allow us to regard the approach presented in this paper as a promising starting point for further work in this research area.
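As a rough check on the scale of the reported gains, the relative F1 improvement can be recomputed from the rounded scores quoted in the abstract. The sketch below assumes the percentages are relative changes, (after - before) / before; the names `reported_f1` and `relative_improvement` are illustrative and do not come from the paper.

```python
# F1 scores as quoted in the abstract: (baseline, with lexicon information).
reported_f1 = {
    "AGRITREND": (0.664, 0.669),
    "SENTIPOLC": (0.728, 0.734),
    "ABSITA": (0.904, 0.921),
}


def relative_improvement(before: float, after: float) -> float:
    """Return the relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Assumption: the abstract's percentages are relative improvements.
improvements = {
    name: relative_improvement(before, after)
    for name, (before, after) in reported_f1.items()
}

for name, pct in improvements.items():
    print(f"{name}: {pct:.3f}%")
```

With the rounded inputs this gives roughly 0.75%, 0.82% and 1.88%, close to but not identical with the 0.772%, 0.854% and 1.873% stated in the abstract. The small gaps would be consistent with the published percentages having been computed from unrounded scores, although the abstract does not say so.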


Updated: 2022-08-20
