Idiom-based features in sentiment analysis: Cutting the Gordian knot, IEEE Transactions on Affective Computing

Idiom-based features in sentiment analysis: Cutting the Gordian knot
IEEE Transactions on Affective Computing (IF 11.2), Pub Date: 2020-04-01, DOI: 10.1109/taffc.2017.2777842
Irena Spasic, Lowri Williams, Andreas Buerki

In this paper we describe an automated approach to enriching sentiment analysis with idiom-based features. Specifically, we automated the development of the supporting lexico-semantic resources, which include (1) a set of rules used to identify idioms in text and (2) their sentiment polarity classifications. Our method demonstrates how idiom dictionaries, which are readily available general pedagogical resources, can be automatically adapted into purpose-specific computational resources. These resources were then used to replace their manually engineered counterparts in an existing system, which originally outperformed the baseline sentiment analysis approaches by 17 percentage points on average, taking the F-measure from the 40s into the 60s. The new, fully automated approach outperformed the baselines by 8 percentage points on average, taking the F-measure from the 40s into the 50s. Although the latter improvement is smaller than the one achieved with the manually engineered features, the automated approach has the advantage of being more general, in the sense that it can readily use an arbitrary list of idioms without the knowledge acquisition overhead previously associated with this task, thereby fully automating the original approach.
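To make the two resources named in the abstract concrete, the following is a minimal sketch of how idiom-matching rules and polarity labels could be turned into features. It is not the authors' implementation. The lexicon entries, their polarity labels, the treatment of "one's" as a possessive slot, and the names `IDIOM_POLARITY`, `idiom_to_rule`, and `idiom_features` are all assumptions made for illustration.

```python
import re

# Hypothetical mini-lexicon (idiom -> polarity). The entries and labels are
# illustrative assumptions, not the resources described in the paper.
IDIOM_POLARITY = {
    "over the moon": "positive",
    "a piece of cake": "positive",
    "down in the dumps": "negative",
    "lose one's temper": "negative",
}

# Assumed convention: the dictionary placeholder "one's" stands for any
# possessive determiner in running text.
POSSESSIVES = r"(?:my|your|his|her|its|our|their|one's)"


def idiom_to_rule(idiom):
    """Compile a dictionary headword into a case-insensitive regex rule."""
    parts = [POSSESSIVES if word == "one's" else re.escape(word)
             for word in idiom.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


# One rule per idiom, generated automatically from the lexicon.
RULES = {idiom: idiom_to_rule(idiom) for idiom in IDIOM_POLARITY}


def idiom_features(text):
    """Count matched idioms per polarity class as simple features."""
    counts = {"positive": 0, "negative": 0}
    for idiom, rule in RULES.items():
        counts[IDIOM_POLARITY[idiom]] += len(rule.findall(text))
    return counts


if __name__ == "__main__":
    print(idiom_features("She was over the moon, so don't lose your temper."))
```

The resulting counts could be appended to the feature vector of any baseline sentiment classifier. This sketch only handles the possessive slot; inflected forms such as "lost his temper" would need additional rules that are not shown here.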

Updated: 2020-04-01
