More than a Feeling: Accuracy and Application of Sentiment Analysis
International Journal of Research in Marketing (IF 5.9), Pub Date: 2022-06-20, DOI: 10.1016/j.ijresmar.2022.05.005
Jochen Hartmann, Mark Heitmann, Christian Siebert, Christina Schamp

Sentiment is fundamental to human communication. Countless marketing applications mine opinions from social media communication, news articles, customer feedback, or corporate communication. Various sentiment analysis methods are available and new ones have recently been proposed. Lexicons can relate individual words and expressions to sentiment scores. In contrast, machine learning methods are more complex to interpret, but promise higher accuracy, i.e., fewer false classifications. We propose an empirical framework and quantify these trade-offs for different types of research questions, data characteristics, and analytical resources to enable informed method decisions contingent on the application context. Based on a meta-analysis of 272 datasets and 12 million sentiment-labeled text documents, we find that the recently proposed transfer learning models indeed perform best, but can perform worse than popular leaderboard benchmarks suggest. We quantify the accuracy-interpretability trade-off, showing that, compared to widely established lexicons, transfer learning models on average classify more than 20 percentage points more documents correctly. To form realistic performance expectations, additional context variables, most importantly the desired number of sentiment classes and the text length, should be taken into account. We provide a pre-trained sentiment analysis model (called SiEBERT) with open-source scripts that can be applied as easily as an off-the-shelf lexicon.
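
To make the lexicon idea and the accuracy comparison concrete, here is a minimal sketch in Python. It is not the authors' code and it does not use SiEBERT: the word list, its scores, the example documents, and all function names are hypothetical and chosen only for illustration. It shows how a lexicon maps words to sentiment scores, how a two-class decision follows from the summed score, and how an accuracy gap between two methods is expressed in percentage points.

```python
# Hypothetical toy lexicon: words and scores are made up for illustration,
# not taken from any published sentiment lexicon.
TOY_LEXICON = {
    "good": 1.0, "great": 2.0, "love": 2.0,
    "bad": -1.0, "terrible": -2.0, "hate": -2.0,
}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the scores of all lexicon words found in the text."""
    tokens = text.lower().split()
    # Strip common trailing/leading punctuation before the lookup.
    return sum(lexicon.get(tok.strip(".,!?"), 0.0) for tok in tokens)


def lexicon_classify(text, lexicon=TOY_LEXICON):
    """Two-class decision: positive if the summed score is above zero."""
    return "positive" if lexicon_score(text, lexicon) > 0 else "negative"


def accuracy(predictions, labels):
    """Share of documents whose predicted class matches the label."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def percentage_point_gap(acc_a, acc_b):
    """Difference between two accuracies (given as fractions), in percentage points."""
    return 100.0 * (acc_a - acc_b)


# Hypothetical labeled documents.
docs = [
    "I love this product, great value!",
    "Terrible service, I hate it.",
    "Not good at all.",  # negation: a plain word-level lexicon misses this
    "Good.",
]
labels = ["positive", "negative", "negative", "positive"]

predictions = [lexicon_classify(d) for d in docs]
lexicon_accuracy = accuracy(predictions, labels)
```

In this toy data the third document is misclassified because the lexicon scores "good" without seeing the negation, which illustrates the kind of error that context-aware machine learning models are meant to reduce. The per-word scores, however, remain easy to inspect, which is the interpretability side of the trade-off.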




Updated: 2022-06-20