Identifying Structural Holes for Sentiment Classification, Information Systems Frontiers


Identifying Structural Holes for Sentiment Classification
Information Systems Frontiers (IF 6.9), Pub Date: 2021-09-01, DOI: 10.1007/s10796-021-10185-x
Zheng Xie ₁, Guannan Liu ₂, Jinming Qu ₂, Junjie Wu ₂,₃, Hong Li ₄


The prevalence of online user-generated content has attracted great interest in textual sentiment analysis, which provides a low-cost yet effective way to discern consumers and markets. A mainstream of sentiment analysis is to construct a classification model with Bag-of-Words (BoW) features, but the large vocabulary base and skewed distribution of term frequency consistently pose research challenges, which is made even worse by the limited valid sentiment labels. In light of this, in this paper, we propose a novel method called Structural Holes based Sentiment Classifier (SHSC) for BoW-based sentiment classification. The key to SHSC is to reinforce the classification contribution of semantically rich words with clear-cut sentiment polarity. To this end, a word co-occurrence network is carefully constructed to represent both high and low frequency words. The work to find classification-inefficient words is then transformed into the identification of so-called bridge nodes that occupy the positions of structural holes in the network. Two interesting measures, i.e., information advantage rank and control advantage weight, are then designed elaborately for this purpose, which are based on the proposed sentiment-label propagation and short-path computation algorithms, respectively. SHSC finally feeds this information as the key regularizers into a simple regression model to guide parametric learning. Extensive experiments on real-world text datasets demonstrate the advantage of our SHSC model over competitive benchmarks, particularly when sentiment labels are scarce. The effectiveness of uncovering structural holes for sentiment classification is also carefully verified with some robustness checks and demonstration cases.
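To make the idea of bridge nodes in a word co-occurrence network concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the SHSC method: the abstract does not define the information advantage rank or the control advantage weight, so the score below (the number of neighbor pairs that are not linked to each other) is a simple stand-in for "occupying a structural hole". The function names and the toy reviews are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(documents):
    """Link two words whenever they appear in the same document."""
    graph = defaultdict(set)
    for doc in documents:
        words = sorted(set(doc.lower().split()))
        for word in words:
            graph[word]  # ensure every word becomes a node, even if isolated
        for a, b in combinations(words, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def open_neighbor_pairs(graph, node):
    """Count pairs of neighbors of `node` that are NOT linked to each other.

    Assumption: this is a stand-in bridge score, not a measure from the paper.
    A high count means the node connects otherwise separate groups of words.
    """
    neighbors = sorted(graph[node])
    return sum(1 for a, b in combinations(neighbors, 2) if b not in graph[a])


def rank_bridge_candidates(graph):
    """Return (word, score) pairs, highest bridge score first, ties by name."""
    scores = {word: open_neighbor_pairs(graph, word) for word in graph}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy corpus: one positive and one negative review.
toy_reviews = [
    "great screen phone",
    "terrible battery phone",
]

graph = build_cooccurrence_graph(toy_reviews)
ranking = rank_bridge_candidates(graph)
print(ranking[0])  # ('phone', 4)
```

In this toy corpus, "phone" co-occurs with both the positive and the negative vocabulary, so it is the only word whose neighbors are not all linked to one another. That matches the intuition in the abstract that bridge words tend to carry little sentiment signal, while words such as "great" and "terrible" stay inside a single cluster.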


Updated: 2021-09-01
