Identifying Structural Holes for Sentiment Classification

Xie, Zheng; Liu, Guannan; Qu, Jinming; Wu, Junjie; Li, Hong

doi:10.1007/s10796-021-10185-x

Published: 01 September 2021

Information Systems Frontiers, Volume 24, pages 1735–1751 (2022)

Zheng Xie¹,
Guannan Liu ORCID: orcid.org/0000-0002-4532-7109²,
Jinming Qu²,
Junjie Wu²,³ &
Hong Li⁴

Abstract

The prevalence of online user-generated content has attracted great interest in textual sentiment analysis, which provides a low-cost yet effective way to understand consumers and markets. A mainstream approach to sentiment analysis is to construct a classification model with Bag-of-Words (BoW) features, but the large vocabulary and the skewed distribution of term frequencies consistently pose research challenges, which are made even worse by the scarcity of valid sentiment labels. In light of this, we propose a novel method called Structural Holes based Sentiment Classifier (SHSC) for BoW-based sentiment classification. The key to SHSC is to reinforce the classification contribution of semantically rich words with clear-cut sentiment polarity. To this end, a word co-occurrence network is constructed to represent both high- and low-frequency words. The task of finding classification-inefficient words is then recast as the identification of so-called bridge nodes that occupy the positions of structural holes in the network. Two measures, information advantage rank and control advantage weight, are designed for this purpose; they are based on the proposed sentiment-label propagation and short-path computation algorithms, respectively. Finally, SHSC feeds this information as the key regularizers into a simple regression model to guide parametric learning. Extensive experiments on real-world text datasets demonstrate the advantage of the SHSC model over competitive benchmarks, particularly when sentiment labels are scarce. The effectiveness of uncovering structural holes for sentiment classification is also verified with robustness checks and demonstration cases.
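
The intuition that a bridge word sitting between positive and negative regions of the network carries little clear-cut polarity can be illustrated with a generic label-propagation sketch. This is not the paper's sentiment-label propagation algorithm, nor its information advantage rank; the function name `propagate_polarity`, the toy graph, the seed words, and the averaging rule are all assumptions made only for illustration.

```python
def propagate_polarity(neighbors, seeds, iterations=20):
    """Spread seed polarities over a weighted word graph (illustrative only).

    neighbors: dict mapping each word to {neighbor_word: edge_weight}.
    seeds: dict mapping seed words to a fixed polarity in [-1, 1].
    Non-seed words repeatedly take the weighted mean of their neighbors.
    """
    scores = {word: seeds.get(word, 0.0) for word in neighbors}
    for _ in range(iterations):
        updated = {}
        for word, adjacent in neighbors.items():
            if word in seeds:
                # Seed words stay clamped to their known polarity.
                updated[word] = seeds[word]
                continue
            total_weight = sum(adjacent.values())
            if total_weight == 0:
                updated[word] = 0.0
            else:
                weighted_sum = sum(w * scores[n] for n, w in adjacent.items())
                updated[word] = weighted_sum / total_weight
        scores = updated
    return scores


# Hypothetical toy network: "room" links the positive and negative sides.
graph = {
    "good": {"service": 1.0, "room": 1.0},
    "bad": {"smell": 1.0, "room": 1.0},
    "service": {"good": 1.0},
    "smell": {"bad": 1.0},
    "room": {"good": 1.0, "bad": 1.0},
}
seeds = {"good": 1.0, "bad": -1.0}
scores = propagate_polarity(graph, seeds)
```

In this toy graph, "service" and "smell" inherit the polarity of their single seed neighbor, while "room", which connects both sides, ends near zero. A word of this kind is the sort of bridge node the abstract describes as classification-inefficient.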

Notes

In the experiment, we set the window size to 2. That is, if two words are adjacent to each other, their co-occurrence is increased by one.
COAE2013, https://download.csdn.net/download/u013906860/9776509.
Hotel, https://download.csdn.net/download/lssc4205/9903298.
NLPCC2014, http://tcci.ccf.org.cn/conference/2014/pages/page04_sam.html.
The ratio ranges from 10% to 80%. In the subsequent subsections, the results are at a ratio of 40% unless otherwise stated.
Gensim, https://radimrehurek.com/gensim/.
HowNet, http://www.keenage.com/html/c_index.html.
SWOL, http://ir.dlut.edu.cn/news/detail/215.
NTUSD, http://nlg18.csie.ntu.edu.tw:8080/opinion/pub1.html.
Jieba, https://github.com/fxsjy/jieba.
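
The first note above, which sets the window size to 2 so that only adjacent words co-occur, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes documents are already tokenized (for Chinese text, for instance by a segmenter such as Jieba), treats edges as undirected, and skips pairs of identical tokens. The function name `cooccurrence_counts` and the sample documents are hypothetical.

```python
from collections import Counter


def cooccurrence_counts(documents, window=2):
    """Count undirected word co-occurrences within a sliding window.

    With window=2, each token is paired only with the token that
    immediately follows it, so only adjacent words co-occur.
    """
    counts = Counter()
    for tokens in documents:
        for i, left in enumerate(tokens):
            # Pair the token with the next (window - 1) tokens.
            for right in tokens[i + 1 : i + window]:
                if left != right:  # assumption: no self-loops
                    counts[frozenset((left, right))] += 1
    return counts


# Hypothetical pre-tokenized documents.
docs = [["room", "very", "clean"], ["very", "clean", "room"]]
edges = cooccurrence_counts(docs)
```

Each key in `edges` is an unordered word pair and each value is the edge weight of the resulting co-occurrence network. Here the pair ("very", "clean") is adjacent in both documents and so receives a weight of 2.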

Author information

Authors and Affiliations

National Internet Emergency Center (CNCERT/CC), Beijing, 100029, China
Zheng Xie
School of Economics and Management, Beihang University, Beijing, 100191, China
Guannan Liu, Jinming Qu & Junjie Wu
Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, Beijing, 100191, China
Junjie Wu
Beijing Key Laboratory of Emergency Support Simulation Technologies for City Operations, Beihang University, Beijing, 100191, China
Hong Li

Corresponding author

Correspondence to Guannan Liu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Dr. Guannan Liu was supported by the National Natural Science Foundation of China (NSFC) (71701007, 92046025). Dr. Junjie Wu was supported by the National Natural Science Foundation of China (NSFC) (71725002, 72031001, 72021001). Dr. Zheng Xie was supported by the Science and Technology Innovation 2030 Major Project of China (2020AAA0108405, 2020AAA0108400).

About this article

Cite this article

Xie, Z., Liu, G., Qu, J. et al. Identifying Structural Holes for Sentiment Classification. Inf Syst Front 24, 1735–1751 (2022). https://doi.org/10.1007/s10796-021-10185-x

Accepted: 01 August 2021
Published: 01 September 2021
Issue Date: October 2022
DOI: https://doi.org/10.1007/s10796-021-10185-x
