Attentive convolutional gated recurrent network: a contextual model to sentiment analysis

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Accounting for contextual features is a key issue in sentiment analysis. Existing approaches, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs), lack the ability to account for and prioritize the informative contextual features that are necessary for better sentiment interpretation. CNNs have limited capability because they must be very deep, which can lead to vanishing gradients, whereas RNNs fail because they process input sequences sequentially. Furthermore, both approaches treat all words equally. In this paper, we propose a novel approach named attentive convolutional gated recurrent network (ACGRN) that alleviates the above issues for sentiment analysis. The motivation behind ACGRN is to avoid the vanishing gradient caused by deep CNNs by applying a shallow-and-wide CNN that learns local contextual features. Afterwards, to solve the problem caused by the sequential structure of RNNs and to prioritize informative contextual information, we use a novel prior knowledge attention-based bidirectional gated recurrent unit (ATBiGRU). Prior knowledge ATBiGRU captures global contextual features with a strong focus on the previous hidden states that carry more valuable information to the current time step. The experimental results show that ACGRN significantly outperforms the baseline models on six small and large real-world datasets for the sentiment classification task.
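
The idea of focusing on informative previous hidden states can be illustrated with a small sketch. This is not the ATBiGRU formulation from the paper, which the abstract does not specify. It only assumes dot-product scoring and softmax weighting, and the function name `attend_over_previous` and the toy hidden states are hypothetical. In a full model the hidden states would come from a bidirectional GRU run over the CNN features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend_over_previous(hidden_states):
    """For each time step t, return an attention-weighted sum of the
    hidden states that precede t.

    Assumption: scores are plain dot products between the current state
    and each earlier state; the paper's scoring function may differ.
    """
    contexts = []
    for t, h_t in enumerate(hidden_states):
        previous = hidden_states[:t]
        if not previous:
            # The first step has no history, so its context is all zeros.
            contexts.append([0.0] * len(h_t))
            continue
        weights = softmax([dot(h_t, h_p) for h_p in previous])
        context = [
            sum(w * h_p[i] for w, h_p in zip(weights, previous))
            for i in range(len(h_t))
        ]
        contexts.append(context)
    return contexts


# Toy hidden states standing in for BiGRU outputs (hypothetical values).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for step, ctx in enumerate(attend_over_previous(states)):
    print(step, ctx)
```

Each context vector could then be combined with the current hidden state before classification, so that earlier states with higher scores contribute more to the current time step.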

Notes

  1. http://ai.stanford.edu/~amaas/data/sentiment/.

  2. https://nlp.stanford.edu/sentiment/.

  3. http://jmcauley.ucsd.edu/data/amazon/.

  4. http://nlp.stanford.edu/projects/glove/.

Acknowledgements

This work is supported by the National Key Research and Development Program of China under Grants 2016YFB0800402 and 2016QY01W0202, National Natural Science Foundation of China under Grants U1836204, U1936108, 61433006, U1401258, and 61502185.

Author information

Corresponding authors

Correspondence to Yuhua Li or Ruixuan Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Habimana, O., Li, Y., Li, R. et al. Attentive convolutional gated recurrent network: a contextual model to sentiment analysis. Int. J. Mach. Learn. & Cyber. 11, 2637–2651 (2020). https://doi.org/10.1007/s13042-020-01135-1

  • DOI: https://doi.org/10.1007/s13042-020-01135-1