
Enabling 5G: sentimental image dominant graph topic model for cross-modality topic detection

Abstract

Fifth-generation mobile networks (5G) are entering everyday life: over the next decade they will not only carry a 1000-fold increase in Internet traffic but will also offer a "smarter" user experience. With the commercial deployment of 5G, online social networks and smartphones will flourish further, and cross-modality data will play an ever more important role in daily information dissemination. As an effective means of content analysis, topic detection has attracted much research interest, but conventional topic analysis struggles with cross-modality heterogeneous data. This paper proposes a sentimental image dominant graph topic model that detects topics in heterogeneous data and mines the sentiment of each topic. Specifically, we design a topic model that transfers both the low-level visual modality and the high-level text modality onto a semantic manifold, and we improve the discriminative power of CNN features by jointly optimizing the outputs of the convolutional and fully-connected layers. Furthermore, since sentimental impact is significant for understanding the intrinsic meaning of a topic, we introduce a semantic score for subjective sentences that computes sentiment from the contextual sentence structure. Comparison experiments on a public cross-modality benchmark show the promising performance of our model, so our AI-based method can facilitate the intellectualization of 5G.
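The full text is not included in this preview, so the joint convolutional/fully-connected feature design is only described at the level of the abstract. As a rough illustration of that idea, here is a minimal sketch, assuming a standard pretrained VGG-16 from torchvision: it pools a convolutional feature map into a vector and concatenates it with an fc7 activation to form a single joint descriptor. The class name, layer choices, and normalization are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch (not the authors' code): fuse a pooled convolutional feature
# map with a fully-connected activation from a pretrained VGG-16, in the
# spirit of the abstract's joint conv/fc optimization. Layer choices,
# names, and normalization are illustrative assumptions.
import torch
import torch.nn.functional as F
from torchvision import models

class JointCNNFeature(torch.nn.Module):
    def __init__(self):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
        self.conv = vgg.features            # conv stack -> 512 x 7 x 7 maps
        self.avgpool = vgg.avgpool          # AdaptiveAvgPool2d to 7 x 7
        self.fc = vgg.classifier[:5]        # up to the fc7 ReLU (4096-d)

    def forward(self, x):
        fmap = self.conv(x)
        conv_vec = F.adaptive_avg_pool2d(fmap, 1).flatten(1)  # 512-d pooled conv feature
        fc_vec = self.fc(self.avgpool(fmap).flatten(1))       # 4096-d fc feature
        # L2-normalize each part so neither scale dominates the joint descriptor.
        return torch.cat([F.normalize(conv_vec, dim=1),
                          F.normalize(fc_vec, dim=1)], dim=1)  # 4608-d joint feature

extractor = JointCNNFeature().eval()
with torch.no_grad():
    v = extractor(torch.randn(1, 3, 224, 224))  # v.shape == (1, 4608)
```

Concatenating mid-level convolutional evidence with high-level fully-connected semantics is a common way to make CNN descriptors more discriminative, which is presumably the motivation behind the joint optimization the abstract mentions.

Likewise, the semantic score of subjective sentences is only sketched in the abstract. Below is a hedged toy example of scoring a sentence from its contextual structure: word polarities from a small lexicon, with the sign flipped inside the scope of a preceding negation. The lexicon, the scope window, and the scoring rule are all assumptions for illustration, not the paper's formula.

```python
# Hedged sketch (illustrative, not the paper's formula): score a subjective
# sentence by summing word polarities from a toy lexicon, flipping the sign
# of words that fall within the scope of a preceding negation.
POLARITY = {"good": 1.0, "great": 1.5, "bad": -1.0, "terrible": -1.5}  # toy lexicon
NEGATORS = {"not", "no", "never"}

def sentence_sentiment(tokens, scope=3):
    score, negate_left = 0.0, 0
    for tok in tokens:
        tok = tok.lower()
        if tok in NEGATORS:
            negate_left = scope          # negation affects the next few words
            continue
        polarity = POLARITY.get(tok, 0.0)
        score += -polarity if negate_left > 0 else polarity
        negate_left = max(0, negate_left - 1)
    return score

print(sentence_sentiment("this is not a good phone".split()))  # -> -1.0
```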

Author information

Correspondence to Liang Li or Wenchao Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Liang Li is the co-corresponding author.

About this article

Cite this article

Sun, J., Li, L., Li, W. et al. Enabling 5G: sentimental image dominant graph topic model for cross-modality topic detection. Wireless Netw 26, 1549–1561 (2020). https://doi.org/10.1007/s11276-019-02009-3
