Multi-level word features based on CNN for fake news detection in cultural communication

  • Original Article
  • Personal and Ubiquitous Computing

A Correction to this article was published on 11 March 2022

This article has been updated

Abstract

In recent years, the rapid growth of online social networks has allowed fake news to appear in large numbers and spread widely online. Deceptive wording makes it easy for social network users to be misled by fake news, which has already had tremendous effects on offline society. An important goal in improving the trustworthiness of information in online social networks is to identify fake news in a timely manner. However, fake news detection remains a challenge, primarily because the content is crafted to resemble the truth in order to deceive readers; without fact-checking or additional information, it is often hard to determine veracity by text analysis alone. In this paper, we first propose a multi-level convolutional neural network (MCNN), which introduces local convolutional features as well as global semantic features to capture semantic information from article texts that can be used to classify news as fake or not. We then employ a method for calculating the weight of sensitive words (TFW), that is, words that are strongly associated with the fake or true labels. Finally, we develop MCNN-TFW, a multi-level convolutional neural network-based fake news detection system in which MCNN extracts the article representation and TFW calculates the weight of sensitive words for each news item. Extensive experiments on fake news detection in cultural communication compare MCNN-TFW with several state-of-the-art models, and the results demonstrate the effectiveness of the proposed model.
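The idea of weighting sensitive words by how strongly they are tied to the fake or true label can be illustrated with a small sketch. The abstract does not give the TFW formula, so the code below is an assumption: it uses a smoothed log-ratio of per-label document frequencies as a stand-in. The names `sensitive_word_weights` and `article_score`, the whitespace tokenizer, the label strings, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def sensitive_word_weights(docs, labels, smoothing=1.0):
    """Weight each word by a smoothed log-ratio of fake vs. true document frequency.

    Positive weights lean towards "fake", negative towards "true".
    ASSUMPTION: this is an illustrative stand-in, not the TFW formula of the paper.
    """
    fake_df, true_df = Counter(), Counter()
    n_fake = n_true = 0
    for text, label in zip(docs, labels):
        words = set(text.lower().split())  # count each word once per document
        if label == "fake":
            fake_df.update(words)
            n_fake += 1
        else:
            true_df.update(words)
            n_true += 1

    weights = {}
    for word in set(fake_df) | set(true_df):
        # Additive smoothing keeps the ratio finite for words seen under one label only.
        p_fake = (fake_df[word] + smoothing) / (n_fake + 2 * smoothing)
        p_true = (true_df[word] + smoothing) / (n_true + 2 * smoothing)
        weights[word] = math.log(p_fake / p_true)
    return weights


def article_score(text, weights):
    """Average the word weights of one article; unknown words contribute 0."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(weights.get(w, 0.0) for w in words) / len(words)


# Toy, made-up corpus purely for illustration.
docs = [
    "shocking miracle cure revealed",
    "council approves budget",
    "shocking secret exposed",
    "council publishes budget report",
]
labels = ["fake", "true", "fake", "true"]
weights = sensitive_word_weights(docs, labels)
score = article_score("shocking cure for council budget", weights)
```

In a combined system of the kind the abstract describes, a per-article score like this would be one input alongside the representation produced by the convolutional network; how the two are actually fused in MCNN-TFW is not stated here, so that step is left out of the sketch.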

Funding

This work is supported by the National Key R&D Program of China under Grant Nos. 2018YFB0203901, 2016YFB1000604, and 2018YFB1402700, and by the Key Research and Development Program of Shaanxi Province (No. 2018ZDXM-GY-036).

Author information

Corresponding author

Correspondence to Qingyuan Hu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised due to change in the order of authors, addition of references and deletion of Figures 7 and 11.

About this article

Cite this article

Hu, Q., Li, Q., Lu, Y. et al. Multi-level word features based on CNN for fake news detection in cultural communication. Pers Ubiquit Comput 24, 259–272 (2020). https://doi.org/10.1007/s00779-019-01289-y

  • DOI: https://doi.org/10.1007/s00779-019-01289-y
