Abstract
In recent years, with the rapid growth of online social networks, fake news has appeared in large numbers and spread widely online. Users of online social networks are easily misled by its deceptive wording, and this has already had tremendous effects on offline society. An important goal in improving the trustworthiness of information in online social networks is to identify fake news in a timely manner. However, fake news detection remains a challenge, primarily because the content is crafted to resemble the truth in order to deceive readers; without fact-checking or additional information, it is often hard to determine veracity by text analysis alone. In this paper, we first propose a multi-level convolutional neural network (MCNN), which combines local convolutional features with global semantic features to capture semantic information from article texts that can be used to classify news as fake or not. We then employ a method for calculating the weight of sensitive words (TFW), that is, words that show a stronger association with their fake or true labels. Finally, we develop MCNN-TFW, a multi-level convolutional neural network-based fake news detection system in which MCNN extracts the article representation and TFW calculates the weight of sensitive words for each news article. Extensive experiments on fake news detection in cultural communication compare MCNN-TFW with several state-of-the-art models, and the results demonstrate the effectiveness of the proposed model.
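The abstract does not give the formula behind the sensitive-word weighting, so the following is only an illustrative sketch of the general idea of scoring how strongly a word is associated with the fake or true label. It uses a smoothed log-odds ratio, which is a stand-in technique and not the paper's TFW method. The function name `label_word_weights`, the `smoothing` parameter, the whitespace tokenizer, and the toy documents are all assumptions made for this example.

```python
import math
from collections import Counter


def label_word_weights(docs, labels, smoothing=1.0):
    """Return {word: weight} using a smoothed log-odds ratio.

    Illustrative stand-in only, not the paper's TFW formula.
    A positive weight means the word leans toward label 1 (fake),
    a negative weight means it leans toward label 0 (true).
    """
    fake_counts, true_counts = Counter(), Counter()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        (fake_counts if label == 1 else true_counts).update(tokens)

    vocab = set(fake_counts) | set(true_counts)
    # Additive smoothing keeps words seen under only one label finite.
    fake_total = sum(fake_counts.values()) + smoothing * len(vocab)
    true_total = sum(true_counts.values()) + smoothing * len(vocab)

    weights = {}
    for word in vocab:
        p_fake = (fake_counts[word] + smoothing) / fake_total
        p_true = (true_counts[word] + smoothing) / true_total
        weights[word] = math.log(p_fake / p_true)
    return weights


# Toy data, invented for illustration: 1 = fake, 0 = true.
docs = [
    "shocking miracle cure revealed",
    "council approves budget",
    "miracle diet shocking truth",
    "budget vote delayed",
]
labels = [1, 0, 1, 0]
weights = label_word_weights(docs, labels)
```

In a system like the one described, per-word weights of this kind could be combined with a learned article representation before classification; how MCNN-TFW actually combines the two is not specified in the abstract.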
Change history
11 March 2022
A Correction to this paper has been published: https://doi.org/10.1007/s00779-022-01670-4
Funding
This work is supported by the National Key R&D Program of China under Grant Nos. 2018YFB0203901, 2016YFB1000604, and 2018YFB1402700, and by the Key Research and Development Program of Shaanxi Province (No. 2018ZDXM-GY-036).
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised due to change in the order of authors, addition of references and deletion of Figures 7 and 11.
About this article
Cite this article
Hu, Q., Li, Q., Lu, Y. et al. Multi-level word features based on CNN for fake news detection in cultural communication. Pers Ubiquit Comput 24, 259–272 (2020). https://doi.org/10.1007/s00779-019-01289-y
DOI: https://doi.org/10.1007/s00779-019-01289-y