
ACTSSD: social spammer detection based on active learning and co-training

The Journal of Supercomputing

Abstract

Social spammers spread rumors, advertisements and malicious links in social networks, which disrupts users’ normal access to those networks and causes security problems. Most methods detect social spammers using various features, such as content features, behavior features and relationship graph features, and they rely on large-scale labeled data. However, labeled training data are scarce in the real world, and manual annotation is time-consuming and labor-intensive. To solve this problem, we propose a novel method that combines active learning with co-training to make full use of unlabeled data. In co-training, user features are divided into two non-overlapping views, and classifiers are trained iteratively on the labeled instances together with the most confident unlabeled instances, which are given pseudo-labels. In active learning, the most representative and uncertain instances are selected and annotated with real labels to extend the labeled dataset. Experimental results on the Twitter and Apontador datasets show that our method can effectively detect social spammers when labeled data are limited.
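The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the nearest-centroid learner is a stand-in for whatever base classifiers the paper uses, both views share one labeled pool, and the query step ranks instances by uncertainty only (the representativeness criterion mentioned in the abstract is omitted). All names (`CentroidClassifier`, `co_train_with_queries`, `oracle`, `k_confident`, `k_query`) are hypothetical, and the data are synthetic.

```python
import math
import random


class CentroidClassifier:
    """Nearest-centroid stand-in for any base learner that outputs a spam probability."""

    def fit(self, X, y):
        # Assumes both classes (0 = legitimate, 1 = spammer) appear in y.
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def prob_spam(self, x):
        d0 = math.dist(x, self.centroids[0])
        d1 = math.dist(x, self.centroids[1])
        return 1.0 / (1.0 + math.exp(d1 - d0))  # nearer the spam centroid -> closer to 1


def project(x, view):
    """Keep only the feature columns that belong to one view."""
    return [x[j] for j in view]


def train_views(X, labeled, views):
    """Train one classifier per view on the current labeled pool."""
    idx = list(labeled)
    y = [labeled[i] for i in idx]
    return [CentroidClassifier().fit([project(X[i], v) for i in idx], y) for v in views]


def co_train_with_queries(X, seed_labels, oracle, views, rounds=5, k_confident=2, k_query=1):
    labeled = dict(seed_labels)  # real labels plus pseudo-labels added below
    unlabeled = [i for i in range(len(X)) if i not in labeled]
    queried = []
    for _ in range(rounds):
        if not unlabeled:
            break
        clfs = train_views(X, labeled, views)
        # Co-training step: each view pseudo-labels its most confident instances.
        for clf, view in zip(clfs, views):
            def confidence(i):
                return abs(clf.prob_spam(project(X[i], view)) - 0.5)
            for i in sorted(unlabeled, key=confidence, reverse=True)[:k_confident]:
                labeled[i] = int(clf.prob_spam(project(X[i], view)) >= 0.5)
                unlabeled.remove(i)
        # Active-learning step: ask the oracle about the most uncertain instances.
        def margin(i):
            p = sum(c.prob_spam(project(X[i], v)) for c, v in zip(clfs, views)) / len(views)
            return abs(p - 0.5)
        for i in sorted(unlabeled, key=margin)[:k_query]:
            labeled[i] = oracle(i)  # real label from the (simulated) annotator
            queried.append(i)
            unlabeled.remove(i)
    return train_views(X, labeled, views), queried


def predict(clfs, views, x):
    """Average the per-view spam probabilities and threshold at 0.5."""
    p = sum(c.prob_spam(project(x, v)) for c, v in zip(clfs, views)) / len(views)
    return int(p >= 0.5)


# Synthetic demo: 40 users, 4 features, two non-overlapping views of 2 features each.
rng = random.Random(0)
truth = [i % 2 for i in range(40)]
X = [[rng.gauss(3.0 * t, 0.5) for _ in range(4)] for t in truth]
views = [(0, 1), (2, 3)]
seed_labels = {i: truth[i] for i in range(4)}  # only four real labels to start
clfs, queried = co_train_with_queries(X, seed_labels, truth.__getitem__, views)
accuracy = sum(predict(clfs, views, x) == t for x, t in zip(X, truth)) / len(X)
print(f"queried {len(queried)} labels, accuracy {accuracy:.2f}")
```

In this sketch the `oracle` callable simulates the human annotator by looking up the ground truth. In practice it would be a manual labeling step, and `k_query` would control the annotation budget per round.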



Author information

Corresponding author

Correspondence to Pin Yang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, A., Yang, P. & Cheng, P. ACTSSD: social spammer detection based on active learning and co-training. J Supercomput 78, 2744–2771 (2022). https://doi.org/10.1007/s11227-021-03966-3

  • DOI: https://doi.org/10.1007/s11227-021-03966-3
