Unsupervised and supervised methods for the detection of hurriedly created profiles in recommender systems

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Recommender systems aim to provide users with accurate, personalized item suggestions based on an analysis of their previous decisions and the decisions of other users. These systems are vulnerable to profile injection attacks, in which malicious profiles introduce abnormal ratings in order to promote or demote a particular item. The automatic detection of such malicious profiles has recently been addressed by many authors using both supervised and unsupervised approaches. In this paper, we propose a framework for identifying anomalous rating profiles in which each attacker (outlier) hurriedly creates profiles that inject into the system an unspecified combination of random ratings and specific ratings, without any prior knowledge of the existing ratings. This attack is a superset of the two attacks (Uniform and Delta) proposed in Panagiotakis et al. (International conference on intelligent systems, 2018), which makes the attack model more realistic and its detection more challenging. The proposed detection method relies on several attributes related to the unpredictable behavior of the outliers in a validation set, the user-item rating matrix, the similarity between users, and the filler items. We also propose a new attribute (RIS) that captures the randomness in item selection of the abnormal profiles. Three different systems are proposed: (1) a probabilistic framework that estimates the probability that a user is an outlier by combining several features in a completely unsupervised way; (2) an unsupervised clustering system based on the k-means algorithm that automatically spots the spurious profiles; and (3) a supervised framework that uses a random forest classifier for cases where labeled sample data are available. Experimental results on the MovieLens and Small Netflix datasets demonstrate the high performance of the proposed methods as well as the discrimination accuracy of the proposed features.
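
To make the attack model described in the abstract concrete, the following minimal Python sketch generates one "hurried" profile: the attacker picks items uniformly at random (no knowledge of existing ratings) and fills them with a mix of random ratings and one specific rating. The function name `hurried_profile`, the parameter `random_fraction`, the profile size, and the 1 to 5 rating scale are illustrative assumptions, not taken from the paper.

```python
import random


def hurried_profile(num_items, profile_size, specific_rating,
                    random_fraction, rng, scale=(1, 5)):
    """Return {item_id: rating} for one synthetic hurried profile.

    All parameter names are hypothetical; the paper leaves the mix of
    random and specific ratings unspecified.
    """
    if not 0.0 <= random_fraction <= 1.0:
        raise ValueError("random_fraction must lie in [0, 1]")
    if profile_size > num_items:
        raise ValueError("profile_size cannot exceed num_items")

    # Items are chosen uniformly at random, without using item popularity
    # or any other knowledge of the existing rating matrix.
    items = rng.sample(range(num_items), profile_size)

    # The first n_random chosen items get a random rating on the scale,
    # the remaining items all get the same specific rating.
    n_random = round(random_fraction * profile_size)
    profile = {}
    for idx, item in enumerate(items):
        if idx < n_random:
            profile[item] = rng.randint(scale[0], scale[1])
        else:
            profile[item] = specific_rating
    return profile


rng = random.Random(0)  # fixed seed so the sketch is reproducible
profile = hurried_profile(num_items=1000, profile_size=50,
                          specific_rating=5, random_fraction=0.4, rng=rng)
```

Setting `random_fraction` to 1.0 yields a profile of purely random ratings, and 0.0 yields a profile in which every rating equals `specific_rating`; intermediate values give the mixed case that the detection features have to handle.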

Notes

  1. In the opposite case, e.g. for the RIS attribute, \(1-CDF\) can be used instead.

  2. \(\overline{w}(f) = \frac{w(f)}{\sum _{f'} w(f')}\)

  3. The code implementing the proposed method together with the datasets and the experimental results is publicly available at https://sites.google.com/site/costaspanagiotakis/research/hurryattackrs.

  4. The rest of the features proposed in [13] were not computed, since they use connections between users that are not available in our datasets.

  5. https://grouplens.org/datasets/movielens/100k/

  6. https://grouplens.org/datasets/movielens/1m/
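
Notes 1 and 2 can be illustrated with a small Python sketch: each feature value is mapped to an empirical CDF value (or to 1 - CDF for attributes flagged as the opposite case, such as RIS in note 1), and the feature weights are normalized as in note 2. Combining the per-feature values by a weighted sum is an assumption made here for illustration; the names `empirical_cdf`, `normalize_weights`, `outlier_score`, and `use_complement` are hypothetical and do not come from the paper.

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return a function x -> fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(x):
        return bisect_right(ordered, x) / n

    return cdf


def normalize_weights(weights):
    """Note 2: w_bar(f) = w(f) / sum over f' of w(f')."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {f: w / total for f, w in weights.items()}


def outlier_score(user_features, population, weights, use_complement):
    """Weighted sum of per-feature CDF values (an assumed combination rule).

    Features listed in use_complement contribute 1 - CDF, as in note 1.
    """
    score = 0.0
    for f, w in normalize_weights(weights).items():
        p = empirical_cdf(population[f])(user_features[f])
        if f in use_complement:
            p = 1.0 - p
        score += w * p
    return score


# Toy data: one generic feature "a" and the RIS attribute.
population = {"a": [1, 2, 3, 4], "RIS": [0.1, 0.2, 0.3, 0.4]}
user = {"a": 3, "RIS": 0.1}
weights = {"a": 2.0, "RIS": 6.0}
score = outlier_score(user, population, weights, use_complement={"RIS"})
```

Because the normalized weights sum to one and every per-feature value lies in [0, 1], the resulting score also lies in [0, 1], which makes scores comparable across users.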

References

  1. Adomavicius G, Kwon Y (2012) Improving aggregate recommendation diversity using ranking-based techniques. IEEE Trans Knowl Data Eng 24(5):896–911

  2. Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17(6):734–749

  3. Belgiu M, Drăguţ L (2016) Random forest in remote sensing: a review of applications and future directions. ISPRS J Photogramm Remote Sens 114:24–31

  4. Bennett J, Lanning S (2007) The Netflix prize. In: KDD Cup and Workshop in conjunction with KDD

  5. Breiman L (2001) Random forests. Mach Learn 45(1):5–32

  6. Burke R, Mobasher B, Williams C (2006) Classification features for attack detection in collaborative recommender systems. In: International conference on knowledge discovery and data mining, p 17–20

  7. Burke R, O’Mahony MP, Hurley NJ (2015) Robust collaborative recommendation. In: Recommender systems handbook, Springer, pp 961–995

  8. Cai H, Zhang F (2018) An unsupervised method for detecting shilling attacks in recommender systems by mining item relationship and identifying target items. Comput J 62(4):579–597

  9. Cao J, Wu Z, Mao B, Zhang Y (2013) Shilling attack detection utilizing semi-supervised learning method for collaborative recommender system. World Wide Web 16(5–6):729–748

  10. Chen K, Chan PP, Zhang F, Li Q (2018) Shilling attack based on item popularity and rated item correlation against collaborative filtering. Int J Mach Learn Cybern 10:1–13

  11. Chirita PA, Nejdl W, Zamfir C (2005) Preventing shilling attacks in online recommender systems. In: Proceedings of the 7th annual ACM international workshop on Web information and data management, ACM, pp 67–74

  12. Costa H, Macedo L (2013) Emotion-based recommender system for overcoming the problem of information overload. In: International conference on practical applications of agents and multi-agent systems, Springer, pp 178–189

  13. Davoudi A, Chatterjee M (2017) Detection of profile injection attacks in social recommender systems using outlier analysis. In: 2017 IEEE International conference on big data (Big Data), IEEE, pp 2714–2719

  14. Gorrell G (2006) Generalized Hebbian algorithm for incremental singular value decomposition in natural language processing. In: EACL 2006, 11th Conference of the European chapter of the association for computational linguistics, proceedings of the conference

  15. GraphLab: The smallnetflix recommender systems dataset (2012). http://www.select.cs.cmu.edu/code/graphlab/datasets/

  16. Grinias I, Panagiotakis C, Tziritas G (2016) MRF-based segmentation and unsupervised classification for building and road detection in peri-urban areas of high-resolution satellite images. ISPRS J Photogramm Remote Sens 122:145–166

  17. Gunawardana A, Shani G (2009) A survey of accuracy evaluation metrics of recommendation tasks. J Mach Learn Res 10(Dec):2935–2962

  18. Harper FM, Konstan JA (2016) The movielens datasets: history and context. ACM Trans Interact Intell Syst 5(4):19

  19. He X, Du X, Wang X, Tian F, Tang J, Chua TS (2018) Outer product-based neural collaborative filtering. arXiv:1808.03912

  20. Herlocker JL, Konstan JA, Borchers A, Riedl J (1999) An algorithmic framework for performing collaborative filtering. In: Proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval, ACM, pp 230–237

  21. Jamali M, Ester M (2010) A matrix factorization technique with trust propagation for recommendation in social networks. In: Proceedings of the fourth ACM conference on Recommender systems, ACM, pp 135–142

  22. Linden G, Smith B, York J (2003) Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Comput 7(1):76–80

  23. Ma H, King I, Lyu MR (2012) Mining web graphs for recommendations. IEEE Trans Knowl Data Eng 24(6):1051–1064. https://doi.org/10.1109/TKDE.2011.18

  24. Mobasher B, Burke RD, Sandvig JJ (2006) Model-based collaborative filtering as a defense against profile injection attacks. In: Proceedings, The twenty-first national conference on artificial intelligence and the eighteenth innovative applications of artificial intelligence conference

  25. O’Sullivan D, Wilson D, Smyth B (2002) Improving case-based recommendation. In: European conference on case-based reasoning, Springer, pp 278–291

  26. Panagiotakis C (2015) Point clustering via voting maximization. J Classif 32(2):212–240

  27. Panagiotakis C, Papadakis H, Fragopoulou P (2018) Detection of hurriedly created abnormal profiles in recommender systems. In: International conference on intelligent systems

  28. Panagiotakis C, Papadakis H, Grinias E, Komodakis N, Fragopoulou P, Tziritas G (2013) Interactive image segmentation based on synthetic graph coordinates. Pattern Recognit 46(11):2940–2952

  29. Papadakis H, Panagiotakis C, Fragopoulou P (2014) Distributed detection of communities in complex networks using synthetic coordinates. J Stat Mech Theory Exp 2014(3):P03013

  30. Papadakis H, Panagiotakis C, Fragopoulou P (2017) SCoR: a synthetic coordinate based system for recommendations. Expert Syst Appl 79:8–19

  31. Park DH, Kim HK, Choi IY, Kim JK (2012) A literature review and classification of recommender systems research. Expert Syst Appl 39(11):10059–10072

  32. Pitsilis GK, Ramampiaro H, Langseth H (2019) Securing tag-based recommender systems against profile injection attacks: a comparative study. arXiv:1901.08422

  33. Ricci F, Rokach L, Shapira B (2015) Recommender systems: introduction and challenges. In: Recommender systems handbook, Springer, pp 1–34

  34. Salakhutdinov R, Mnih A, Hinton G (2007) Restricted boltzmann machines for collaborative filtering. In: Proceedings of the 24th international conference on Machine learning, ACM, pp 791–798

  35. Si M, Li Q (2018) Shilling attacks against collaborative recommender systems: a review. Artif Intell Rev 53:1–29

  36. Turk AM, Bilge A (2019) Robustness analysis of multi-criteria collaborative filtering algorithms against shilling attacks. Expert Syst Appl 115:386–402

  37. Williams CA, Mobasher B, Burke R (2007) Defending recommender systems: detection of profile injection attacks. Serv Oriented Comput Appl 1(3):157–170

  38. Yang F, Gao M, Yu J, Song Y, Wang X (2018) Detection of shilling attack based on bayesian model and user embedding. In: 2018 IEEE 30th International conference on tools with artificial intelligence (ICTAI), IEEE, pp 639–646

  39. Yang Z, Cai Z, Guan X (2016) Estimating user behavior toward detecting anomalous ratings in rating systems. Knowl Based Syst 111:144–158

  40. Yang Z, Sun Q, Zhang Y, Zhang B (2018) Uncovering anomalous rating behaviors for rating systems. Neurocomputing 308:205–226. https://doi.org/10.1016/j.neucom.2018.05.001

  41. Yang Z, Xu L, Cai Z, Xu Z (2016) Re-scale adaboost for attack detection in collaborative filtering recommender systems. Knowl Based Syst 100:74–88

  42. Zhang F, Zhang Z, Zhang P, Wang S (2018) UD-HMM: an unsupervised method for shilling attack detection based on hidden Markov model and hierarchical clustering. Knowl Based Syst 148:146–166

  43. Zhang F, Zhou Q (2014) HHT-SVM: an online method for detecting profile injection attacks in collaborative recommender systems. Knowl Based Syst 65:96–105

  44. Zhou W, Wen J, Xiong Q, Gao M, Zeng J (2016) SVM-TIA: a shilling attack detection method based on SVM and target item analysis in recommender systems. Neurocomputing 210:197–205

Acknowledgements

This research has been co-financed by the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH - CREATE - INNOVATE (Project Code: T1EDK-02147).

Author information

Corresponding author

Correspondence to Costas Panagiotakis.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Panagiotakis, C., Papadakis, H. & Fragopoulou, P. Unsupervised and supervised methods for the detection of hurriedly created profiles in recommender systems. Int. J. Mach. Learn. & Cyber. 11, 2165–2179 (2020). https://doi.org/10.1007/s13042-020-01108-4

  • DOI: https://doi.org/10.1007/s13042-020-01108-4
