Unsupervised and supervised methods for the detection of hurriedly created profiles in recommender systems

Panagiotakis, Costas; Papadakis, Harris; Fragopoulou, Paraskevi

doi:10.1007/s13042-020-01108-4

Original Article
Published: 30 April 2020

International Journal of Machine Learning and Cybernetics
Volume 11, pages 2165–2179 (2020)

Abstract

Recommender systems try to provide users with accurate personalized suggestions for items based on an analysis of previous user decisions and the decisions made by other users. These systems suffer from profile injection attacks, where malicious profiles are generated in order to promote or demote a particular item by introducing abnormal ratings. The problem of automatically detecting such malicious profiles has recently been addressed by many authors using supervised and unsupervised approaches. In this paper, we propose a framework to identify anomalous rating profiles, where each attacker (outlier) hurriedly creates profiles that inject into the system an unspecified combination of random ratings and specific ratings, without any prior knowledge of the existing ratings. This attack is a superset of the two different attacks (Uniform and Delta) proposed in Harper et al. (ACM Trans Interact Intell Syst 5(4):19, 2016), making the attack model more realistic and its detection more challenging. The proposed detection method is based on several attributes related to the unpredictable behavior of the outliers in a validation set, on the user-item rating matrix, on the similarity between users and on the filler items. We also propose a new attribute (RIS) to capture the randomness in item selection of the abnormal profiles. Three different systems are proposed: (1) a probabilistic framework that estimates the probability of a user being an outlier by combining several features in a completely unsupervised way; (2) an unsupervised clustering system based on the k-means algorithm that automatically spots the spurious profiles; and (3) a supervised framework that uses a random forest classifier for cases where labeled sample data are available. Experimental results on the MovieLens and the Small Netflix datasets demonstrate the high performance of the proposed methods as well as the discrimination accuracy of the proposed features.
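
As an illustration of the attack model described in the abstract, the following minimal Python sketch generates one hurriedly created profile as a mixture of random ratings and a single specific rating. The function name, the `p_random` mixing parameter, the 1 to 5 rating scale and the uniform choice of items are all assumptions made for this example; the abstract leaves the exact combination unspecified, and this is not the authors' generator.

```python
import random


def make_hurried_profile(n_items, n_rated, p_random, fixed_rating=5,
                         scale=(1, 5), seed=None):
    """Return {item_id: rating} for one synthetic hurried profile.

    Hypothetical helper: items are picked uniformly at random (no knowledge
    of existing ratings); each rating is random with probability p_random,
    otherwise it is the fixed (specific) rating.
    """
    if not 0 <= n_rated <= n_items:
        raise ValueError("n_rated must be between 0 and n_items")
    rng = random.Random(seed)
    items = rng.sample(range(n_items), n_rated)  # random item selection
    low, high = scale
    profile = {}
    for item in items:
        if rng.random() < p_random:
            profile[item] = rng.randint(low, high)  # random rating
        else:
            profile[item] = fixed_rating            # specific rating
    return profile


# p_random=1.0 gives purely random ratings, p_random=0.0 a constant profile.
all_random = make_hurried_profile(n_items=1000, n_rated=50, p_random=1.0, seed=0)
all_fixed = make_hurried_profile(n_items=1000, n_rated=50, p_random=0.0, seed=0)
mixed = make_hurried_profile(n_items=1000, n_rated=50, p_random=0.5, seed=0)
```

In this sketch, purely random and purely constant profiles are the two extremes of the mixture, and any intermediate `p_random` yields a combined profile of the kind the detection methods are meant to spot.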

Notes

In the opposite case, e.g. for the RIS attribute, \(1-CDF\) can be used.
\(\overline{w}(f) = \frac{w(f)}{\sum _{f'} w(f')}\)
The code implementing the proposed method together with the datasets and the experimental results is publicly available at https://sites.google.com/site/costaspanagiotakis/research/hurryattackrs.
The rest of the features proposed in [13] were not computed, since they use connections between users that are not available in our datasets.
https://grouplens.org/datasets/movielens/100k/
https://grouplens.org/datasets/movielens/1m/
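
The first two notes above (the \(1-CDF\) variant and the normalized weight \(\overline{w}(f)\)) can be read together as ingredients of a per-user outlier score. The sketch below is only one plausible reading: it assumes an empirical CDF per feature and a weighted average as the combination rule, neither of which is confirmed by the text available here. The function names, the feature name `feature_a`, the weights and the toy values are hypothetical.

```python
from bisect import bisect_right


def empirical_cdf(values):
    """Return a function x -> fraction of `values` that are <= x."""
    ordered = sorted(values)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def normalize_weights(weights):
    """w_bar(f) = w(f) / sum_{f'} w(f'), as in the note above."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {f: w / total for f, w in weights.items()}


def outlier_scores(features, weights, flipped=()):
    """Weighted average of per-feature CDF scores for every user.

    features: {feature_name: [value per user]}
    flipped:  features for which 1 - CDF is used instead of CDF.
    The weighted average is an assumption of this sketch.
    """
    w_bar = normalize_weights(weights)
    n_users = len(next(iter(features.values())))
    scores = [0.0] * n_users
    for name, values in features.items():
        cdf = empirical_cdf(values)
        for u, x in enumerate(values):
            s = cdf(x)
            if name in flipped:
                s = 1.0 - s
            scores[u] += w_bar[name] * s
    return scores


# Toy data: four users, two features with made-up values and weights.
features = {"feature_a": [0.1, 0.2, 0.3, 0.9], "RIS": [0.8, 0.7, 0.9, 0.1]}
weights = {"feature_a": 3.0, "RIS": 1.0}
scores = outlier_scores(features, weights, flipped={"RIS"})
```

With these toy values the last user receives the highest score, because it is extreme on both features once the RIS score is flipped.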

References

Adomavicius G, Kwon Y (2012) Improving aggregate recommendation diversity using ranking-based techniques. IEEE Trans Knowl Data Eng 24(5):896–911
Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17(6):734–749
Belgiu M, Drăguţ L (2016) Random forest in remote sensing: a review of applications and future directions. ISPRS J Photogramm Remote Sens 114:24–31
Bennett J, Lanning S (2007) The Netflix prize. In: KDD Cup and Workshop in conjunction with KDD
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Burke R, Mobasher B, Williams C (2006) Classification features for attack detection in collaborative recommender systems. In: International conference on knowledge discovery and data mining, p 17–20
Burke R, O’Mahony MP, Hurley NJ (2015) Robust collaborative recommendation. In: Recommender systems handbook, Springer, pp 961–995
Cai H, Zhang F (2018) An unsupervised method for detecting shilling attacks in recommender systems by mining item relationship and identifying target items. Comput J 62(4):579–597
Cao J, Wu Z, Mao B, Zhang Y (2013) Shilling attack detection utilizing semi-supervised learning method for collaborative recommender system. World Wide Web 16(5–6):729–748
Chen K, Chan PP, Zhang F, Li Q (2018) Shilling attack based on item popularity and rated item correlation against collaborative filtering. Int J Mach Learn Cybern 10:1–13
Chirita PA, Nejdl W, Zamfir C (2005) Preventing shilling attacks in online recommender systems. In: Proceedings of the 7th annual ACM international workshop on Web information and data management, ACM, pp 67–74
Costa H, Macedo L (2013) Emotion-based recommender system for overcoming the problem of information overload. In: International conference on practical applications of agents and multi-agent systems, Springer, pp 178–189
Davoudi A, Chatterjee M (2017) Detection of profile injection attacks in social recommender systems using outlier analysis. In: 2017 IEEE International conference on big data (Big Data), IEEE, pp 2714–2719
Gorrell G (2006) Generalized hebbian algorithm for incremental singular value decomposition in natural language processing. In: EACL 2006, 11th Conference of the European chapter of the association for computational linguistics, proceedings of the conference
GraphLab: The smallnetflix recommender systems dataset (2012). http://www.select.cs.cmu.edu/code/graphlab/datasets/
Grinias I, Panagiotakis C, Tziritas G (2016) MRF-based segmentation and unsupervised classification for building and road detection in peri-urban areas of high-resolution satellite images. ISPRS J Photogramm Remote Sens 122:145–166
Gunawardana A, Shani G (2009) A survey of accuracy evaluation metrics of recommendation tasks. J Mach Learn Res 10(Dec):2935–2962
Harper FM, Konstan JA (2016) The movielens datasets: history and context. ACM Trans Interact Intell Syst 5(4):19
He X, Du X, Wang X, Tian F, Tang J, Chua TS (2018) Outer product-based neural collaborative filtering. arXiv:1808.03912
Herlocker JL, Konstan JA, Borchers A, Riedl J (1999) An algorithmic framework for performing collaborative filtering. In: Proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval, ACM, pp 230–237
Jamali M, Ester M (2010) A matrix factorization technique with trust propagation for recommendation in social networks. In: Proceedings of the fourth ACM conference on Recommender systems, ACM, pp 135–142
Linden G, Smith B, York J (2003) Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Comput 7(1):76–80
Ma H, King I, Lyu MR (2012) Mining web graphs for recommendations. IEEE Trans Knowl Data Eng 24(6):1051–1064. https://doi.org/10.1109/TKDE.2011.18
Mobasher B, Burke RD, Sandvig JJ (2006) Model-based collaborative filtering as a defense against profile injection attacks. In: Proceedings, The twenty-first national conference on artificial intelligence and the eighteenth innovative applications of artificial intelligence conference
O’Sullivan D, Wilson D, Smyth B (2002) Improving case-based recommendation. In: European conference on case-based reasoning, Springer, pp 278–291
Panagiotakis C (2015) Point clustering via voting maximization. J Classif 32(2):212–240
Panagiotakis C, Papadakis H, Fragopoulou P (2018) Detection of hurriedly created abnormal profiles in recommender systems. In: International conference on intelligent systems
Panagiotakis C, Papadakis H, Grinias E, Komodakis N, Fragopoulou P, Tziritas G (2013) Interactive image segmentation based on synthetic graph coordinates. Pattern Recognit 46(11):2940–2952
Papadakis H, Panagiotakis C, Fragopoulou P (2014) Distributed detection of communities in complex networks using synthetic coordinates. J Stat Mech Theory Exp 2014(3):P03013
Papadakis H, Panagiotakis C, Fragopoulou P (2017) SCoR: a synthetic coordinate based system for recommendations. Expert Syst Appl 79:8–19
Park DH, Kim HK, Choi IY, Kim JK (2012) A literature review and classification of recommender systems research. Expert Syst Appl 39(11):10059–10072
Pitsilis GK, Ramampiaro H, Langseth H (2019) Securing tag-based recommender systems against profile injection attacks: a comparative study. arXiv:1901.08422
Ricci F, Rokach L, Shapira B (2015) Recommender systems: introduction and challenges. In: Recommender systems handbook, Springer, pp 1–34
Salakhutdinov R, Mnih A, Hinton G (2007) Restricted boltzmann machines for collaborative filtering. In: Proceedings of the 24th international conference on Machine learning, ACM, pp 791–798
Si M, Li Q (2018) Shilling attacks against collaborative recommender systems: a review. Artif Intell Rev 53:1–29
Turk AM, Bilge A (2019) Robustness analysis of multi-criteria collaborative filtering algorithms against shilling attacks. Expert Syst Appl 115:386–402
Williams CA, Mobasher B, Burke R (2007) Defending recommender systems: detection of profile injection attacks. Serv Oriented Comput Appl 1(3):157–170
Yang F, Gao M, Yu J, Song Y, Wang X (2018) Detection of shilling attack based on bayesian model and user embedding. In: 2018 IEEE 30th International conference on tools with artificial intelligence (ICTAI), IEEE, pp 639–646
Yang Z, Cai Z, Guan X (2016) Estimating user behavior toward detecting anomalous ratings in rating systems. Knowl Based Syst 111:144–158
Yang Z, Sun Q, Zhang Y, Zhang B (2018) Uncovering anomalous rating behaviors for rating systems. Neurocomputing 308:205–226. https://doi.org/10.1016/j.neucom.2018.05.001
Yang Z, Xu L, Cai Z, Xu Z (2016) Re-scale adaboost for attack detection in collaborative filtering recommender systems. Knowl Based Syst 100:74–88
Zhang F, Zhang Z, Zhang P, Wang S (2018) UD-HMM: an unsupervised method for shilling attack detection based on hidden Markov model and hierarchical clustering. Knowl Based Syst 148:146–166
Zhang F, Zhou Q (2014) HHT-SVM: an online method for detecting profile injection attacks in collaborative recommender systems. Knowl Based Syst 65:96–105
Zhou W, Wen J, Xiong Q, Gao M, Zeng J (2016) SVM-TIA: a shilling attack detection method based on svm and target item analysis in recommender systems. Neurocomputing 210:197–205

Acknowledgements

This research has been co-financed by the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH - CREATE - INNOVATE (Project Code: T1EDK-02147).

Author information

Authors and Affiliations

Department of Management Science and Technology, Hellenic Mediterranean University, 72100, Agios Nikolaos, Crete, Greece
Costas Panagiotakis
Department of Electrical and Computer Engineering, Hellenic Mediterranean University, 71004, Heraklion, Crete, Greece
Harris Papadakis & Paraskevi Fragopoulou

Corresponding author

Correspondence to Costas Panagiotakis.

About this article

Cite this article

Panagiotakis, C., Papadakis, H. & Fragopoulou, P. Unsupervised and supervised methods for the detection of hurriedly created profiles in recommender systems. Int. J. Mach. Learn. & Cyber. 11, 2165–2179 (2020). https://doi.org/10.1007/s13042-020-01108-4

Received: 28 January 2019
Accepted: 24 February 2020
Published: 30 April 2020
Issue Date: September 2020
DOI: https://doi.org/10.1007/s13042-020-01108-4
