Improving neighbor-based collaborative filtering by using a hybrid similarity measurement
Expert Systems with Applications ( IF 7.5 ) Pub Date : 2020-07-06 , DOI: 10.1016/j.eswa.2020.113651
Dawei Wang , Yuehwern Yih , Mario Ventresca

Memory-based collaborative filtering is a recommender-system method that predicts a user’s rating or preference from historical ratings alone, without using any content information about users or items. It can be either item-based or user-based. Taking item-based collaborative filtering (CF) as an example, a prediction is made in two steps. First, using pairwise similarities, the method selects the items most similar to the target item from among those the user has already rated. Second, it aggregates the user’s ratings of those most similar items to predict a rating for the target item. The similarity measure therefore determines which items count as similar, and it plays an important role in how accurate the predictions are. Many studies have tried to improve the prediction accuracy of memory-based CF, but none has achieved better accuracy than state-of-the-art model-based CF. In this paper, we propose a new approach that combines structural and rating-based similarity measures. We find that, on the MovieLens and Netflix datasets, memory-based CF with the combined similarity measure achieves better prediction accuracy (lower MAE) than model-based CF, and reduces memory use and running time by using fewer neighbors than traditional memory-based CF.
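
The two-step item-based procedure described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy ratings, the function names, and in particular the `hybrid` function (Jaccard overlap multiplied by cosine similarity) are hypothetical, since the abstract does not say how the structural and rating-based measures are combined.

```python
from math import sqrt

# Toy ratings, user -> {item: rating}. Made-up data for illustration only.
RATINGS = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5, "C": 2, "D": 4},
    "u3": {"A": 1, "B": 2, "C": 5, "D": 2},
    "u4": {"A": 5, "C": 1, "D": 5},
}


def item_vectors(ratings):
    """Invert user -> {item: rating} into item -> {user: rating}."""
    items = {}
    for user, rated in ratings.items():
        for item, r in rated.items():
            items.setdefault(item, {})[user] = r
    return items


def cosine(a, b):
    """Rating-based similarity: cosine over the users who rated both items."""
    common = a.keys() & b.keys()
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)


def jaccard(a, b):
    """Structural similarity: overlap between the sets of users who rated each item."""
    union = a.keys() | b.keys()
    return len(a.keys() & b.keys()) / len(union) if union else 0.0


def hybrid(a, b):
    """Hypothetical combination (an assumption, not the paper's formula)."""
    return jaccard(a, b) * cosine(a, b)


def predict(ratings, user, target, k=2, sim=hybrid):
    """Predict the rating `user` would give `target` from the k most similar rated items."""
    items = item_vectors(ratings)
    # Step 1: score every item the user has rated by its similarity to the target,
    # then keep the k most similar ones as neighbors.
    scored = [
        (sim(items[target], items[i]), r)
        for i, r in ratings[user].items()
        if i != target
    ]
    neighbors = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    # Step 2: aggregate the user's ratings of the neighbors, weighted by similarity.
    total = sum(s for s, _ in neighbors)
    if total == 0:
        return None  # no usable neighbor, so no prediction
    return sum(s * r for s, r in neighbors) / total


if __name__ == "__main__":
    print(predict(RATINGS, "u1", "D", k=2, sim=cosine))
    print(predict(RATINGS, "u1", "D", k=2, sim=hybrid))
```

Passing the similarity as the `sim` parameter keeps the two prediction steps unchanged while the measure is swapped, which mirrors the abstract's point that the similarity measure is the component that decides which neighbors are used.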




Updated: 2020-07-06