Measuring Recommender System Effects with Simulated Users, arXiv - CS - Information Retrieval

arXiv - CS - Information Retrieval, Pub Date: 2021-01-12, DOI: arxiv-2101.04526
Sirui Yao, Yoni Halpern, Nithum Thain, Xuezhi Wang, Kang Lee, Flavien Prost, Ed H. Chi, Jilin Chen, Alex Beutel

Imagine a food recommender system: how would we check whether it is causing and fostering unhealthy eating habits or merely reflecting users' interests? How much of a user's experience over time with a recommender is caused by the recommender system's choices and biases, and how much is based on the user's preferences and biases? Popularity bias and filter bubbles are two of the most well-studied recommender system biases, but most of the prior research has focused on understanding the system behavior in a single recommendation step. How do these biases interplay with user behavior, and what types of user experiences are created from repeated interactions? In this work, we offer a simulation framework for measuring the impact of a recommender system under different types of user behavior. Using this simulation framework, we can (a) isolate the effect of the recommender system from the user preferences, and (b) examine how the system performs not just for an "average user" but also in the extreme experiences that arise under atypical user behavior. As part of the simulation framework, we propose a set of evaluation metrics over the simulations to understand the recommender system's behavior. Finally, we present two empirical case studies, one on traditional collaborative filtering in MovieLens and one on a large-scale production recommender system, to understand how popularity bias manifests over time.
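To make the idea of simulating repeated interactions concrete, here is a minimal toy sketch in Python. It is not the authors' framework: the catalog size, the two recommenders, the two user models, and the `head_share` metric are all hypothetical choices made for illustration. The sketch runs a feedback loop in which a recommender shows a slate, a simulated user clicks one item, and the click updates the counts the recommender sees. Comparing a popularity-weighted recommender against a random control, with the same user model, gives one rough way to separate what the recommender contributes from what the user contributes.

```python
import random

# All constants and names below are illustrative assumptions, not from the paper.
N_ITEMS, SLATE, STEPS = 50, 5, 500
HEAD = set(range(5))  # the 10% of items that start out most popular


def popularity_recommender(counts, rng):
    """Sample a slate without replacement, weighted by interaction counts.

    Uses Efraimidis-Spirakis keys: draw u ** (1 / weight) per item and
    keep the items with the largest keys.
    """
    keys = [(rng.random() ** (1.0 / c), i) for i, c in enumerate(counts)]
    return [i for _, i in sorted(keys, reverse=True)[:SLATE]]


def random_recommender(counts, rng):
    """Control arm: ignores the counts and shows a uniformly random slate."""
    return rng.sample(range(len(counts)), SLATE)


def indifferent_user(slate, rng):
    """User with no preferences: clicks a uniformly random slate item."""
    return rng.choice(slate)


def niche_user(slate, rng):
    """Atypical user: always clicks the least initially popular slate item."""
    return max(slate)


def head_share(recommender, user, seed=0):
    """Fraction of clicks that land on HEAD items over one simulated run."""
    rng = random.Random(seed)
    # Long-tailed starting counts: item 0 is the most popular.
    counts = [100.0 / (i + 1) for i in range(N_ITEMS)]
    head_clicks = 0
    for _ in range(STEPS):
        slate = recommender(counts, rng)
        item = user(slate, rng)
        counts[item] += 1.0  # the click feeds back into later slates
        head_clicks += item in HEAD
    return head_clicks / STEPS


if __name__ == "__main__":
    for rec in (popularity_recommender, random_recommender):
        for usr in (indifferent_user, niche_user):
            print(rec.__name__, usr.__name__, head_share(rec, usr))
```

Because `indifferent_user` has no preferences, any gap in `head_share` between the two recommenders for that user comes from the recommender alone. Running `niche_user` through the same loop shows how an atypical behavior model can lead to a very different experience under the same system.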

Updated: 2021-01-13
