A flexible framework for evaluating user and item fairness in recommender systems
User Modeling and User-Adapted Interaction (IF 3.0) Pub Date: 2021-01-27, DOI: 10.1007/s11257-020-09285-1
Yashar Deldjoo, Vito Walter Anelli, Hamed Zamani, Alejandro Bellogín, Tommaso Di Noia

One common characteristic of research on fairness evaluation in machine learning is that it calls for some form of parity (equality), either in treatment, meaning the information about users' memberships in protected classes is ignored during training, or in impact, by enforcing proportionally beneficial outcomes for users in different protected classes. In the recommender systems community, fairness has been studied with respect to both users' and items' memberships in protected classes defined by sensitive attributes (e.g., gender or race for users, revenue in a multi-stakeholder setting for items). Here too, the concept has commonly been interpreted as some form of equality, i.e., the degree to which the system meets the information needs of all its users equally. In this work, we propose a probabilistic framework based on generalized cross entropy (GCE) to measure the fairness of a given recommendation model. The framework comes with a suite of advantages: first, it allows the system designer to define and measure fairness for both users and items and can be applied to any classification task; second, it can incorporate various notions of fairness, since it does not rely on specific, predefined probability distributions and these can be chosen at design time; finally, its design includes a gain factor that can be flexibly defined to accommodate different accuracy-related metrics, so fairness can be measured with decision-support metrics (e.g., precision, recall) or rank-based measures (e.g., NDCG, MAP). An experimental evaluation on four real-world datasets shows the nuances captured by our proposed metric regarding fairness on different user and item attributes, with nearest-neighbor recommenders tending to obtain good results under equality constraints. We observed that when users are clustered based on both their interactions with the system and other sensitive attributes, such as age or gender, algorithms with similar performance values behave differently with respect to user fairness because of the different ways they process the data of each user cluster.
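To make the GCE-based evaluation concrete, below is a minimal sketch in Python. It assumes one standard formulation of generalized cross entropy, GCE(p_f, p_m) = [Σ_a p_f(a)^β · p_m(a)^(1−β) − 1] / (β(1−β)), where p_f is the target "fair" distribution over groups and p_m is the model's observed benefit distribution derived from per-group gains. The group gains, target distribution, and β value here are illustrative, not taken from the paper's experiments.

import numpy as np

def generalized_cross_entropy(p_fair, p_model, beta=2.0):
    # GCE between a target "fair" distribution and the model's observed
    # benefit distribution over groups. GCE is 0 exactly when p_model
    # equals p_fair; values farther from 0 indicate more unfairness
    # under the chosen target. Assumes all probabilities are positive.
    p_fair = np.asarray(p_fair, dtype=float)
    p_model = np.asarray(p_model, dtype=float)
    return (np.sum(p_fair**beta * p_model**(1.0 - beta)) - 1.0) / (beta * (1.0 - beta))

def group_benefit_distribution(group_gains):
    # Normalize per-group gains (e.g., mean NDCG per user group)
    # into a probability distribution over groups.
    gains = np.asarray(group_gains, dtype=float)
    return gains / gains.sum()

# Illustrative example: two user groups whose mean NDCG under the
# recommender is 0.30 and 0.18, evaluated against an equality-oriented
# target that assigns both groups the same share of benefit.
p_model = group_benefit_distribution([0.30, 0.18])
p_fair = np.array([0.5, 0.5])
print(generalized_cross_entropy(p_fair, p_model))   # approx. -0.033
print(generalized_cross_entropy(p_fair, p_fair))    # 0.0: perfectly fair

Swapping the per-group gains (e.g., per-group MAP or precision instead of NDCG) or the target distribution p_fair changes the notion of fairness being measured, which is the flexibility the abstract describes.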




Updated: 2021-01-28