Bounding System-Induced Biases in Recommender Systems with a Randomized Dataset
ACM Transactions on Information Systems (IF 5.4), Pub Date: 2023-04-08, DOI: https://dl.acm.org/doi/10.1145/3582002
Dugang Liu, Pengxiang Cheng, Zinan Lin, Xiaolian Zhang, Zhenhua Dong, Rui Zhang, Xiuqiang He, Weike Pan, Zhong Ming

Debiased recommendation with a randomized dataset has shown promising results in mitigating system-induced biases. However, compared with the better-studied approaches that do not use a randomized dataset, it still lacks theoretical insight and an ideal optimization objective function. To bridge this gap, we study the debiasing problem from a new perspective and propose to directly minimize an upper bound of an ideal objective function, which may lead to a better solution to system-induced biases. First, we formulate a new ideal optimization objective function with a randomized dataset. Second, depending on the prior constraints that the adopted loss function satisfies, we derive two upper bounds of this objective function: a generalization error bound based on the triangle inequality and a generalization error bound based on separability. Third, we show that most existing related methods can be regarded as insufficient optimizations of these two upper bounds. Fourth, we propose a novel method, debiasing approximate upper bound (DUB) with a randomized dataset, which optimizes these upper bounds more sufficiently. Finally, we conduct extensive experiments on a public dataset and a real product dataset to verify the effectiveness of DUB.




Updated: 2023-04-08