Scalable Bayesian preference learning for crowds, Machine Learning


Scalable Bayesian preference learning for crowds
Machine Learning (IF 4.3), Pub Date: 2020-02-06, DOI: 10.1007/s10994-019-05867-2
Edwin Simpson, Iryna Gurevych

We propose a scalable Bayesian preference learning method for jointly predicting the preferences of individuals as well as the consensus of a crowd from pairwise labels. People’s opinions often differ greatly, making it difficult to predict their preferences from small amounts of personal data. Individual biases also make it harder to infer the consensus of a crowd when there are few labels per item. We address these challenges by combining matrix factorisation with Gaussian processes, using a Bayesian approach to account for uncertainty arising from noisy and sparse data. Our method exploits input features, such as text embeddings and user metadata, to predict preferences for new items and users that are not in the training set. As previous solutions based on Gaussian processes do not scale to large numbers of users, items or pairwise labels, we propose a stochastic variational inference approach that limits computational and memory costs. Our experiments on a recommendation task show that our method is competitive with previous approaches despite our scalable inference approximation. We demonstrate the method’s scalability on a natural language processing task with thousands of users and items, and show improvements over the state of the art on this task. We make our software publicly available for future work (https://github.com/UKPLab/tacl2018-preference-convincing/tree/crowdGPPL).
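The idea of combining a crowd consensus with per-user matrix-factorisation terms to predict pairwise labels can be illustrated with a small sketch. This is not the authors' implementation and does not use their software's API. It assumes, for illustration only, that a user's utility for an item is a consensus value plus a dot product of latent item factors and user weights, and that pairwise labels follow a probit likelihood with unit-variance noise. All names and numbers below (`user_utility`, `pref_probability`, the toy dictionaries) are hypothetical. The Gaussian process priors over features and the stochastic variational inference described in the abstract are omitted; the latent values are simply fixed by hand.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def user_utility(consensus, item_factors, user_weights):
    """Assumed utility model: consensus value plus a low-rank personal term."""
    personal = sum(v * w for v, w in zip(item_factors, user_weights))
    return consensus + personal


def pref_probability(utility_a, utility_b):
    """Probit pairwise likelihood: P(a preferred to b), assuming each
    utility is observed with independent unit-variance Gaussian noise."""
    return normal_cdf((utility_a - utility_b) / math.sqrt(2.0))


# Toy latent values, fixed by hand (hypothetical; a real model would infer them).
CONSENSUS = {"a": 0.5, "b": 0.0}
ITEM_FACTORS = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
USER_WEIGHTS = {
    "u1": [0.0, 0.0],   # follows the crowd consensus exactly
    "u2": [-1.0, 1.0],  # personal taste that runs against the consensus
}


def predict(user, item_a, item_b):
    """Probability that `user` prefers `item_a` to `item_b`."""
    f_a = user_utility(CONSENSUS[item_a], ITEM_FACTORS[item_a], USER_WEIGHTS[user])
    f_b = user_utility(CONSENSUS[item_b], ITEM_FACTORS[item_b], USER_WEIGHTS[user])
    return pref_probability(f_a, f_b)


def predict_consensus(item_a, item_b):
    """Probability that the crowd consensus ranks `item_a` above `item_b`."""
    return pref_probability(CONSENSUS[item_a], CONSENSUS[item_b])


if __name__ == "__main__":
    for user in USER_WEIGHTS:
        print(user, round(predict(user, "a", "b"), 3))
    print("consensus", round(predict_consensus("a", "b"), 3))
```

In this toy setup the consensus favours item `a`, user `u1` agrees with it, and user `u2` prefers `b` because the personal term outweighs the consensus. That is the kind of individual divergence from the crowd that the abstract describes modelling jointly.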


Updated: 2020-02-06
