Exploiting predicted answer in label aggregation to make better use of the crowd wisdom, Information Sciences


Information Sciences, Pub Date: 2021-06-05, DOI: 10.1016/j.ins.2021.05.060
Jiacheng Liu, Feilong Tang, Long Chen, Yanmin Zhu

Crowdsourcing is a widespread and effective way to gather the wisdom of the crowd, and label aggregation is used to combine the noisy and biased data that the crowd generates. In real-world crowdsourcing tasks, most workers answer only a small fraction of the questions, which makes the collected answers sparse. However, existing label aggregation approaches often build on probabilistic modeling procedures that are sensitive to data sparsity. In this paper, we exploit predicted answers to improve the performance of label aggregation and propose PLA (Prediction-based Label Aggregation) to intelligently aggregate the crowd wisdom. With PLA, we first learn representations that capture the characteristics of the workers and questions. We then deploy a neural network model to predict the answers that different workers would give. After that, we add the most valuable predicted answers to the answer set. Finally, we use the augmented answer set to enhance representative label aggregation algorithms. To validate PLA, we compare it with 6 other existing methods on 8 real-world datasets. Our results show that PLA can enhance the performance of different aggregation algorithms in crowdsourcing tasks, achieving up to a 16% performance improvement.
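
The four steps in the abstract (represent, predict, augment, aggregate) can be illustrated with a small Python sketch. This is not the authors' PLA implementation: the abstract does not specify the representation learning or the neural network, so the sketch swaps in a much simpler stand-in predictor that copies the answer of the most similar worker, with similarity measured as the agreement rate on co-answered questions. Majority voting stands in for the "representative label aggregation algorithms". All function names, the `top_k` parameter, and the toy data are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def majority_vote(answers):
    """Aggregate {(worker, question): label} into {question: label} by plurality."""
    votes = defaultdict(Counter)
    for (_worker, question), label in answers.items():
        votes[question][label] += 1
    # Ties are broken by the smallest label so the result is deterministic.
    return {q: min(c, key=lambda lab: (-c[lab], lab)) for q, c in votes.items()}


def agreement(answers, w1, w2):
    """Fraction of co-answered questions on which two workers agree (0.0 if none)."""
    shared = [q for (w, q) in answers if w == w1 and (w2, q) in answers]
    if not shared:
        return 0.0
    return sum(answers[(w1, q)] == answers[(w2, q)] for q in shared) / len(shared)


def predict_missing(answers):
    """Stand-in predictor (NOT the paper's neural network).

    For every unanswered (worker, question) pair, copy the answer of the most
    similar worker who did answer that question. Returns a list of
    (confidence, worker, question, label) tuples.
    """
    workers = sorted({w for w, _ in answers})
    questions = sorted({q for _, q in answers})
    predictions = []
    for w in workers:
        for q in questions:
            if (w, q) in answers:
                continue
            candidates = [(agreement(answers, w, v), v)
                          for v in workers if (v, q) in answers]
            if not candidates:
                continue
            confidence, neighbour = max(candidates)
            predictions.append((confidence, w, q, answers[(neighbour, q)]))
    return predictions


def augment(answers, top_k):
    """Add the top_k most confident predicted answers to a copy of the answer set."""
    ranked = sorted(predict_missing(answers), key=lambda p: -p[0])
    augmented = dict(answers)
    for _confidence, w, q, label in ranked[:top_k]:
        augmented[(w, q)] = label
    return augmented


# Toy sparse answer set (hypothetical data): three workers, three binary questions.
answers = {
    ("w1", "q1"): 1, ("w2", "q1"): 1, ("w3", "q1"): 0,
    ("w1", "q2"): 0, ("w2", "q2"): 0,
    ("w3", "q3"): 1, ("w1", "q3"): 1,
}
augmented = augment(answers, top_k=1)
labels = majority_vote(augmented)
print(labels)
```

Here "most valuable" is approximated by the predictor's confidence, which is only one possible reading of the abstract; the paper may rank predicted answers by a different criterion. The original answers are never overwritten, so the augmented set can be passed to any aggregation routine that accepts the same `(worker, question) -> label` mapping.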


Updated: 2021-06-18
