Abstract
In this study, we consider a crowdsourcing classification problem in which labeling information from crowds is aggregated to infer latent true labels. We propose a fully Bayesian deep generative crowdsourcing model (BayesDGC), which combines the strength of deep neural networks (DNNs) in automatic representation learning with the interpretable probabilistic structure of probabilistic graphical models. The model comprises a DNN classifier that serves as a prior over the true labels and a probabilistic model of the annotation generation process; the two components share the latent true label variables. To address the inference challenge, we develop a natural-gradient stochastic variational inference scheme that combines variational message passing for the conjugate parameters with stochastic gradient descent for the DNN parameters, learning the distribution of the latent true labels and the workers' confusion matrices via end-to-end training. We demonstrate the effectiveness of the proposed model with empirical results on 22 real-world datasets.
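The probabilistic structure described above can be sketched in a minimal form: a classifier prior over the latent true label, per-worker confusion matrices governing annotation generation, and a Bayes update that aggregates the noisy annotations. This is an illustrative simplification only; the fixed `prior` stands in for the DNN output p(y|x), the confusion matrices and sizes are invented, and the exact posterior computed here replaces the paper's natural-gradient stochastic variational inference.

```python
import numpy as np

rng = np.random.default_rng(0)
K, W = 3, 4  # number of classes and workers (illustrative sizes)

# Stand-in for the DNN classifier prior p_theta(y | x) for one item.
prior = np.array([0.6, 0.3, 0.1])

# Per-worker confusion matrices pi_w: row y gives P(annotation = a | true label = y).
# Here all workers are fairly reliable; rows sum to 1.
confusions = np.stack([0.8 * np.eye(K) + 0.2 / K for _ in range(W)])

# Annotation generation process: sample a true label, then noisy worker labels.
y = rng.choice(K, p=prior)
annotations = [rng.choice(K, p=confusions[w][y]) for w in range(W)]

# Aggregation: exact posterior over the latent true label given the annotations,
# combining the classifier prior with the worker likelihoods (log-space for stability).
log_post = np.log(prior).copy()
for w, a in enumerate(annotations):
    log_post += np.log(confusions[w][:, a])  # P(a | y) as a function of y
post = np.exp(log_post - log_post.max())
post /= post.sum()
```

With known confusion matrices this posterior is exact; the modeling challenge addressed by BayesDGC is that the classifier parameters and confusion matrices are unknown and must be learned jointly from the annotations alone.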
Acknowledgements
This work was supported by the Fundamental Research Funds for the Central Universities (Grant No. NJ2019010), the National Natural Science Foundation of China (Grant No. 61906089), the Jiangsu Province Basic Research Program (Grant No. BK20190408), and the China Postdoctoral Science Foundation (the First Pre-station Special Grant).
Cite this article
Li, SY., Huang, SJ. & Chen, S. Crowdsourcing aggregation with deep Bayesian learning. Sci. China Inf. Sci. 64, 130104 (2021). https://doi.org/10.1007/s11432-020-3118-7