Improving crowd labeling using Stackelberg models

  • Original Article
  • Published in International Journal of Machine Learning and Cybernetics

Abstract

Crowdsourcing systems provide an easy means of acquiring labeled training data for supervised learning. However, the labels provided by non-expert crowd workers (labelers) are often of low quality. To address this problem, in practice each sample is usually given a multiple noisy label set by several different labelers, and ground truth inference algorithms are then used to derive an integrated label for each sample. The ground truth inference method therefore directly determines the label quality of the samples. In this paper, we propose a novel label integration method based on game theory. We assume that there is an adversary in the crowdsourcing system who intentionally provides incorrect integrated labels. We model the interaction between the data miner and the adversary as a Stackelberg game in which one player (the data miner) controls the predictive model, whereas the other (the adversary) tries to choose the integrated labels that would be most harmful to the current classifier. On this basis, we transform the label integration problem into a repeated Stackelberg model. We call our method Stackelberg label inference (SLI). SLI does not need to estimate the quality of labelers, and so avoids the chicken-and-egg problem that can lead to poor results. Moreover, because SLI relies only slightly on the multiple noisy label sets of the noisy data set, it is not very sensitive to the number of labelers. SLI shows better performance when the number of labelers is relatively small. In terms of both label quality and model quality, the experimental results show that SLI is superior to the other state-of-the-art ground truth inference methods it was compared with.
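The leader-follower structure described in the abstract can be illustrated with a toy example. The sketch below is not the SLI algorithm, whose details are not given in this preview. It only shows one round of a Stackelberg game under assumptions made here for illustration: the data miner (leader) picks a threshold classifier on a single feature, the adversary (follower) may choose for each sample any label that appears in that sample's multiple noisy label set, and the payoff is zero-sum (number of disagreements). All function names and data values are hypothetical.

```python
from typing import List, Sequence, Tuple


def predict(threshold: float, x: float) -> int:
    """Leader's model: a 1-D threshold classifier (illustrative assumption)."""
    return 1 if x >= threshold else 0


def follower_best_response(
    threshold: float, xs: Sequence[float], label_sets: Sequence[Sequence[int]]
) -> List[int]:
    """Adversary picks, per sample, a label from its noisy label set that
    disagrees with the current classifier whenever such a label exists."""
    chosen = []
    for x, labels in zip(xs, label_sets):
        pred = predict(threshold, x)
        harmful = [y for y in labels if y != pred]
        # If no label disagrees, every label in the set equals the prediction.
        chosen.append(harmful[0] if harmful else labels[0])
    return chosen


def leader_loss(threshold: float, xs: Sequence[float], ys: Sequence[int]) -> int:
    """Number of samples on which the classifier disagrees with the labels."""
    return sum(predict(threshold, x) != y for x, y in zip(xs, ys))


def solve_stackelberg(
    candidates: Sequence[float],
    xs: Sequence[float],
    label_sets: Sequence[Sequence[int]],
) -> Tuple[float, int, List[int]]:
    """Leader anticipates the follower's best response to each candidate
    model and commits to the candidate with the smallest resulting loss."""
    if not candidates:
        raise ValueError("at least one candidate threshold is required")
    best = None
    for t in candidates:
        response = follower_best_response(t, xs, label_sets)
        loss = leader_loss(t, xs, response)
        if best is None or loss < best[1]:
            best = (t, loss, response)
    return best


if __name__ == "__main__":
    # Hypothetical data: six samples, three labelers each.
    xs = [0.1, 0.2, 0.35, 0.6, 0.8, 0.9]
    label_sets = [[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 1, 0], [1, 1, 1], [1, 1, 1]]
    threshold, loss, adversarial_labels = solve_stackelberg([0.0, 0.5, 1.0], xs, label_sets)
    print(threshold, loss, adversarial_labels)
```

In this sketch the adversary can only hurt the classifier on samples whose labelers disagree, so the leader's loss is driven by the ambiguous samples rather than by any estimate of labeler quality. How the paper's repeated Stackelberg model turns such rounds into integrated labels is not specified in the abstract.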

Acknowledgements

The work was partially supported by the National Natural Science Foundation of China (U1711267), the Fundamental Research Funds for the Central Universities (CUG2018JM18), and the Open Research Project of Hubei Key Laboratory of Intelligent Geo-Information Processing (KLIGIP201601).

Author information

Corresponding author

Correspondence to Chaoqun Li.

Ethics declarations

Conflict of interest

I confirm that there is no conflict of interest in this submission, and that the manuscript has been approved by all authors for publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yang, W., Li, C. Improving crowd labeling using Stackelberg models. Int. J. Mach. Learn. & Cyber. 12, 1825–1838 (2021). https://doi.org/10.1007/s13042-021-01276-x

  • DOI: https://doi.org/10.1007/s13042-021-01276-x
