Network structure and requirements crowdsourcing for OSS projects

  • Original Article
Requirements Engineering

Abstract

Crowdsourcing system requirements enables project managers to elicit feedback from a broader range of stakeholders. The advantages of crowdsourcing include a higher volume of requirements reflecting a more comprehensive array of use cases and a more engaged and committed user base. Researchers cite the inability of project teams to effectively manage an increasing volume of system requirements as a possible drawback. This paper analyzes a data set consisting of project management artifacts from 562 open-source software (OSS) projects to determine how OSS project performance varies as the share of crowdsourced requirements increases using six measures of effectiveness: requirement close-out time, requirement response time, average comment activity, the average number of requirements per crowd member, the average retention time for crowd members, and the total volume of requirements. Additionally, the models measure how the impact of increasing the share of crowdsourced requirements changes with stakeholder network structure. The analysis shows that stakeholder network structure impacts OSS performance outcomes and that the effect changes with the share of crowdsourced requirements. OSS projects with more concentrated stakeholder networks perform the best. The results indicate that requirements crowdsourcing faces diminishing marginal returns. OSS projects that crowdsource more than 70% of their requirements benefit more from implementing processes to organize and prioritize existing requirements than from incentivizing the crowd to generate additional requirements. Analysis in the paper also suggests that OSS projects could benefit from employing CrowdRE techniques and assigning dedicated community managers to more effectively channel input from the crowd.

Data availability

Data are available at https://github.com/MthwRobinson/sh-network-paper/tree/master/code.

References

  1. Anscombe FJ, Tukey JW (1963) The examination and analysis of residuals. Technometrics 5(2):141–160

  2. Brabham D (2013) Crowdsourcing. MIT Press, Cambridge

  3. Brabham D (2008) Crowdsourcing as a model for problem solving: an introduction and cases. Convergence 14(1):75–90

  4. Camara G, Fonseca F (2007) Information policies and open source software in developing countries. J Am Soc Inform Sci Technol 58(1):121–132

  5. Chen C. Awesome JavaScript GitHub repository. https://github.com/sorrycc/awesome-javascript. Accessed 15 June 2019

  6. Chen V (2019) Awesome Python GitHub repository. https://github.com/vinta/awesome-python. Accessed 15 June 2019

  7. Crowston K, Howison J (2016) FLOSS project effectiveness measures. In: Successful OSS project design and implementation: requirements, tools, social designs and reward structures, pp 149

  8. Damian D, Marczak S, Kwan I (2007) Collaboration patterns and the impact of distance on awareness in requirements-centred social networks. In: 15th IEEE international requirements engineering conference (RE 2007), pp 59–68

  9. Derksen S, Keselman HJ (1992) Backward, forward and stepwise automated subset selection algorithms: frequency of obtaining authentic and noise variables. Br J Math Stat Psychol 45(2):265–282

  10. Fahrmeir L, Kneib T, Lang S, Marx B (2007) Regression. Springer, Berlin, pp 300–301

  11. Fallahi F (2019) Awesome C++ GitHub repository. https://github.com/fffaraz/awesome-cpp. Accessed 15 June 2019

  12. Finkelstein A (1994) Requirements Engineering: a review and research agenda. In: Proceedings of 1st Asia-Pacific software engineering conference, pp 10–19

  13. Ferrari S, Cribari-Neto F (2004) Beta regression for modelling rates and proportions. J Appl Stat 31(7):799–815

  14. Friedman J, Hastie T, Tibshirani R (2001) The elements of statistical learning, vol 1. Springer, New York, pp 60–61

  15. Ghapanchi A, Aurum A, Low G (2011) A taxonomy for measuring the success of open source software projects. First Monday

  16. Gerth RJ, Burnap A, Papalambros P (2012) Crowdsourcing: a primer and its implications for systems engineering. Michigan University, Ann Arbor

  17. Gini C (1912) Variabilità e mutabilità. Reprinted in Memorie di metodologica statistica. In: Pizetti E, Salvemini T (eds.) Libreria Eredi Virgilio Veschi, Rome

  18. Gini C (1921) Measurement of inequality of incomes. Econ J 31(121):124–126

  19. Glinz M (2019) CrowdRE: achievements, opportunities and pitfalls. In: Proceedings of REW 2019

  20. Groen EC, Seyff N, Ali R, Dalpiaz F, Doerr J, Guzman E, Stade M (2017) The crowd in requirements engineering: the landscape and challenges. IEEE Softw 34(2):44–52

  21. Holland PW, Leinhardt S (1971) Transitivity in structural models of small groups. Comp Group Stud 2(2):107–124

  22. Howe J (2008) Crowdsourcing: How the power of the crowd is driving the future of business. Random House

  23. Howe J (2006) The rise of crowdsourcing. Wired Mag 14(6):1–4

  24. Hosseini M, Phalp KT, Taylor J, Ali R (2014) Towards crowdsourcing for requirements engineering

  25. INCOSE (2015) Systems engineering handbook: a guide for system life cycle processes and activities, 4th Edn. Wiley

  26. Iyer DG, Lyytinen K (2019) Requirements engineering (RE) effectiveness in open source software: the role of social network configurations and requirements properties. In: Proceedings of the 27th European conference on information systems (ECIS), Stockholm & Uppsala, Sweden

  27. Kuriakose J, Parsons J (2015) How do open source software (OSS) developers practice and perceive requirements engineering? an empirical study. In: 2015 IEEE Fifth international workshop on empirical requirements engineering (EmpiRE). IEEE, pp 49–56

  28. Kull A (2019) Awesome Java GitHub repository. https://github.com/akullpp/awesome-java Accessed 15 June 2019

  29. Khan JA, Liu L, Wen L, Ali R (2019) Crowd intelligence in requirements engineering: current status and future directions. Proc REFSQ 2019 LNCS 11412:245–261

  30. Kluender J, Schneider K, Kortum F, Straube J, Handke L, Kauffeld S (2016) Communication in teams-an expression of social conflicts. In: Human-centered and error-resilient systems development, pp 111–129

  31. LaToza TD, Van Der Hoek A (2015) Crowdsourcing in software engineering: models, motivations, and challenges. IEEE Softw 33(1):74–80

  32. Levy M, Hadar I, Te’eni D (2015) A gradual approach to crowd-based requirements engineering: the case of conference online social networks. In: 2015 IEEE 1st international workshop on crowd-based requirements engineering (CrowdRE), pp 25–30, IEEE

  33. Lim SL, Finkelstein A (2011) StakeRare: using social networks and collaborative filtering for large-scale requirements elicitation. IEEE Trans Software Eng 38(3):707–735

  34. Lim SL, Quercia D, Finkelstein A (2010) StakeNet: using social networks to analyse the stakeholders of large-scale software projects. In: 2010 ACM/IEEE 32nd international conference on software engineering, vol 1, pp 295–304

  35. Lim SL, Quercia D, Finkelstein A (2010) StakeSource: harnessing the power of crowdsourcing and social networks in stakeholder analysis. In: 2010 ACM/IEEE 32nd international conference on software engineering, vol 2

  36. Lim SL, Ncube C (2013) Social networks and crowdsourcing for stakeholder analysis in system of systems projects. In: 2013 8th international conference on system of systems engineering, pp 13–18

  37. Linåker J, Regnell B, Damian D (2020) A method for analyzing stakeholders’ influence on an open source software ecosystem’s requirements engineering process. Requir Eng 25(1):115–130

  38. Lopez-Fernandez L, Robles G, Gonzalez-Barahona J (2004) Applying social network analysis to the information in CVS repositories. In: MSR, vol 2004

  39. Ma Q (2009) The effectiveness of requirements prioritization techniques for a medium to large number of requirements: a systematic literature review (Doctoral dissertation, Auckland University of Technology)

  40. Mao K, Capra L, Harman M, Jia Y (2017) A survey of the use of crowdsourcing in software engineering. J Syst Softw 126:57–84

  41. Massey FJ Jr (1951) The Kolmogorov-Smirnov test for goodness of fit. J Am Stat Assoc 46(253):68–78

  42. Missonier S, Loufrani-Fedida S (2014) Stakeholder analysis and engagement in projects: from stakeholder relational perspective to stakeholder relational ontology. Int J Project Manage 32(7):1108–1122

  43. Mitchell RK, Agle BR, Wood DJ (1997) Toward a theory of stakeholder identification and salience: defining the principle of who and what really counts. Acad Manag Rev 22(4):853–886

  44. Mobasher B, Cleland-Huang J (2011) Recommender systems in requirements engineering. AI Magazine 32(3):81–89

  45. Newcombe R (2003) From client to project stakeholders: a stakeholder mapping approach. Constr Manag Econ 21(8):841–848

  46. Paech B, Reuschenbach B (2006) Open source requirements engineering. In: 14th IEEE international requirements engineering conference (RE’06) IEEE, pp 257–262

  47. Pagano D, Maalej W (2013) User feedback in the appstore: an empirical study. In: Proceedings of RE, pp 125–134

  48. Parnell GS (2016) Trade-off analytics: creating and exploring the system tradespace. Wiley, London

  49. Regnell B, Brinkkemper S (2005) Market-driven requirements engineering for software products. Engineering and managing software requirements. Springer, Berlin, pp 287–308

  50. Robinson W, Vlas R (2015) Requirements evolution and project success: an analysis of SourceForge projects

  51. Setia P, Rajagopalan B, Sambamurthy V, Calantone R (2012) How peripheral developers contribute to open-source software development. Inf Syst Res 23(1):144–163

  52. Scacchi W (2009) Understanding requirements for open source software. Design requirements engineering: a ten-year perspective. Springer, Berlin, pp 467–494

  53. Sen A, Foster JE (1997) On economic inequality. Oxford University Press, Oxford

  54. Sharp H, Finkelstein A, Galal G (1999) Stakeholder identification in the requirements engineering process. In: Proceedings of tenth international workshop on database and expert systems applications. DEXA, vol 99, pp 387–391

  55. Snijders R, Dalpiaz F, Hosseini M, Shahri A, Ali R (2014) Crowd-centric requirements engineering. In: 2014 IEEE/ACM 7th international conference on utility and cloud computing, pp 614–615

  56. Snijders R, Atilla O, Dalpiaz F, Brinkkemper S (2015) Crowd-centric requirements engineering: a method based on crowdsourcing and gamification. Technical Report Series, (UU-CS-2015-004)

  57. Stewart KJ, Gosain S (2006) The impact of ideology on effectiveness in open source software development teams. MIS Quarterly, pp 291–314

  58. Stol KJ, LaToza TD, Bird C (2017) Crowdsourcing for software engineering. IEEE Softw 34(2):30–36

  59. Toral SL, Martínez-Torres MDR, Barrero F (2010) Analysis of virtual communities supporting OSS projects using social network analysis. Inf Softw Technol 52(3):296–303

  60. Ullman JB, Bentler PM (2003) Structural equation modeling. In: Handbook of psychology, pp 607–634

  61. Veall MR, Zimmermann KF (1996) Pseudo-R2 measures for some common limited dependent variable models. J Econ Surv 10(3):241–259

  62. Wackerly D, Mendenhall W, Scheaffer RL (2014) Mathematical statistics with applications. Cengage Learning, pp 477–485

  63. Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393(6684):440

  64. Wang C, Daneva M, van Sinderen M, Liang P (2019) A systematic mapping study on crowdsourced requirements engineering using user feedback. J Softw Evol Proc 31(10)

  65. Wood J, Sarkani S, Mazzuchi T, Eveleigh T (2013) A framework for capturing the hidden stakeholder system. Syst Eng 16(3):251–266

  66. Wooldridge JM (2015) Introductory econometrics: a modern approach. Nelson Education, Scarborough, ON, pp 60–62

  67. Yitzhaki S (1979) Relative deprivation and the Gini coefficient. Q J Econ, pp 321–324

  68. York J (2019) Awesome PHP GitHub Repository. https://github.com/ziadoz/awesome-php Accessed 15 June 2019

Author information

Contributions

The material in this paper has been abstracted and developed from a dissertation submitted to the George Washington University in partial fulfillment of the requirements for Matthew Robinson’s Ph.D. degree.

Corresponding author

Correspondence to Matthew Robinson.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Code availability

Code is available at https://github.com/MthwRobinson/sh-network-paper/tree/master/code.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Edge direction in stakeholder networks

Using undirected, unweighted graphs follows the pattern established in most of the studies cited in Sect. 2.4. Weighted edges offer one advantage: they allow the model to capture the intensity of the relationship between stakeholders. Toral, Martinez-Torres, and Barrero [59], for instance, use weighted edges to study the influence of knowledge brokers within networks and argue that more frequent interactions indicate a more robust connection between stakeholders. The network models in this research do not include weights because, for some metrics, their inclusion produces counter-intuitive results. In path-based metrics, for instance, a higher edge weight typically represents a longer distance between nodes, whereas more frequent interactions should reduce the distance between stakeholders. Edge weights do improve fidelity for centrality measures such as node degree. However, weighting would also reduce the variability of concentration measurements within the data set, making statistical inference more difficult. Given the methodological difficulty of incorporating edge weights and the limited potential benefit, the research team decided to use unweighted edges.
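The counter-intuitive behavior of weighted distances can be illustrated with a small sketch. The example below is not taken from the paper's code: the three stakeholders, their interaction counts, and the helper names `build_adjacency` and `shortest_distances` are hypothetical, and it assumes that raw interaction counts are used directly as edge weights in a standard Dijkstra shortest-path computation.

```python
import heapq


def build_adjacency(interactions):
    """Build an undirected adjacency map from {(u, v): weight} pairs."""
    adjacency = {}
    for (u, v), weight in interactions.items():
        adjacency.setdefault(u, {})[v] = weight
        adjacency.setdefault(v, {})[u] = weight
    return adjacency


def shortest_distances(adjacency, source):
    """Dijkstra's algorithm; edge weights are interpreted as distances."""
    distances = {source: 0}
    queue = [(0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > distances.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in adjacency[node].items():
            candidate = dist + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return distances


# Hypothetical interaction counts: alice and bob interact often,
# bob and carol interact only once.
interactions = {("alice", "bob"): 10, ("bob", "carol"): 1}

weighted = build_adjacency(interactions)
unweighted = build_adjacency({pair: 1 for pair in interactions})

weighted_dist = shortest_distances(weighted, "alice")
unweighted_dist = shortest_distances(unweighted, "alice")
```

With counts used as weights, the frequently interacting pair (alice and bob) ends up ten units apart, while the pair that interacted once (bob and carol) is only one unit apart. The unweighted version places both pairs at the same distance, which avoids this inversion at the cost of ignoring interaction intensity.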

Toral, Martinez-Torres, and Barrero [59] use directed edges to represent information flow within their networks. Specifically, they use discussion threads as the primary unit of analysis and construct networks that represent a directed chain of replies. Within the context of their work, directed edges add value due to their ability to model information flows. In the current study, however, networks represent spontaneous two-way collaborations rather than a series of replies, meaning undirected edges have a more intuitive interpretation. As a result, the research team decided to construct the stakeholder networks using undirected edges.

1.2 Additional notes on the Gini coefficient

Using the Gini coefficient to measure concentration in stakeholder networks has some notable properties. In particular, the Gini coefficient [17, 18] for the degree distribution of nodes in a network cannot reach one because every edge connects a pair of nodes, so a single node cannot accumulate degree without also adding degree to other nodes. Consequently, the maximum value of the Gini coefficient for a network grows with the number of nodes but never reaches one. Similarly, the effective lower bound of the Gini coefficient for networks remains above zero. Although this property makes the Gini coefficient an imperfect measure of network concentration, as a practical matter it only affects relatively small networks.
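As an illustrative sketch rather than a result from the paper, consider a hypothetical network with n nodes and a single edge, so that two nodes have degree one and the remaining n − 2 nodes have degree zero. Assuming that isolated nodes count toward the degree distribution, and computing the Gini coefficient as the sum of absolute degree differences over all ordered pairs of nodes divided by 2n² times the mean degree, the calculation runs as follows:

$$\begin{aligned} {\bar{x}} = \frac{2}{n}, \qquad \sum _{i=1}^{n} \sum _{j=1}^{n} | x_{i} - x_{j} | = 4(n-2), \qquad G = \frac{4(n-2)}{2n^2 \cdot \frac{2}{n}} = 1 - \frac{2}{n} \end{aligned}$$

For n = 4 this gives G = 0.5, and for n = 100 it gives G = 0.98. In this example the coefficient rises toward one as the network grows but stays strictly below it, because the two endpoints of the edge always share the total degree equally.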

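The original formulation defines the Gini coefficient using the Lorenz curve [17]. However, Sen provides an equivalent formulation, which calculates the Gini coefficient as half the relative mean absolute difference, that is, the mean absolute difference between all pairs of values divided by their mean [53]. This formulation appears in Eq. 3, where x_i denotes the degree of node i, n the number of nodes, and x̄ the mean degree. Due to its simplicity, the research team decided to compute the Gini coefficient using the Sen formulation.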

$$\begin{aligned} G = \frac{\sum _{i=1}^{n} \sum _{j=1}^{n} | x_{i} - x_{j} |}{2n^2 {\bar{x}}} \end{aligned}$$
(3)
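The Sen formulation translates into a few lines of code. The following is a minimal sketch in Python, not the authors' implementation: the function names `degree_sequence` and `gini` and the two five-node example networks are hypothetical. It assumes an undirected, unweighted network given as an edge list and sums the absolute differences over all ordered pairs of nodes.

```python
from itertools import product


def degree_sequence(nodes, edges):
    """Count the degree of every node in an undirected, unweighted network."""
    degree = {node: 0 for node in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return list(degree.values())


def gini(values):
    """Gini coefficient as half the relative mean absolute difference."""
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("undefined for empty or all-zero input")
    mean = total / n
    # Sum |x_i - x_j| over all ordered pairs (i, j), including i == j.
    abs_diff_sum = sum(abs(a - b) for a, b in product(values, repeat=2))
    return abs_diff_sum / (2 * n**2 * mean)


# Hypothetical five-node networks: a star (one hub) and a chain.
nodes = ["a", "b", "c", "d", "e"]
star_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("a", "e")]
chain_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]

star_gini = gini(degree_sequence(nodes, star_edges))
chain_gini = gini(degree_sequence(nodes, chain_edges))
```

Under these assumptions the star network, where one hub holds half of all degree, yields a coefficient of 0.3, while the chain yields 0.15. The double sum is quadratic in the number of nodes, which is acceptable for a sketch but may be slow for very large networks.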

Cite this article

Robinson, M., Sarkani, S. & Mazzuchi, T. Network structure and requirements crowdsourcing for OSS projects. Requirements Eng 26, 509–534 (2021). https://doi.org/10.1007/s00766-021-00353-5
