Abstract
Crowdsourcing system requirements enables project managers to elicit feedback from a broader range of stakeholders. The advantages of crowdsourcing include a higher volume of requirements reflecting a more comprehensive array of use cases and a more engaged and committed user base. Researchers cite the inability of project teams to effectively manage an increasing volume of system requirements as a possible drawback. This paper analyzes a data set consisting of project management artifacts from 562 open-source software (OSS) projects to determine how OSS project performance varies as the share of crowdsourced requirements increases using six measures of effectiveness: requirement close-out time, requirement response time, average comment activity, the average number of requirements per crowd member, the average retention time for crowd members, and the total volume of requirements. Additionally, the models measure how the impact of increasing the share of crowdsourced requirements changes with stakeholder network structure. The analysis shows that stakeholder network structure impacts OSS performance outcomes and that the effect changes with the share of crowdsourced requirements. OSS projects with more concentrated stakeholder networks perform the best. The results indicate that requirements crowdsourcing faces diminishing marginal returns. OSS projects that crowdsource more than 70% of their requirements benefit more from implementing processes to organize and prioritize existing requirements than from incentivizing the crowd to generate additional requirements. Analysis in the paper also suggests that OSS projects could benefit from employing CrowdRE techniques and assigning dedicated community managers to more effectively channel input from the crowd.
Data availability
Data are available at https://github.com/MthwRobinson/sh-network-paper/tree/master/code.
References
Anscombe FJ, Tukey JW (1963) The examination and analysis of residuals. Technometrics 5(2):141–160
Brabham D (2013) Crowdsourcing. MIT Press, Cambridge
Brabham D (2008) Crowdsourcing as a model for problem solving: an introduction and cases. Science 14(1):75–90
Camara G, Fonseca F (2007) Information policies and open source software in developing countries. J Am Soc Inform Sci Technol 58(1):121–132
Chen C. Awesome JavaScript GitHub repository. https://github.com/sorrycc/awesome-javascript. Accessed 15 June 2019
Chen V (2019) Awesome Python GitHub repository. https://github.com/vinta/awesome-python. Accessed 15 June 2019
Crowston K, Howison J (2016) FLOSS project effectiveness measures. In: Successful OSS project design and implementation: requirements, tools, social designs and reward structures, pp 149
Damian D, Marczak S, Kwan I (2007) Collaboration patterns and the impact of distance on awareness in requirements-centred social networks. In: 15th IEEE international requirements engineering conference (RE 2007), pp 59–68
Derksen S, Keselman HJ (1992) Backward, forward and stepwise automated subset selection algorithms: frequency of obtaining authentic and noise variables. Br J Math Stat Psychol 45(2):265–282
Fahrmeir L, Kneib T, Lang S, Marx B (2007) Regression. Springer, Berlin, pp 300–301
Fallahi F (2019) Awesome C++ GitHub repository. https://github.com/fffaraz/awesome-cpp. Accessed 15 June 2019
Finkelstein A (1994) Requirements Engineering: a review and research agenda. In: Proceedings of 1st Asia-Pacific software engineering conference, pp 10–19
Ferrari S, Cribari-Neto F (2004) Beta regression for modelling rates and proportions. J Appl Stat 31(7):799–815
Friedman J, Hastie T, Tibshirani R (2001) The elements of statistical learning, vol 1. Springer, New York, pp 60–61
Ghapanchi A, Aurum A, Low G (2011) A taxonomy for measuring the success of open source software projects. First Monday
Gerth RJ, Burnap A, Papalambros P (2012) Crowdsourcing: a primer and its implications for systems engineering. Michigan University, Ann Arbor
Gini C (1912) Variabilità e mutabilità. Reprinted in Memorie di metodologica statistica. In: Pizetti E, Salvemini T (eds.) Libreria Eredi Virgilio Veschi, Rome
Gini C (1921) Measurement of inequality of incomes. Econ J 31(121):124–126
Glinz C (2019) CrowdRE: achievements, opportunities and pitfalls. In: Proceedings of REW 2019
Groen EC, Seyff N, Ali R, Dalpiaz F, Doerr J, Guzman E, Stade M (2017) The crowd in requirements engineering: the landscape and challenges. IEEE Softw 34(2):44–52
Holland PW, Leinhardt S (1971) Transitivity in structural models of small groups. Comp Group Stud 2(2):107–124
Howe J (2008) Crowdsourcing: How the power of the crowd is driving the future of business. Random House
Howe J (2006) The rise of crowdsourcing. Wired Mag 14(6):1–4
Hosseini M, Phalp KT, Taylor J, Ali R (2014) Towards crowdsourcing for requirements engineering
INCOSE (2015) Systems engineering handbook: a guide for system life cycle processes and activities, 4th Edn. Wiley
Iyer DG, Lyytinen K (2019) Requirements engineering (RE) effectiveness in open source software: the role of social network configurations and requirements properties. In: Proceedings of the 27th European conference on information systems (ECIS), Stockholm & Uppsala, Sweden
Kuriakose J, Parsons J (2015) How do open source software (OSS) developers practice and perceive requirements engineering? an empirical study. In: 2015 IEEE Fifth international workshop on empirical requirements engineering (EmpiRE). IEEE, pp 49–56
Kull A (2019) Awesome Java GitHub repository. https://github.com/akullpp/awesome-java. Accessed 15 June 2019
Khan JA, Liu L, Wen L, Ali R (2019) Crowd intelligence in requirements engineering: current status and future directions. Proc REFSQ 2019 LNCS 11412:45–261
Kluender J, Schneider K, Kortum F, Straube J, Handke L, Kauffeld S (2016) Communication in teams-an expression of social conflicts. In: Human-centered and error-resilient systems development, pp 111–129
LaToza TD, Van Der Hoek A (2015) Crowdsourcing in software engineering: models, motivations, and challenges. IEEE Softw 33(1):74–80
Levy M, Hadar I, Te’eni D (2015) A gradual approach to crowd-based requirements engineering: the case of conference online social networks. In: 2015 IEEE 1st international workshop on crowd-based requirements engineering (CrowdRE). IEEE, pp 25–30
Lim SL, Finkelstein A (2011) StakeRare: using social networks and collaborative filtering for large-scale requirements elicitation. IEEE Trans Software Eng 38(3):707–735
Lim SL, Quercia D, Finkelstein A (2010) StakeNet: using social networks to analyse the stakeholders of large-scale software projects. In: 2010 ACM/IEEE 32nd international conference on software engineering, vol 1, pp 295–304
Lim SL, Quercia D, Finkelstein A (2010) StakeSource: harnessing the power of crowdsourcing and social networks in stakeholder analysis. In: 2010 ACM/IEEE 32nd international conference on software engineering, vol 2
Lim SL, Ncube C (2013) Social networks and crowdsourcing for stakeholder analysis in system of systems projects. In: 2013 8th international conference on system of systems engineering, pp 13–18
Linåker J, Regnell B, Damian D (2020) A method for analyzing stakeholders’ influence on an open source software ecosystem’s requirements engineering process. Requir Eng 25(1):115–130
Lopez-Fernandez L, Robles G, Gonzalez-Barahona J (2004) Applying social network analysis to the information in CVS repositories. In: MSR, vol 2004
Ma Q (2009) The effectiveness of requirements prioritization techniques for a medium to large number of requirements: a systematic literature review (Doctoral dissertation, Auckland University of Technology)
Mao K, Capra L, Harman M, Jia Y (2017) A survey of the use of crowdsourcing in software engineering. J Syst Softw 126:57–84
Massey FJ Jr (1951) The Kolmogorov-Smirnov test for goodness of fit. J Am Stat Assoc 46(253):68–78
Missonier S, Loufrani-Fedida S (2014) Stakeholder analysis and engagement in projects: from stakeholder relational perspective to stakeholder relational ontology. Int J Project Manage 32(7):1108–1122
Mitchell RK, Agle BR, Wood DJ (1997) Toward a theory of stakeholder identification and salience: defining the principle of who and what really counts. Acad Manag Rev 22(4):853–886
Mobasher B, Cleland-Huang J (2011) Recommender systems in requirements engineering. AI Magazine 32(3):81–89
Newcombe R (2003) From client to project stakeholders: a stakeholder mapping approach. Constr Manag Econ 21(8):841–848
Paech B, Reuschenbach B (2006) Open source requirements engineering. In: 14th IEEE international requirements engineering conference (RE’06) IEEE, pp 257–262
Pagano D, Maalej W (2013) User feedback in the appstore: an empirical study. In: Proceedings of RE, pp 125–134
Parnell GS (2016) Trade-off analytics: creating and exploring the system tradespace. Wiley, London
Regnell B, Brinkkemper S (2005) Market-driven requirements engineering for software products. Engineering and managing software requirements. Springer, Berlin, pp 287–308
Robinson W, Vlas R (2015) Requirements evolution and project success: an analysis of SourceForge projects
Setia P, Rajagopalan B, Sambamurthy V, Calantone R (2012) How peripheral developers contribute to open-source software development. Inf Syst Res 23(1):144–163
Scacchi W (2009) Understanding requirements for open source software. Design requirements engineering: a ten-year perspective. Springer, Berlin, pp 467–494
Sen A, Foster JE (1997) On economic inequality. Oxford University Press, Oxford
Sharp H, Finkelstein A, Galal G (1999) Stakeholder identification in the requirements engineering process. In: Proceedings of tenth international workshop on database and expert systems applications. DEXA, vol 99, pp 387–391
Snijders R, Dalpiaz F, Hosseini M, Shahri A, Ali R (2014) Crowd-centric requirements engineering. In: 2014 IEEE/ACM 7th international conference on utility and cloud computing, pp 614–615
Snijders R, Atilla O, Dalpiaz F, Brinkkemper S (2015) Crowd-centric requirements engineering: a method based on crowdsourcing and gamification. Technical Report Series, (UU-CS-2015-004)
Stewart KJ, Gosain S (2006) The impact of ideology on effectiveness in open source software development teams. MIS Quarterly 291–314
Stol KJ, LaToza TD, Bird C (2017) Crowdsourcing for software engineering. IEEE Softw 34(2):30–36
Toral SL, Martínez-Torres MDR, Barrero F (2010) Analysis of virtual communities supporting OSS projects using social network analysis. Inf Softw Technol 52(3):296–303
Ullman JB, Bentler PM (2003) Structural equation modeling. In: Handbook of psychology, pp 607–634
Veall MR, Zimmermann KF (1996) Pseudo-R2 measures for some common limited dependent variable models. J Econ Surv 10(3):241–259
Wackerly D, Mendenhall W, Scheaffer RL (2014) Mathematical statistics with applications. Cengage Learning, pp 477–485
Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393(6684):440
Wang C, Daneva M, van Sinderen M, Liang P (2019) A systematic mapping study on crowdsourced requirements engineering using user feedback. J Softw Evol Proc 31(10)
Wood J, Sarkani S, Mazzuchi T, Eveleigh T (2013) A framework for capturing the hidden stakeholder system. Syst Eng 16(3):251–266
Wooldridge JM (2015) Introductory econometrics: a modern approach. Nelson Education, Scarborough, ON, pp 60–62
Yitzhaki S (1979) Relative deprivation and the Gini coefficient. Q J Econ 321–324
York J (2019) Awesome PHP GitHub repository. https://github.com/ziadoz/awesome-php. Accessed 15 June 2019
Author information
Contributions
The material in this paper has been abstracted and developed from a dissertation submitted to the George Washington University in partial fulfillment of the requirements for Matthew Robinson’s Ph.D. degree.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Code availability
Code is available at https://github.com/MthwRobinson/sh-network-paper/tree/master/code.
Appendix
1.1 Edge direction in stakeholder networks
Using undirected, unweighted graphs follows a pattern established in most of the studies cited in Sect. 2.4. Weighted edges offer one advantage: they allow the model to capture the intensity of the relationship between stakeholders. Toral, Martinez-Torres, and Barrero [59], for instance, use weighted edges to study the influence of knowledge brokers within networks and argue that more frequent interactions indicate a more robust connection between stakeholders. The network models in this research do not include weights because, for some metrics, their inclusion produces counter-intuitive results. Distance-based metrics, for instance, typically treat a higher edge weight as a longer distance between nodes, whereas more frequent interactions should reduce the distance between stakeholders. Edge weights do improve fidelity for centrality measures, such as node degree. However, weighting would also reduce the variability of concentration measurements within the data set, making statistical inference more difficult. Given the methodological difficulty of incorporating edge weights and the limited potential benefit, the research team decided to use unweighted edges.
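The counter-intuitive behaviour of raw interaction counts in distance-based metrics can be illustrated with a minimal sketch. The stakeholder names, the interaction counts, and the `shortest_distance` helper are hypothetical and are not taken from the paper's code; the helper is a plain implementation of Dijkstra's algorithm that treats each edge weight as a cost.

```python
import heapq


def shortest_distance(edges, source, target):
    """Dijkstra's algorithm on an undirected graph; weights are treated as costs."""
    adjacency = {}
    for u, v, weight in edges:
        adjacency.setdefault(u, []).append((v, weight))
        adjacency.setdefault(v, []).append((u, weight))

    best = {source: 0}
    queue = [(0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in adjacency.get(node, []):
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return float("inf")


# Hypothetical interaction counts: alice and bob collaborate ten times,
# while each of them collaborates with carol only once.
interaction_counts = [
    ("alice", "bob", 10),
    ("alice", "carol", 1),
    ("carol", "bob", 1),
]

# Raw counts as weights: the shortest path avoids the strongest tie.
weighted_distance = shortest_distance(interaction_counts, "alice", "bob")

# Unweighted edges (every edge costs 1): alice and bob are direct neighbors.
unweighted_edges = [(u, v, 1) for u, v, _ in interaction_counts]
unweighted_distance = shortest_distance(unweighted_edges, "alice", "bob")
```

In this toy network the weighted shortest path between the two most frequent collaborators runs through carol, so the pair appears farther apart than in the unweighted graph, which is the kind of result the paragraph above describes as counter-intuitive.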
Toral, Martinez-Torres, and Barrero [59] use directed edges to represent information flow within their networks. Specifically, they use discussion threads as the primary unit of analysis and construct networks that represent a directed chain of replies. Within the context of their work, directed edges add value due to their ability to model information flows. In the current study, however, networks represent spontaneous two-way collaborations rather than a series of replies, meaning undirected edges have a more intuitive interpretation. As a result, the research team decided to construct the stakeholder networks using undirected edges.
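As a rough illustration of this modelling choice, the sketch below collapses a list of interaction records into an undirected, unweighted network. The record format, the stakeholder names, and both helper functions are assumptions made for this example; they do not reproduce the authors' pipeline.

```python
from collections import defaultdict


def build_undirected_network(interactions):
    """Collapse (source, target) records into an undirected, unweighted graph.

    Direction and repetition are discarded: any number of interactions
    between two stakeholders yields a single edge.
    """
    neighbors = defaultdict(set)
    for source, target in interactions:
        if source == target:
            continue  # ignore self-interactions
        neighbors[source].add(target)
        neighbors[target].add(source)
    return dict(neighbors)


def degree_sequence(neighbors):
    """Return the degree (number of distinct collaborators) of each node."""
    return {node: len(adjacent) for node, adjacent in neighbors.items()}


# Hypothetical records: alice and bob interact three times in both directions.
interactions = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("alice", "bob"),
    ("carol", "alice"),
]

network = build_undirected_network(interactions)
degrees = degree_sequence(network)
```

Storing neighbors in sets makes the graph unweighted by construction, and adding each pair in both directions makes it undirected.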
1.2 Additional notes on the Gini coefficient
Using the Gini coefficient to measure concentration in stakeholder networks has some notable properties. In particular, the Gini coefficient [17, 18] of a network's node degree distribution cannot reach one because every edge connects a pair of nodes, so a single node cannot collect degrees without sharing them with other nodes. As a result, the maximum value of the Gini coefficient for a network grows with the number of nodes but never reaches one. Similarly, the effective lower bound of the Gini coefficient for networks remains above zero. Although this property makes the Gini coefficient an imperfect measure of network concentration, as a practical matter, it only impacts relatively small networks.
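A short worked example may help show why the coefficient approaches but never reaches one. It assumes the standard mean-difference form of the Gini coefficient and an illustrative network in which isolated stakeholders count as nodes; it is not claimed to be the configuration that attains the maximum.

```latex
% Assumed form of the Gini coefficient for node degrees d_1, ..., d_n
% with mean degree \bar{d}:
G = \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} \lvert d_i - d_j \rvert}{2 n^{2} \bar{d}}

% Illustrative network: n > 2 nodes and a single edge,
% so the degree sequence is (1, 1, 0, \dots, 0) and
\bar{d} = \frac{2}{n}

% Only ordered pairs of one connected and one isolated node differ.
% There are 2 \cdot 2 \cdot (n - 2) such pairs, each contributing 1:
\sum_{i=1}^{n} \sum_{j=1}^{n} \lvert d_i - d_j \rvert = 4 (n - 2)

% Substituting:
G = \frac{4 (n - 2)}{2 n^{2} \cdot \frac{2}{n}} = \frac{n - 2}{n} = 1 - \frac{2}{n}
```

Under these assumptions the coefficient is 0.8 for ten nodes and 0.98 for one hundred nodes, so it increases with network size while staying strictly below one.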
The original formulation defines the Gini coefficient using the Lorenz curve [17]. However, Sen provides an equivalent formulation, which calculates the Gini coefficient as half the relative mean absolute difference, that is, the mean absolute difference divided by the mean [53]. This formulation appears in Eq. 3. Due to its simplicity, the research team decided to compute the Gini coefficient using the Sen formulation.
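The equivalence of the two formulations can be checked numerically. The sketch below is a minimal illustration, assuming the standard discrete forms of both definitions; the function names are hypothetical and the code is not taken from the paper's repository.

```python
def gini_mean_difference(values):
    """Gini coefficient as half the relative mean absolute difference."""
    n = len(values)
    mean = sum(values) / n
    if mean == 0:
        return 0.0  # convention chosen here for an all-zero sequence
    total_difference = sum(abs(a - b) for a in values for b in values)
    return total_difference / (2 * n * n * mean)


def gini_lorenz(values):
    """Gini coefficient as one minus twice the area under the Lorenz curve."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if total == 0:
        return 0.0  # same convention as above
    cumulative = 0
    area = 0.0
    for value in ordered:
        previous = cumulative
        cumulative += value
        # Trapezoid between consecutive points of the Lorenz curve.
        area += (previous + cumulative) / (2 * total * n)
    return 1 - 2 * area


# Degree sequence of a three-node star: one hub with two neighbors.
star_degrees = [2, 1, 1]
gini_sen = gini_mean_difference(star_degrees)
gini_area = gini_lorenz(star_degrees)
```

Both functions return the same value for any non-negative degree sequence, up to floating-point error. The mean-difference version needs no sorting, which is consistent with the simplicity argument above, although its double loop is quadratic in the number of nodes.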
Cite this article
Robinson, M., Sarkani, S. & Mazzuchi, T. Network structure and requirements crowdsourcing for OSS projects. Requirements Eng 26, 509–534 (2021). https://doi.org/10.1007/s00766-021-00353-5