Machine learning based success prediction for crowdsourcing software projects,Journal of Systems and Software


Machine learning based success prediction for crowdsourcing software projects
Journal of Systems and Software (IF 3.7), Pub Date: 2021-04-20, DOI: 10.1016/j.jss.2021.110965
Inam Illahi, Hui Liu, Qasim Umer, Nan Niu

Competitive Crowdsourcing Software Development is an online software development paradigm that promises innovative, cost-effective, and high-quality solutions on time. However, the paradigm is still in its infancy and does not address key challenges such as the low rate of submissions and the high risk of project failure. A significant number of software projects fail to receive a satisfactory solution and end up wasting the time and effort of stakeholders. Predicting the success of a new software project may therefore help stakeholders decide whether to crowdsource it, saving their time and effort. To this end, this study proposes a novel machine learning based approach that predicts the success of a software project on crowdsourcing platforms, where success means that the given project reaches completion. First, the textual description and important attributes of software projects are extracted from TopCoder. Next, the description is preprocessed using natural language processing techniques. Then, keywords are identified using a modified keyword ranking algorithm, and each software project is assigned a ranking score. Every software project is modeled as a vector based on its extracted attributes, its identified keywords, and its ranking score. Using these vectors with their associated solution status, a support vector machine classifier is trained to predict the success of a given software project. Several machine learning classifiers are compared, and the support vector machine yields the highest performance on the given dataset. Finally, the proposed approach is evaluated on historical data of real software projects. The results of hold-out validation suggest that the average precision, recall, and f-measure are up to 94.53%, 99.30%, and 96.85%, respectively.
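The pipeline described in the abstract (preprocess the description, score keywords, build one vector per project, then evaluate with precision, recall, and f-measure) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the abstract does not specify the modified keyword ranking algorithm, so a plain TF-IDF-style score stands in for it, and the stopword list, the function names, and the attribute values are hypothetical. The classifier itself is not shown; the resulting vectors would be passed to a support vector machine or any other classifier.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (assumption, not from the paper).
STOPWORDS = {"a", "an", "and", "the", "for", "of", "to", "in", "is", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def keyword_scores(docs):
    """Return one {keyword: score} dict per document.

    The score is a TF-IDF-style stand-in for the paper's modified keyword
    ranking algorithm, which the abstract does not describe.
    """
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    n_docs = len(docs)
    all_scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty descriptions
        all_scores.append({
            tok: (count / total) * (math.log(n_docs / doc_freq[tok]) + 1.0)
            for tok, count in term_freq.items()
        })
    return all_scores


def project_vector(attributes, scores, vocabulary):
    """Concatenate numeric attributes, per-keyword scores, and a ranking score.

    The ranking score is taken to be the sum of keyword scores (assumption).
    """
    keyword_part = [scores.get(tok, 0.0) for tok in vocabulary]
    return list(attributes) + keyword_part + [sum(scores.values())]


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f(y_true, y_pred):
    """Compute precision, recall, and f-measure for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f_measure(precision, recall)


# Hypothetical project descriptions and attributes (for example, prize and duration).
docs = ["Build a REST API for payments", "Design a logo for the payments app"]
scores = keyword_scores(docs)
vocabulary = sorted({tok for s in scores for tok in s})
vector = project_vector([500.0, 7.0], scores[0], vocabulary)
```

As a consistency check on the reported figures, `f_measure(0.9453, 0.9930)` evaluates to roughly 0.9686, which is close to the reported 96.85%. The two need not match exactly, since the abstract reports averages and the average of per-run f-measures generally differs from the f-measure of the averaged precision and recall.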


Updated: 2021-04-24
