Learning to detect community smells in open source software projects
Knowledge-Based Systems ( IF 8.8 ) Pub Date : 2020-07-03 , DOI: 10.1016/j.knosys.2020.106201
Nuri Almarimi , Ali Ouni , Mohamed Wiem Mkaouer

Community smells are symptoms of organizational and social issues within the software development community that often lead to additional project costs. Recent studies have identified a variety of community smells and defined them as sub-optimal patterns connected to organizational-social structures in the software development community. To detect potential community smells in a software project early, this paper introduces csDetector, a novel machine learning-based detection approach that learns from various existing bad community development practices to provide automated support for detecting such smells. In particular, the approach learns from a set of organizational-social symptoms that characterize potential instances of community smells in a software project. We built a decision tree detection model with the C4.5 classifier to identify eight commonly occurring community smells in software projects. To evaluate the approach, we conducted an empirical study on a benchmark of 74 open source projects from GitHub. Our statistical results show that csDetector performs well, achieving an average accuracy of 96% and an AUC of 0.94. Moreover, our results indicate that csDetector outperforms two recent state-of-the-art techniques in terms of detection accuracy. Finally, we investigated which community-related metrics are most influential in identifying each community smell type. We found that the number of commits and developers per time zone, the number of developers per community, and the social network betweenness and closeness centrality are the most influential community characteristics.
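As a rough illustration of how a C4.5-style decision tree might pick a split on one community metric, the sketch below computes the gain ratio (the split criterion C4.5 is known for) over candidate thresholds. It is a simplified sketch and not the csDetector implementation: the function names, the choice of metric, and the toy data are all hypothetical, and a full C4.5 learner also handles pruning, missing values, and multiple attributes.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    total = len(labels)
    weights = [len(left) / total, len(right) / total]
    # Information gain: entropy before the split minus weighted entropy after.
    info_gain = entropy(labels) - (
        weights[0] * entropy(left) + weights[1] * entropy(right)
    )
    # Split information penalizes splits that produce very uneven partitions.
    split_info = -sum(w * log2(w) for w in weights)
    return info_gain / split_info


# Hypothetical toy data (not from the paper): developers per time zone for
# six projects, and whether a smell was labeled present (1) or absent (0).
devs_per_timezone = [2, 3, 4, 9, 11, 12]
smell_present = [0, 0, 0, 1, 1, 1]

# Candidate thresholds: midpoints between consecutive distinct sorted values.
sorted_values = sorted(set(devs_per_timezone))
candidates = [(a + b) / 2 for a, b in zip(sorted_values, sorted_values[1:])]

# Pick the threshold with the highest gain ratio.
best = max(
    candidates,
    key=lambda t: gain_ratio(devs_per_timezone, smell_present, t),
)
print(best, gain_ratio(devs_per_timezone, smell_present, best))
```

On this toy data the classes are perfectly separable, so the selected threshold falls in the gap between the two groups; real project metrics would rarely split this cleanly.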



Updated: 2020-07-10