Joint Autoregressive and Graph Models for Software and Developer Social Networks
arXiv - CS - Information Retrieval, Pub Date: 2021-01-21, DOI: arxiv-2101.08729
Rima Hazra, Hardik Aggarwal, Pawan Goyal, Animesh Mukherjee, Soumen Chakrabarti

Social network research has focused on hyperlink graphs, bibliographic citations, friend/follow patterns, influence spread, etc. Large software repositories also form a highly valuable networked artifact, usually in the form of a collection of packages, their developers, dependencies among them, and bug reports. This "social network of code" is rarely studied by social network researchers. We introduce two new problems in this setting. These problems are well-motivated in the software engineering community but not closely studied by social network scientists. The first is to identify packages that are most likely to be troubled by bugs in the immediate future, thereby demanding the greatest attention. The second is to recommend developers to packages for the next development cycle. Simple autoregression can be applied to historical data for both problems, but we propose a novel method to integrate network-derived features and demonstrate that our method brings additional benefits. Apart from formalizing these problems and proposing new baseline approaches, we prepare and contribute a substantial dataset connecting multiple attributes built from the long-term history of 20 releases of Ubuntu, growing to over 25,000 packages with their dependency links, maintained by over 3,800 developers, with over 280k bug reports.

Updated: 2021-01-22