The SmartSHARK Repository Mining Data
arXiv - CS - Software Engineering, Pub Date: 2021-02-23, DOI: arxiv-2102.11540
Alexander Trautsch, Steffen Herbold

The SmartSHARK repository mining data is a collection of rich and detailed information about the evolution of software projects. The data is unique in its diversity and contains detailed information about each change, issue tracking data, continuous integration data, as well as pull request and code review data. Moreover, it contains not only raw data scraped from repositories, but also annotations in the form of labels determined through a combination of manual analysis and heuristics, as well as links between the different parts of the data set. The SmartSHARK data set provides a rich source of data that enables us to explore research questions that require data from different sources and/or longitudinal data.

Updated: 2021-02-24