What's in a GitHub Repository? -- A Software Documentation Perspective
arXiv - CS - Software Engineering Pub Date : 2021-02-25 , DOI: arxiv-2102.12727
Akhila Sri Manasa Venigalla, Sridhar Chimalakonda

Developers use and contribute to repositories on GitHub. Documentation in these repositories is an important resource that helps developers understand, maintain, and contribute to a project. Currently, documentation in a repository is spread across various files, with most of it in ReadMe files. However, other software artifacts in the repository, such as issue reports and pull requests, can also serve as documentation even though they are not explicitly designated as such. Hence, in this paper, we propose a taxonomy of documentation sources, derived by analyzing different software artifacts, interviewing developers, and applying card sorting. We inspected multiple artifacts of 950 public GitHub repositories written in four programming languages (C++, C#, Python, and Java) and analyzed the type and amount of documentation that could be extracted from these artifacts. We observe that about 25.93% of the information extracted from all sources proposed in the taxonomy contains error-related documentation, and that pull requests contribute around 18.21% of the extracted information.

Updated: 2021-02-26