Learning Human-Written Commit Messages to Document Code Changes
Journal of Computer Science and Technology (IF 1.2) Pub Date: 2020-11-01, DOI: 10.1007/s11390-020-0496-0
Yuan Huang, Nan Jia, Hao-Jie Zhou, Xiang-Ping Chen, Zi-Bin Zheng, Ming-Dong Tang

Commit messages are important complementary information for understanding code changes. To address message scarcity, several approaches have been proposed to generate commit messages automatically. However, most of these approaches focus on generating a superficial summary of the changed software entities, without considering the intent behind the code changes (e.g., existing approaches cannot generate a message such as “fixing null pointer exception”). Because developers often describe the intent behind a code change when writing its message, we propose ChangeDoc, an approach that reuses existing messages in version control systems for automatic commit message generation. Our approach uses syntax, semantic, pre-syntax, and pre-semantic similarities. For a given commit without a message, it discovers the most similar past commit in a large commit repository and recommends that commit's message as the message of the given commit. Our repository contains half a million commits collected from SourceForge. We evaluate our approach on commits from 10 projects. The results show that 21.5% of the messages recommended by ChangeDoc can be used directly without modification, and 62.8% require minor modifications. To evaluate the quality of the commit messages recommended by ChangeDoc, we performed two empirical studies involving a total of 40 participants (10 professional developers and 30 students). The results indicate that the recommended messages are very good approximations of the ones written by developers and often include important intent information that is not included in the messages generated by other tools.
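
The retrieval idea in the abstract (find the most similar past commit and reuse its message) can be illustrated with a minimal sketch. This is not ChangeDoc's implementation: the abstract does not define its four similarity measures, so a plain Jaccard similarity over diff tokens is used here as a stand-in, and the names `tokenize`, `jaccard`, `recommend_message`, and the sample `history` are hypothetical.

```python
import re


def tokenize(diff):
    """Split a diff into a set of lowercase identifier-like tokens."""
    return set(re.findall(r"[A-Za-z_]\w*", diff.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_message(new_diff, history):
    """Return the message of the past commit whose diff is most similar.

    `history` is a non-empty list of (diff, message) pairs. Jaccard
    similarity is an assumed stand-in for the paper's similarity measures.
    """
    new_tokens = tokenize(new_diff)
    _, best_message = max(
        history, key=lambda pair: jaccard(new_tokens, tokenize(pair[0]))
    )
    return best_message


# Hypothetical commit history, for illustration only.
history = [
    ("+ if (user != null) { user.save(); }", "fixing null pointer exception"),
    ('+ logger.info("started");', "add startup logging"),
]

print(recommend_message("+ if (order != null) { order.save(); }", history))
```

With this toy history, the new diff shares the tokens `if`, `null`, and `save` with the first past commit and none with the second, so the first commit's message is recommended. A real system would need a far richer similarity measure and an index that scales to hundreds of thousands of commits.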

Updated: 2020-11-01