Quality Evaluation of Modern Code Reviews Through Intelligent Biometric Program Comprehension, IEEE Transactions on Software Engineering



IEEE Transactions on Software Engineering (IF 6.5), Pub Date: 2022-03-11, DOI: 10.1109/tse.2022.3158543
Haytham Hijazi₁, Joao Duraes₁, Ricardo Couceiro₁, Joao Castelhano₂, Raul Barbosa₁, Julio Medeiros₁, Miguel Castelo-Branco₃, Paulo de Carvalho₁, Henrique Madeira₁


Code review is an essential software engineering practice for spotting code defects in the early stages of development. Modern code reviews (e.g., accepting or rejecting pull requests with Git) have become less formal than classic Fagan inspections, more lightweight, and more reliant on individual reviewers. However, reviewers may face mentally demanding challenges during a review, such as code comprehension difficulties or distractions, that can degrade review quality. This work proposes a novel approach that evaluates the quality of a code review in terms of bug-finding effectiveness and tells the reviewer clearly whether the review should be repeated, indicating the code regions that may not have been reviewed well. The approach uses biometric information collected from the reviewer during the review with non-intrusive biofeedback devices (e.g., smartwatches) and inexpensive desktop eye-trackers compatible with software development settings. Biometric measures such as Heart Rate Variability (HRV) and task-evoked pupillary response are captured as surrogates for the reviewer's cognitive state (e.g., mental workload). Artificial Intelligence techniques predict the cognitive load from the extracted biomarkers and classify each code region according to a set of features. The final evaluation also considers factors such as code complexity, review time, and the reviewer's experience level. Experimental results show that the approach predicts review quality with 87.77% ± 4.65 accuracy and a Spearman correlation coefficient of 0.85 (p-value < 0.001) between predicted and actual review performance. The cognitive load measurement is validated using electroencephalography (EEG) signals as ground truth for the HRV and pupil signals.
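Two of the quantities named in the abstract, HRV and the Spearman correlation, can be illustrated with a minimal sketch. The abstract does not say which HRV features the authors extract, so the two time-domain measures below (SDNN and RMSSD) are assumed here only as common examples. The function names and all sample values are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, stdev


def sdnn(rr_ms):
    """Sample standard deviation of RR intervals, in milliseconds."""
    return stdev(rr_ms)


def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences, in milliseconds."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(mean(d * d for d in diffs))


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical RR intervals (ms) from a wearable heart-rate sensor.
    rr = [800, 810, 790, 805, 795]
    print("SDNN:", sdnn(rr), "RMSSD:", rmssd(rr))

    # Hypothetical predicted vs. actual review-performance scores.
    predicted = [1, 2, 3, 4, 5]
    actual = [2, 1, 4, 3, 5]
    print("Spearman rho:", spearman(predicted, actual))
```

Spearman correlation compares rankings rather than raw values, so it measures whether predicted and actual performance rise and fall together without assuming a linear relationship between them.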


Updated: 2022-03-11
