Prioritizing versions for performance regression testing: The Pharo case



Science of Computer Programming (IF 1.3), Pub Date: 2020-02-06, DOI: 10.1016/j.scico.2020.102415
Juan Pablo Sandoval Alcocer, Alexandre Bergel, Marco Tulio Valente

Context

Software performance may suffer regressions caused by source code changes. Measuring performance at each new software version is useful for early detection of performance regressions. However, systematically running benchmarks is often impractical (e.g., because benchmarks take a long time to run, or because functional correctness is prioritized over non-functional properties).

Objective

In this article, we propose Horizontal Profiling, a sampling technique to predict when a new revision may cause a regression by analyzing the source code and using run-time information of a previous version. The goal of Horizontal Profiling is to reduce the performance testing overhead by benchmarking just software versions that contain costly source code changes.
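The abstract does not spell out the prediction rule, so the following is only a minimal sketch of the general idea under stated assumptions: a new version is worth benchmarking when it changes a method that the previous version's benchmarks executed often. The `MethodProfile` type, the `should_benchmark` function, the `min_executions` threshold, and the sample method names are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodProfile:
    """Run-time information recorded for one method in the previous version."""
    executions: int  # how many times the benchmarks executed the method


def should_benchmark(changed_methods, previous_profile, min_executions=1000):
    """Return True if any changed method was hot in the previous version.

    Hypothetical rule (an assumption, not the paper's cost model): a change
    counts as potentially costly when it touches a method that the previous
    version's benchmarks executed at least `min_executions` times.
    """
    for name in changed_methods:
        profile = previous_profile.get(name)
        # Methods absent from the previous profile (e.g. newly added ones)
        # are ignored by this simplified rule.
        if profile is not None and profile.executions >= min_executions:
            return True
    return False


# Illustrative profile of a previous version (made-up numbers).
previous_profile = {
    "Canvas>>drawLine": MethodProfile(executions=250_000),
    "Canvas>>printOn:": MethodProfile(executions=3),
}

hot = should_benchmark(["Canvas>>drawLine"], previous_profile)
cold = should_benchmark(["Canvas>>printOn:", "Canvas>>newHelper"], previous_profile)
```

Under this rule only the first change set would trigger a benchmark run. A realistic predictor would also need to look at what the change does to the method body, which this sketch deliberately leaves out.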

Method

We present an evaluation in which we apply Horizontal Profiling to identify performance regressions in 17 software projects written in the Pharo programming language, totaling 1,288 software versions.

Results

Horizontal Profiling detects more than 80% of the regressions by benchmarking less than 20% of the versions. In addition, our experiments show that Horizontal Profiling has better precision and executes the benchmarks on fewer versions than state-of-the-art tools, under our benchmarks.
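The two headline figures can be read as a detection rate (the share of regressing versions that were selected for benchmarking) and a benchmarked fraction (the share of all versions selected). The sketch below shows that arithmetic on made-up data. The function name `evaluate_selection` and the version numbers are hypothetical, and the paper's exact metric definitions may differ.

```python
def evaluate_selection(versions, selected, regressions):
    """Return (detection_rate, benchmarked_fraction) for a version selection.

    versions:    all versions of the project
    selected:    versions the technique chose to benchmark
    regressions: versions known to introduce a performance regression
    """
    selected = set(selected)
    regressions = set(regressions)
    detected = regressions & selected
    # With no known regressions there is nothing to miss.
    detection_rate = len(detected) / len(regressions) if regressions else 1.0
    benchmarked_fraction = len(selected) / len(versions) if versions else 0.0
    return detection_rate, benchmarked_fraction


# Ten hypothetical versions, two of which regress; the selection catches one.
versions = list(range(1, 11))
regressions = [3, 7]
selected = [3, 8]

rate, fraction = evaluate_selection(versions, selected, regressions)
```

In this toy case the selection benchmarks 20% of the versions and detects half of the regressions.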

Conclusions

We conclude that by adequately characterizing the run-time information of a previous version, it is possible to determine if a new version is likely to introduce a performance regression or not. As a consequence, a significant fraction of the performance regressions are identified by benchmarking only a small fraction of the software versions.
