A Large-Scale Empirical Study of Real-Life Performance Issues in Open Source Projects, IEEE Transactions on Software Engineering

IEEE Transactions on Software Engineering (IF 6.5), Pub Date: 2022-04-14, DOI: 10.1109/tse.2022.3167628
Yutong Zhao ₁, Lu Xiao ₁, Andre B. Bondi ₁, Bihuan Chen ₂, Yang Liu ₃

Software performance is a critical quality attribute that determines the success of a software system. However, practitioners lack a comprehensive and holistic understanding of how real-life performance issues are caused and resolved in practice from the technical, engineering, and economic perspectives. This paper presents a large-scale empirical study of 570 real-life performance issues from 13 open source projects that span various problem domains and are implemented in three popular programming languages: Java (192 issues), C/C++ (162 issues), and Python (216 issues). From the technical perspective, we summarize eight general types of performance issues, with corresponding root causes and resolutions, that apply to all three languages. We also identify, from the literature, available tools for detecting and resolving different types of issues. In addition, we found that 27% of the 570 issues are resolved by design-level optimization, that is, coordinated revision of a group of related source files and their design structure. We reveal four typical design-level optimization patterns that practitioners should be aware of when resolving performance issues: classic design patterns, change propagation, optimization clone, and parallel optimization. From the engineering perspective, this study analyzes how test code changes during performance optimization. We found that only 15% of the 570 performance issues involve revision of test code. In most cases, the revised test cases focus on the functional logic of the performance optimization rather than directly evaluating the performance improvement. This finding points to a potential lack of engineering standards for formally verifying performance optimization in regression testing. Finally, from the economic perspective, we analyze the "Return On Investment" of performance optimization. We found that design-level optimization usually requires more investment but does not always yield a greater performance improvement. However, developers tend to use design-level optimization when they are concerned about other quality attributes, such as maintainability and readability.

Updated: 2024-08-28
