Benchmarking the Capability of Symbolic Execution Tools with Logic Bombs


IEEE Transactions on Dependable and Secure Computing (IF 7.0), Pub Date: 2020-11-01, DOI: 10.1109/tdsc.2018.2866469
Hui Xu, Zirui Zhao, Yangfan Zhou, Michael R. Lyu

Symbolic execution has become an indispensable technique for software testing and program analysis. However, because several symbolic execution tools are now available off the shelf, a practical approach to benchmarking them is needed. This paper introduces an approach for benchmarking symbolic execution tools in a fine-grained and efficient manner. The approach evaluates such tools against known challenges faced by general symbolic execution techniques, e.g., floating-point numbers and symbolic memories. We first survey related papers and systematize the challenges of symbolic execution, extracting 12 distinct challenges from the literature and grouping them into two categories: symbolic-reasoning challenges and path-explosion challenges. Next, we develop a dataset of logic bombs and a framework for benchmarking symbolic execution tools automatically. For each challenge, the dataset contains several logic bombs, each targeting a specific challenging problem. Triggering one or more of the logic bombs for a challenge confirms that the tool under test can handle the corresponding problem. Real-world experiments with three popular symbolic execution tools, namely KLEE, angr, and Triton, show that our approach can reveal the capabilities and limitations of the tools in handling specific issues accurately and efficiently. Benchmarking a tool generally takes only a few dozen minutes. We have released our dataset on GitHub as open source to help the community conduct future work on benchmarking symbolic execution tools.
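
To make the idea of a logic bomb concrete, the following is a minimal Python sketch; it is not taken from the released dataset. Every name in it (`float_bomb`, `loop_bomb`, `brute_force_tool`, `benchmark`, `DATASET`) is hypothetical, the "tool" is a naive brute-force input search standing in for a real symbolic execution engine, and the choice of a loop as the second challenge is an assumption made only for illustration.

```python
# Illustrative sketch only: names and structure are assumptions,
# not the paper's dataset or framework.

BOMB = 1    # the guarded branch was reached
NORMAL = 0  # the program ended without reaching it


def float_bomb(x):
    """Floating-point challenge: the branch is reached only when x is
    nonzero yet small enough that adding it to 1024.0 is lost to rounding."""
    if x != 0.0 and 1024.0 + x == 1024.0:
        return BOMB
    return NORMAL


def loop_bomb(n):
    """Loop challenge (assumed example): the iteration count depends on
    the input, so each input value can follow a different path."""
    total = 0
    for i in range(n % 1000):
        total += i
    return BOMB if total == 4950 else NORMAL


def brute_force_tool(bomb):
    """Stand-in for a tool under test: tries small integers and returns
    the first input that triggers the bomb, or None if none does."""
    for candidate in range(200):
        if bomb(candidate) == BOMB:
            return candidate
    return None


def benchmark(tool, dataset):
    """Count, per challenge, how many bombs the tool's inputs trigger."""
    report = {}
    for challenge, bombs in dataset.items():
        triggered = 0
        for bomb in bombs:
            candidate = tool(bomb)
            # Re-run the bomb on the proposed input to confirm the trigger.
            if candidate is not None and bomb(candidate) == BOMB:
                triggered += 1
        report[challenge] = triggered
    return report


DATASET = {
    "floating-point": [float_bomb],
    "loop": [loop_bomb],
}

if __name__ == "__main__":
    print(benchmark(brute_force_tool, DATASET))
```

In this sketch the brute-force stand-in finds the loop input but never proposes the tiny nonzero float that the floating-point bomb requires, which mirrors how a per-challenge report can separate what a tool handles from what it does not.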


Updated: 2020-11-01
