Machine learning steered symbolic execution framework for complex software code, Formal Aspects of Computing



Machine learning steered symbolic execution framework for complex software code
Formal Aspects of Computing (IF 1), Pub Date: 2021-05-26, DOI: 10.1007/s00165-021-00538-3
Lei Bu, Yongjuan Liang, Zhunyi Xie, Hong Qian, Yi-Qi Hu, Yang Yu, Xin Chen, Xuandong Li


During program traversal, symbolic execution collects path conditions and feeds them to a constraint solver to obtain feasible solutions. However, complex path conditions, such as the nonlinear constraints that appear widely in programs, are hard for existing solvers to handle efficiently. In this paper, we adapt the classical symbolic execution framework with a machine learning approach to constraint satisfaction. The approach samples different solutions and learns from them to identify potentially feasible regions. This sampling-and-learning style of solving can be applied easily to different classes of complex problems. By incorporating this approach, our framework, MLBSE, supports the symbolic execution of not only simple linear path conditions but also nonlinear arithmetic operations and even black-box calls to library methods. Moreover, thanks to the theoretical foundation of the machine learning based approach, when the solver fails to solve a path condition, we can provide an estimation of the confidence in the satisfiability (ECS) of the problem, giving users insight into how the problem was analyzed and whether a solution could ultimately be found. We implement MLBSE on top of Symbolic PathFinder (SPF) as a fully automatic Java symbolic execution engine, so users can feed their code to MLBSE directly. To evaluate its performance, we use 22 real-world programs as benchmarks for MLBSE to generate test cases; together they involve 1042 methods that are full of nonlinear operations, floating-point arithmetic, and native method calls. Experimental results show that the coverage achieved by MLBSE is much higher than that of state-of-the-art tools.
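To make the sampling-and-learning idea concrete, here is a minimal Python sketch of a generic sample-and-refine search for an input that satisfies a nonlinear path condition. It is not MLBSE's actual algorithm or API: the path condition, the `violation` distance measure, the `solve` function, and all parameter values are hypothetical, chosen only for illustration. The sketch treats `math.sin` as a stand-in for a black-box library call, because the search only needs to evaluate the condition, not reason about it symbolically.

```python
import math
import random


def violation(x, y):
    """Distance to satisfying a hypothetical path condition:
    x*x + y*y <= 4  and  sin(x) * y >= 0.5.
    Returns 0.0 exactly when both conjuncts hold."""
    c1 = max(0.0, x * x + y * y - 4.0)      # how far outside the disk
    c2 = max(0.0, 0.5 - math.sin(x) * y)    # how far below the threshold
    return c1 + c2


def solve(violation_fn, bounds, rounds=50, samples_per_round=40, keep=8, seed=0):
    """Sample inputs, keep the least-violating ones, and shrink the
    sampling region around them. Returns a satisfying input or None."""
    rng = random.Random(seed)
    region = list(bounds)
    best = None
    for _ in range(rounds):
        # Sample: draw candidate inputs uniformly from the current region.
        points = [tuple(rng.uniform(lo, hi) for lo, hi in region)
                  for _ in range(samples_per_round)]
        if best is not None:
            points.append(best[1])  # never lose the best point so far
        scored = sorted((violation_fn(*p), p) for p in points)
        best = scored[0]
        if best[0] == 0.0:
            return best[1]  # path condition satisfied
        # Learn (crudely): the next region is the bounding box of the
        # best samples plus a small margin, clipped to the original bounds.
        elite = [p for _, p in scored[:keep]]
        region = []
        for d, (lo0, hi0) in enumerate(bounds):
            vals = [p[d] for p in elite]
            lo, hi = min(vals), max(vals)
            margin = 0.1 * (hi - lo) + 1e-6
            region.append((max(lo0, lo - margin), min(hi0, hi + margin)))
    return None  # budget exhausted; this does NOT prove unsatisfiability


if __name__ == "__main__":
    print(solve(violation, [(-10.0, 10.0), (-10.0, 10.0)]))
```

A `None` result here only means the sampling budget ran out. The abstract's ECS addresses exactly that situation by estimating how confident one can be about satisfiability, which this sketch does not attempt.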


Updated: 2021-05-26
