Simulating SQL injection vulnerability exploitation using Q-learning reinforcement learning agents, Journal of Information Security and Applications



Journal of Information Security and Applications (IF 3.8), Pub Date: 2021-07-03, DOI: 10.1016/j.jisa.2021.102903
László Erdődi, Åvald Åslaugson Sommervoll, Fabio Massimo Zennaro


In this paper, we propose a formalization of the process of exploitation of SQL injection vulnerabilities. We consider a simplification of the dynamics of SQL injection attacks by casting this problem as a security capture-the-flag challenge. We model it as a Markov decision process, and we implement it as a reinforcement learning problem. We then deploy reinforcement learning agents tasked with learning an effective policy to perform SQL injection; we design our training in such a way that the agent learns not just a specific strategy to solve an individual challenge but a more generic policy that may be applied to perform SQL injection attacks against any system instantiated randomly by our problem generator. We analyze the results in terms of the quality of the learned policy and in terms of convergence time as a function of the complexity of the challenge and the learning agent’s complexity. Our work fits in the wider research on the development of intelligent agents for autonomous penetration testing and white-hat hacking, and our results aim to contribute to understanding the potential and the limits of reinforcement learning in a security environment.
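The idea of training one agent on randomly instantiated challenges can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' environment or code. It is a toy stand-in built on stated assumptions: each episode hides one of `N_TYPES` abstract "types", the agent may probe a type (small cost) or attempt to capture the flag with it (reward on success, larger cost on failure), and the state records what has been ruled out or confirmed so far. All names (`ToyChallenge`, `train`, `solve`), the reward values, and the hyperparameters are hypothetical choices for illustration.

```python
import random
from collections import defaultdict

N_TYPES = 3                # hypothetical number of candidate "types" per challenge
N_ACTIONS = 2 * N_TYPES    # actions 0..N-1 probe a type, N..2N-1 attempt the flag
UNKNOWN, RULED_OUT, CONFIRMED = 0, 1, 2


class ToyChallenge:
    """Abstract toy challenge (an assumption, not the paper's simulator)."""

    def __init__(self, rng, hidden=None):
        self.rng = rng
        self.fixed_hidden = hidden  # fix the hidden type for evaluation

    def reset(self):
        # A new challenge is instantiated at random unless a type is fixed.
        if self.fixed_hidden is None:
            self.hidden = self.rng.randrange(N_TYPES)
        else:
            self.hidden = self.fixed_hidden
        self.state = (UNKNOWN,) * N_TYPES
        return self.state

    def step(self, action):
        is_attempt, t = divmod(action, N_TYPES)
        hit = (t == self.hidden)
        # Any action on type t reveals whether t is the hidden type.
        status = list(self.state)
        status[t] = CONFIRMED if hit else RULED_OUT
        self.state = tuple(status)
        if is_attempt:
            # Flag attempt: ends the episode on success, costly on failure.
            return self.state, (10.0 if hit else -5.0), hit
        return self.state, -1.0, False  # probe: small fixed cost


def greedy(q, state):
    """Action with the highest current Q-value in this state."""
    return max(range(N_ACTIONS), key=lambda a: q[(state, a)])


def train(episodes=20000, alpha=0.1, gamma=0.9, epsilon=0.3,
          max_steps=20, seed=0):
    """Epsilon-greedy tabular Q-learning over random challenges."""
    rng = random.Random(seed)
    env = ToyChallenge(rng)
    q = defaultdict(float)
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)
            else:
                action = greedy(q, state)
            nxt, reward, done = env.step(action)
            best_next = 0.0 if done else max(
                q[(nxt, a)] for a in range(N_ACTIONS))
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def solve(q, hidden, max_steps=10):
    """Run the greedy policy on a fixed challenge; return steps used or None."""
    env = ToyChallenge(random.Random(0), hidden=hidden)
    state = env.reset()
    for step in range(1, max_steps + 1):
        state, _, done = env.step(greedy(q, state))
        if done:
            return step
    return None


if __name__ == "__main__":
    q_table = train()
    for hidden_type in range(N_TYPES):
        print(hidden_type, solve(q_table, hidden_type))
```

Because the hidden type is redrawn every episode, the table cannot memorize a single solution. The learned values have to encode a policy conditioned on what the agent has observed, which is the sense of "generic policy" in the abstract.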


Updated: 2021-07-04
