Finding Effective Security Strategies through Reinforcement Learning and Self-Play,arXiv - CS - Networking and Internet Architecture



Finding Effective Security Strategies through Reinforcement Learning and Self-Play
arXiv - CS - Networking and Internet Architecture. Pub Date: 2020-09-17, DOI: arxiv-2009.08120
Kim Hammar and Rolf Stadler

We present a method to automatically find security strategies for the use case of intrusion prevention. Following this method, we model the interaction between an attacker and a defender as a Markov game and let attack and defense strategies evolve through reinforcement learning and self-play without human intervention. Using a simple infrastructure configuration, we demonstrate that effective security strategies can emerge from self-play. This shows that self-play, which has been applied in other domains with great success, can be effective in the context of network security. Inspection of the converged policies shows that they reflect common-sense knowledge and resemble the strategies of human experts. Moreover, we address known challenges of reinforcement learning in this domain and present an approach that uses function approximation, an opponent pool, and an autoregressive policy representation. Through evaluations, we show that our method is superior to two baseline methods but that policy convergence in self-play remains a challenge.
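
To make the idea of self-play with an opponent pool concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes a hypothetical single-state game (the attacker picks one of `num_targets` targets, the defender guards one, and the attacker wins if the chosen target is unguarded) and uses tabular softmax policies trained with a plain REINFORCE update. The paper's function approximation and autoregressive policy representation are left out, and all names and hyperparameters are illustrative.

```python
import math
import random


def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw one action index according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reinforce_step(logits, action, reward, lr):
    """In-place REINFORCE update for a softmax policy.

    The gradient of log pi(action) w.r.t. logit i is one_hot(action)[i] - pi[i].
    """
    probs = softmax(logits)
    for i in range(len(logits)):
        grad = (1.0 if i == action else 0.0) - probs[i]
        logits[i] += lr * reward * grad


def self_play(num_targets=3, iterations=2000, snapshot_every=100, lr=0.1, seed=0):
    """Train attacker and defender against opponents sampled from a pool.

    Assumed toy game: the attacker gets +1 if it hits an unguarded target,
    otherwise -1; the defender receives the negated reward (zero-sum).
    """
    rng = random.Random(seed)
    attacker = [0.0] * num_targets
    defender = [0.0] * num_targets
    # Pools of frozen policy snapshots; sampling from them avoids training
    # only against the single most recent opponent.
    attacker_pool = [list(attacker)]
    defender_pool = [list(defender)]

    for t in range(1, iterations + 1):
        # Attacker update against a defender snapshot from the pool.
        frozen_defender = rng.choice(defender_pool)
        a = sample(softmax(attacker), rng)
        d = sample(softmax(frozen_defender), rng)
        reinforce_step(attacker, a, 1.0 if a != d else -1.0, lr)

        # Defender update against an attacker snapshot from the pool.
        frozen_attacker = rng.choice(attacker_pool)
        a = sample(softmax(frozen_attacker), rng)
        d = sample(softmax(defender), rng)
        reinforce_step(defender, d, 1.0 if a == d else -1.0, lr)

        # Periodically freeze the current policies into the pools.
        if t % snapshot_every == 0:
            attacker_pool.append(list(attacker))
            defender_pool.append(list(defender))

    return softmax(attacker), softmax(defender), len(attacker_pool)


if __name__ == "__main__":
    attacker_policy, defender_policy, pool_size = self_play()
    print("attacker policy:", attacker_policy)
    print("defender policy:", defender_policy)
    print("snapshots per pool:", pool_size)
```

In this toy setting the learned policies need not settle: they can keep cycling between targets, which gives a small-scale picture of why policy convergence in self-play is a challenge.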


Updated: 2020-10-06
