Adaptive Cyber Defense Against Multi-Stage Attacks Using Learning-Based POMDP
ACM Transactions on Privacy and Security (IF 2.3), Pub Date: 2020-11-09, DOI: 10.1145/3418897
Zhisheng Hu, Minghui Zhu, Peng Liu

The growing number of multi-stage attacks on computer networks poses significant security risks and calls for defense schemes that can respond autonomously to intrusions during vulnerability windows. However, the defender faces several real-world challenges, e.g., the likelihoods and impacts of successful exploits are unknown. In this article, we leverage reinforcement learning to develop an adaptive cyber defense that maximizes cost-effectiveness under these challenges. In particular, we use Bayesian attack graphs to model the interactions between the attacker and the network. We then formulate the defense problem as a partially observable Markov decision process (POMDP) in which the defender maintains belief states to estimate system states, uses Thompson sampling to estimate transition probabilities, and uses reinforcement learning to choose optimal defense actions from measured utility values. The algorithm's performance is verified via numerical simulations based on real-world attacks.
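
The abstract does not spell out the algorithm, so the following is only a minimal illustrative sketch of two ingredients it names: a Thompson-sampling draw of an unknown exploit probability and a belief-state update. It assumes a single network node, a uniform Beta prior, and an intrusion detector with made-up detection and false-alarm rates; the names `sample_exploit_prob`, `belief_update`, `p_detect`, and `p_false_alarm` are hypothetical and do not come from the paper.

```python
import random


def sample_exploit_prob(successes, failures, rng):
    """Thompson sampling step (illustrative): draw one plausible exploit
    probability from a Beta(1 + successes, 1 + failures) posterior.
    The uniform Beta(1, 1) prior is an assumption of this sketch."""
    return rng.betavariate(1 + successes, 1 + failures)


def belief_update(belief, p_exploit, alert, p_detect=0.8, p_false_alarm=0.1):
    """Bayes filter for one node (illustrative).

    belief        : prior probability that the node is already compromised
    p_exploit     : (sampled) probability the attacker compromises it this step
    alert         : True if the detector raised an alert
    p_detect      : assumed P(alert | compromised)
    p_false_alarm : assumed P(alert | not compromised)
    """
    # Prediction: a compromised node stays compromised; a clean one may fall.
    predicted = belief + (1.0 - belief) * p_exploit
    # Correction: weight each hypothesis by the observation likelihood.
    like_compromised = p_detect if alert else 1.0 - p_detect
    like_clean = p_false_alarm if alert else 1.0 - p_false_alarm
    numerator = like_compromised * predicted
    return numerator / (numerator + like_clean * (1.0 - predicted))


rng = random.Random(0)  # seeded only to make the sketch repeatable
p_hat = sample_exploit_prob(successes=3, failures=7, rng=rng)
b_alert = belief_update(0.2, p_hat, alert=True)
b_quiet = belief_update(0.2, p_hat, alert=False)
```

In a full defense loop, the sampled probability would stand in for the unknown transition model when planning, and the updated belief would feed the choice of defense action; both steps are omitted here.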

Updated: 2020-11-09