Optimal mixed block withholding attacks based on reinforcement learning, International Journal of Intelligent Systems



Optimal mixed block withholding attacks based on reinforcement learning
International Journal of Intelligent Systems (IF 7), Pub Date: 2020-09-01, DOI: 10.1002/int.22282
Yilei Wang¹, Guoyu Yang¹, Tao Li¹,²,³, Lifeng Zhang⁴, Yanli Wang¹, Lishan Ke⁵, Yi Dou⁶, Shouzhe Li⁷, Xiaomei Yu⁸


Vulnerabilities in cryptocurrencies facilitate adversarial attacks, so attackers have incentives to increase their rewards through strategic behavior. Block withholding (BWH) attacks are one such behavior: attackers withhold blocks in target pools to subvert the blockchain ecosystem. BWH attacks can also overwhelm countermeasures when combined with selfish mining or other strategic behaviors, for example, fork after withholding (FAW) attacks and power adaptive withholding (PAW) attacks. That is, attackers may be intelligent enough to dynamically adapt their behavior toward optimal attacking strategies. In this paper, we propose mixed-BWH attacks for intelligent attackers, who leverage reinforcement learning to identify the strategic behaviors that maximize their rewards. More specifically, the intelligent attackers strategically toggle among BWH, FAW, and PAW attacks, with the goal of fine-tuning their behavior to obtain maximal rewards. The search for the optimal attacking actions is carried out with reinforcement learning and is formalized as a Markov decision process. The simulation results show that, for the attackers, the rewards of the mixed strategy are much higher than those of the honest strategy. Therefore, the attackers have sufficient incentive to adopt the mixed strategy.
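
As a rough illustration of how an attacker might learn to toggle among BWH, FAW, and PAW inside a Markov decision process, the following is a minimal tabular Q-learning sketch. It is not the paper's model: the three abstract states, the reward table `MEAN_REWARD`, the uniformly random state transitions, and all hyperparameters are placeholder assumptions chosen only to make the example runnable.

```python
import random

# The three attack types named in the abstract, used here as MDP actions.
ACTIONS = ["BWH", "FAW", "PAW"]

# Hypothetical number of abstract environment states (an assumption,
# not taken from the paper).
N_STATES = 3

# Placeholder mean reward for each (state, action) pair. These numbers
# are arbitrary illustrations and are NOT results from the paper.
MEAN_REWARD = [
    [1.0, 0.6, 0.4],
    [0.5, 1.0, 0.6],
    [0.4, 0.6, 1.0],
]


def step(state, action, rng):
    """Toy environment: noisy reward, uniformly random next state."""
    reward = MEAN_REWARD[state][action] + rng.gauss(0.0, 0.05)
    next_state = rng.randrange(N_STATES)
    return next_state, reward


def q_learning(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    state = rng.randrange(N_STATES)
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick a random attack type.
            action = rng.randrange(len(ACTIONS))
        else:
            # Exploit: pick the attack type with the highest estimate.
            action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        next_state, reward = step(state, action, rng)
        # Standard Q-learning update toward the bootstrapped target.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Return the highest-valued attack type for each state."""
    return [
        ACTIONS[max(range(len(ACTIONS)), key=lambda a: q[s][a])]
        for s in range(N_STATES)
    ]


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

In this toy setting the learned greedy policy selects a different attack type in each state, which is the "toggling" behavior the abstract describes. A faithful reproduction would replace the placeholder reward table and transitions with the mining-reward model defined in the paper.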


Updated: 2020-09-01
