Reinforcement learning in discrete action space applied to inverse defect design
Journal of Physics Communications ( IF 1.1 ) Pub Date : 2021-03-22 , DOI: 10.1088/2399-6528/abe591
Troy D Loeffler 1, 2 , Suvo Banik 1, 2 , Tarak K Patra 2, 3 , Michael Sternberg 2 , Subramanian K R S Sankaranarayanan 1, 2

Reinforcement learning (RL) algorithms that include Monte Carlo Tree Search (MCTS) have found tremendous success in computer games such as Go, Shogi and Chess. Such learning algorithms have demonstrated super-human capabilities in navigating through an exhaustive discrete action search space. Motivated by their success in computer games, we demonstrate that RL can be applied to inverse materials design problems. We deploy RL for a representative case of the optimal atomic-scale inverse design of extended defects via rearrangement of chalcogen (e.g. S) vacancies in 2D transition metal dichalcogenides (e.g. MoS2). These defect rearrangements and their dynamics are important from the perspective of tunable phase transitions in 2D materials, i.e. 2H (semiconducting) to 1T (metallic) in MoS2. We demonstrate the ability of MCTS interfaced with a reactive molecular dynamics simulator to efficiently sample the defect phase space and perform inverse design: starting from randomly distributed S vacancies, the optimal rearrangement of defects corresponds to a line defect of S vacancies. We compare MCTS performance with evolutionary optimization, i.e. genetic algorithms (GA), and show that MCTS converges to a better solution (lower objective) in fewer evaluations than GA. We also comprehensively evaluate and discuss the effect of MCTS hyperparameters on the convergence to a solution. Overall, our study demonstrates the effectiveness of using RL approaches that operate in discrete action space for inverse defect design problems.
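
As a rough illustration of the search loop described in the abstract, the following is a minimal sketch of MCTS with UCT selection over discrete vacancy moves. It is not the authors' implementation: the 4x4 periodic lattice, the toy `energy` function (which simply rewards vacancies sitting side by side along a row and stands in for the reactive molecular dynamics evaluation), and all names and default hyperparameters (`iterations`, `c`, `depth`) are assumptions made for this example.

```python
import math
import random

SIZE = 4  # hypothetical 4x4 periodic lattice of S sites


def energy(state):
    """Toy objective (assumption): -1 for each vacancy with a vacancy to its right."""
    return -sum(1 for (r, c) in state if (r, (c + 1) % SIZE) in state)


def actions(state):
    """Discrete actions: move one vacancy to an empty nearest-neighbour site."""
    moves = []
    for (r, c) in sorted(state):
        for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            target = ((r + dr) % SIZE, (c + dc) % SIZE)
            if target not in state:
                moves.append(((r, c), target))
    return moves


def apply_move(state, move):
    src, dst = move
    return (state - {src}) | {dst}


class Node:
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.untried = actions(state)
        self.visits = 0
        self.total = 0.0  # accumulated reward, where reward = -energy

    def uct_child(self, c):
        # UCT: mean reward plus an exploration bonus weighted by c.
        return max(
            self.children,
            key=lambda ch: ch.total / ch.visits
            + c * math.sqrt(math.log(self.visits) / ch.visits),
        )


def mcts(start, iterations=500, c=1.0, depth=8, seed=0):
    rng = random.Random(seed)
    root = Node(frozenset(start))
    best_state, best_e = root.state, energy(root.state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes.
        while not node.untried and node.children:
            node = node.uct_child(c)
        # 2. Expansion: add one child for an untried move.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(apply_move(node.state, move), parent=node)
            node.children.append(child)
            node = child
        # 3. Rollout: random moves, remembering the lowest energy seen.
        state, e_min = node.state, energy(node.state)
        if e_min < best_e:
            best_state, best_e = state, e_min
        for _ in range(depth):
            state = apply_move(state, rng.choice(actions(state)))
            e = energy(state)
            e_min = min(e_min, e)
            if e < best_e:
                best_state, best_e = state, e
        # 4. Backpropagation: push the reward up to the root.
        while node is not None:
            node.visits += 1
            node.total += -e_min
            node = node.parent
    return best_state, best_e


if __name__ == "__main__":
    scattered = {(0, 0), (1, 2), (2, 1), (3, 3)}  # isolated vacancies
    print(mcts(scattered))
```

In this toy setting the lowest possible energy is reached when all four vacancies share one row, a crude analogue of the line defect reported in the abstract. The exploration constant `c` and the rollout `depth` are examples of the kind of hyperparameters whose effect on convergence the paper evaluates.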




Updated: 2021-03-22