Structure-Guided Processing Path Optimization with Deep Reinforcement Learning
arXiv - CS - Systems and Control. Pub Date: 2020-09-21, DOI: arxiv-2009.09706
Johannes Dornheim, Lukas Morand, Samuel Zeitvogel, Tarek Iraki, Norbert Link, Dirk Helm

A major goal of material design is the inverse optimization of processing-structure-property relationships. In this paper, we propose and investigate a deep reinforcement learning approach for the optimization of processing paths. The goal is to find optimal processing paths in the material structure space that lead to target structures, which have been identified beforehand as yielding the desired material properties. The contribution completes the desired inversion of the processing-structure-property chain in a flexible and generic way. Because the relation between properties and structures is generally non-unique, a whole set of goal structures that lead to the desired properties can typically be identified. Our proposed method optimizes processing paths from a start structure to one of these equivalent goal structures. The algorithm learns to find near-optimal paths by interacting with the structure-generating process. It is guided by structure descriptors, which serve as process state features, and by a reward signal formulated from a distance function in the structure space. The reinforcement learning algorithm is model-free: it learns through trial and error while interacting with the process and does not rely on processing data sampled a priori. We instantiate and evaluate the proposed method by optimizing paths of a generic metal forming process to reach near-optimal structures, which are represented by one-point statistics of crystallographic textures.
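The abstract describes a reward built from a distance function in structure space, with a set of equivalent goal structures rather than a single target. The following is a minimal sketch of that idea only, not the authors' implementation. The Euclidean metric, the improvement-based reward shape, and the names `distance`, `nearest_goal_distance`, and `reward` are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from typing import Sequence

# A structure descriptor is assumed here to be a plain vector of floats.
Descriptor = Sequence[float]


def distance(a: Descriptor, b: Descriptor) -> float:
    """Euclidean distance between two descriptors (an assumed metric)."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_goal_distance(state: Descriptor, goals: Sequence[Descriptor]) -> float:
    """Distance from the current structure to the closest equivalent goal."""
    if not goals:
        raise ValueError("at least one goal structure is required")
    return min(distance(state, g) for g in goals)


def reward(prev_state: Descriptor, next_state: Descriptor,
           goals: Sequence[Descriptor]) -> float:
    """Hypothetical reward: how much one processing step reduced the
    distance to the nearest goal structure (positive means progress)."""
    return (nearest_goal_distance(prev_state, goals)
            - nearest_goal_distance(next_state, goals))


if __name__ == "__main__":
    # Two equivalent goal structures in a toy 2-D descriptor space.
    goals = [(0.0, 0.0), (10.0, 10.0)]
    print(nearest_goal_distance((3.0, 4.0), goals))   # 5.0
    print(reward((3.0, 4.0), (0.0, 0.0), goals))      # 5.0
```

Taking the minimum over the goal set means the agent is rewarded for approaching whichever equivalent goal structure is closest, so a step that moves away from one goal but toward another can still count as progress.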

Updated: 2020-11-03