A fast decision-making method for process planning with dynamic machining resources via deep reinforcement learning
Journal of Manufacturing Systems (IF 12.2), Pub Date: 2021-01-18, DOI: 10.1016/j.jmsy.2020.12.015
Wenbo Wu, Zhengdong Huang, Jiani Zeng, Kuan Fan

Mass customized production brings great uncertainty to computer-aided process planning (CAPP). Current CAPP methods based on heuristic optimization assume in advance that manufacturing resources are static and produce a deterministic plan that cannot cope with uncertainty in the manufacturing environment. As a promising method for solving complex and dynamic decision-making problems, deep reinforcement learning is employed in this paper for process planning, aiming to improve response speed by exploiting the reusability and expandability of past decision-making experience. To simplify the decision procedure, two different types of decisions, operation sequencing and resource selection, are fused into one by integrating environment states and agent behaviors in matrix form. Then, a masking algorithm is developed to screen out currently inexecutable machining operations at each decision step, and process planning datasets are generated for training and testing according to the actual processing logic. Next, the Monte Carlo method and a deep learning algorithm are used to evaluate and improve the process policy, respectively. Finally, the search capability of the proposed method for both static and dynamic manufacturing resources is tested in case studies, and the results are discussed. The results show that the proposed approach solves the planning problem more efficiently than current optimization-based approaches.
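
The abstract does not specify the state encoding, the masking rule, or the network, so the following is only a minimal sketch of the general idea under stated assumptions: each joint decision is an (operation, resource) cell of a matrix, a mask removes cells whose operation is finished, blocked by an unfinished predecessor, or assigned to an unavailable resource, and Monte Carlo returns are computed from a finished episode. All names here (`feasible_mask`, `masked_softmax`, `monte_carlo_returns`, `precedence`, `available`) are hypothetical and are not taken from the paper.

```python
import math


def feasible_mask(done, precedence, available):
    """Return mask[o][r] = True if operation o may run on resource r now.

    done:       set of already completed operation indices
    precedence: dict mapping an operation to the set of operations that
                must be completed before it (assumed representation)
    available:  available[o][r] is True if resource r can currently
                perform operation o (assumed representation)
    """
    mask = []
    for o, row in enumerate(available):
        ready = o not in done and precedence.get(o, set()) <= done
        mask.append([ready and bool(a) for a in row])
    return mask


def masked_softmax(scores, mask):
    """Softmax over the whole matrix; masked cells get probability 0."""
    flat = [(s, m) for rs, rm in zip(scores, mask) for s, m in zip(rs, rm)]
    feasible = [s for s, m in flat if m]
    if not feasible:
        raise ValueError("no executable (operation, resource) pair")
    top = max(feasible)  # subtract the max for numerical stability
    exps = [math.exp(s - top) if m else 0.0 for s, m in flat]
    total = sum(exps)
    n_cols = len(scores[0])
    probs = [e / total for e in exps]
    return [probs[i:i + n_cols] for i in range(0, len(probs), n_cols)]


def monte_carlo_returns(rewards, gamma=1.0):
    """Discounted return G_t for every step of one finished episode."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


# Toy instance: operations 1 and 2 both require operation 0 first.
precedence = {1: {0}, 2: {0}}
available = [[True, False], [True, True], [False, True]]
scores = [[0.0, 0.0] for _ in available]  # stand-in for network outputs

start_probs = masked_softmax(scores, feasible_mask(set(), precedence, available))
later_probs = masked_softmax(scores, feasible_mask({0}, precedence, available))
episode_returns = monte_carlo_returns([-1, -2, -3])
```

In this toy instance only operation 0 on resource 0 is selectable at the start; once operation 0 is done, the three remaining feasible cells share the probability mass. Treating each matrix cell as one action is what lets a single policy output cover both sequencing and resource selection, and setting `available` entries to `False` is one simple way to model a resource dropping out between decision steps.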




Updated: 2021-01-19