Logistics-involved task scheduling in cloud manufacturing with offline deep reinforcement learning
Journal of Industrial Information Integration (IF 15.7), Pub Date: 2023-05-08, DOI: 10.1016/j.jii.2023.100471
Xiaohan Wang, Lin Zhang, Yongkui Liu, Chun Zhao

As an application of industrial information integration engineering (IIIE) in manufacturing, cloud manufacturing (CMfg) integrates enterprises' manufacturing information and provides an open, shared platform for processing manufacturing tasks with distributed manufacturing services. Assigning tasks to manufacturing enterprises on the CMfg platform calls for effective scheduling algorithms. In recent years, deep reinforcement learning (DRL) has been widely applied to cloud manufacturing scheduling problems (CMfg-SPs) because of its strong generalization and fast response capability. However, current DRL algorithms must learn by trial and error through online interaction with the environment, which is costly and not permitted on a real CMfg platform. This paper proposes a novel offline DRL scheduling algorithm that alleviates the online trial-and-error issue while retaining DRL's original advantages. First, we describe the system model of CMfg-SPs and propose a sequential Markov decision process modeling strategy in which all tasks are regarded as a single agent. Then, we introduce the decision transformer (DT) framework, which converts the online scheduling decision-making problem into an offline classification problem. Finally, we construct an attention-based model as the agent's policy and train it offline under the DT architecture. Experimental results indicate that the proposed method consistently matches or exceeds online DRL algorithms, including double deep Q-network (DDQN), deep recurrent Q-network (DRQN), and proximal policy optimization (PPO), as well as the offline learning algorithm behavior cloning (BC), in terms of scheduling performance and model generalization.
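To make the abstract's decision-transformer idea concrete, below is a minimal sketch (not the authors' code, which is not shown on this page) of a DT-style policy for discrete service assignment. It illustrates how offline scheduling reduces to classification: a causal transformer consumes interleaved (return-to-go, state, action) tokens from logged trajectories and is trained with cross-entropy to predict the next service choice. All dimensions (STATE_DIM, N_SERVICES, D_MODEL, CONTEXT_LEN) and the random stand-in batch are illustrative assumptions.

```python
# Minimal decision-transformer-style scheduling policy (illustrative sketch).
import torch
import torch.nn as nn

STATE_DIM = 16    # assumed per-step feature size (task + logistics features)
N_SERVICES = 8    # assumed number of candidate manufacturing services
D_MODEL = 64
CONTEXT_LEN = 20  # number of (return, state, action) steps in the context

class SchedulingDT(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed_rtg = nn.Linear(1, D_MODEL)           # return-to-go token
        self.embed_state = nn.Linear(STATE_DIM, D_MODEL)
        self.embed_action = nn.Embedding(N_SERVICES, D_MODEL)
        self.embed_t = nn.Embedding(CONTEXT_LEN, D_MODEL)
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(D_MODEL, N_SERVICES)       # classify next service

    def forward(self, rtg, states, actions):
        # rtg: (B, T, 1), states: (B, T, STATE_DIM), actions: (B, T) long
        B, T, _ = states.shape
        pos = self.embed_t(torch.arange(T, device=states.device))
        # Interleave tokens as (R_1, s_1, a_1, R_2, s_2, a_2, ...).
        tokens = torch.stack(
            [self.embed_rtg(rtg) + pos,
             self.embed_state(states) + pos,
             self.embed_action(actions) + pos], dim=2
        ).reshape(B, 3 * T, D_MODEL)
        # Causal mask: each token attends only to earlier tokens, so the
        # state token at step t never sees the action it must predict.
        mask = torch.triu(torch.ones(3 * T, 3 * T, dtype=torch.bool,
                                     device=states.device), diagonal=1)
        h = self.encoder(tokens, mask=mask)
        # Read out action logits at the state-token positions (1, 4, 7, ...).
        return self.head(h[:, 1::3, :])                  # (B, T, N_SERVICES)

# Offline training is plain supervised classification on logged trajectories.
model = SchedulingDT()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
rtg = torch.randn(32, CONTEXT_LEN, 1)                    # stand-in offline batch
states = torch.randn(32, CONTEXT_LEN, STATE_DIM)
actions = torch.randint(0, N_SERVICES, (32, CONTEXT_LEN))
logits = model(rtg, states, actions)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, N_SERVICES), actions.reshape(-1))
opt.zero_grad()
loss.backward()
opt.step()
```

At deployment time, a DT-style policy is typically conditioned on a high target return-to-go so that it imitates the better trajectories in the offline dataset rather than the average logged behavior.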

Updated: 2023-05-13