Twin delayed deep deterministic policy gradient-based deep reinforcement learning for energy management of fuel cell vehicle integrating durability information of powertrain
Energy Conversion and Management (IF 10.4) Pub Date: 2022-11-14, DOI: 10.1016/j.enconman.2022.116454
Yuanzhi Zhang, Caizhi Zhang, Ruijia Fan, Shulong Huang, Yun Yang, Qianwen Xu

Deep reinforcement learning (DRL)-based energy management strategies (EMSs) are attractive for fuel cell vehicles (FCVs). Nevertheless, the fuel economy and the lifespan durability of the proton exchange membrane fuel cell (PEMFC) stack and the lithium-ion battery (LIB) may not be optimized simultaneously, since the transient degradation of the PEMFC stack and LIB is generally not considered in DRL-based EMSs. Furthermore, an inappropriate action space and an overestimated value function can lead to a suboptimal EMS for on-line control. To this end, this research formulates a twin delayed deep deterministic policy gradient (TD3)-based EMS that integrates durability information of the PEMFC stack and LIB; it interacts with the vehicle operating states to continuously control the hybrid powertrain and limits the overestimation of the DRL value function, ensuring a maximum multi-objective reward at each moment. Unlike traditional DRL-based EMSs, the multi-objective reward function in this study is enlarged to incorporate hydrogen consumption, a state of charge (SOC)-sustaining penalty, and transient lifespan degradation information of the PEMFC stack and LIB in both off-line training and on-line control. The results demonstrate that the proposed EMS drastically reduces training time and computational burden. Meanwhile, compared with deep Q-network (DQN)-based and deep deterministic policy gradient (DDPG)-based EMSs over various real-world urban and standard driving cycles, the proposed EMS reduces hydrogen consumption by at least 9.76% and 1.07%, and slows total powertrain degradation by at least 9.11% and 2.62%, respectively.
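The two ingredients the abstract highlights can be sketched in a few lines: a multi-objective reward that penalizes hydrogen use, SOC deviation, and transient degradation of the stack and battery, and TD3's clipped double-Q target, where taking the minimum of twin critics limits value-function overestimation. This is a minimal illustrative sketch, not the authors' implementation; all weights, the SOC reference, and the placeholder critic/policy callables are assumptions.

```python
import numpy as np

def multi_objective_reward(h2_g, soc, fc_deg, bat_deg,
                           soc_ref=0.6, w_h2=1.0, w_soc=50.0,
                           w_fc=1.0, w_bat=1.0):
    """Negative weighted cost over one control step: hydrogen mass,
    SOC-sustaining penalty, and transient degradation increments of the
    PEMFC stack and LIB. Weights and soc_ref are illustrative only."""
    return -(w_h2 * h2_g
             + w_soc * (soc - soc_ref) ** 2
             + w_fc * fc_deg
             + w_bat * bat_deg)

def td3_target(reward, next_state, q1, q2, policy,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               act_low=-1.0, act_high=1.0):
    """Clipped double-Q target of TD3: smooth the target action with
    clipped noise, then bootstrap from the minimum of the twin critics
    so the value function is not overestimated."""
    noise = np.clip(np.random.normal(0.0, noise_std), -noise_clip, noise_clip)
    action = np.clip(policy(next_state) + noise, act_low, act_high)
    return reward + gamma * min(q1(next_state, action),
                                q2(next_state, action))
```

In the full algorithm this target trains both critics, while the actor is updated at a delayed (lower) frequency against one critic; the `min` over the twin critics is what distinguishes TD3 from DDPG here.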




Updated: 2022-11-15