Convergence analysis of the deep neural networks based globalized dual heuristic programming
Automatica ( IF 4.8 ) Pub Date : 2020-08-26 , DOI: 10.1016/j.automatica.2020.109222
Jong Woo Kim , Tae Hoon Oh , Sang Hwan Son , Dong Hwi Jeong , Jong Min Lee

The globalized dual heuristic programming (GDHP) algorithm is a special form of approximate dynamic programming (ADP) that solves the Hamilton–Jacobi–Bellman (HJB) equation for control-affine systems with a quadratic cost function. This study incorporates deep neural networks (DNNs) as function approximators in order to exploit their ability to represent high-dimensional function spaces. An elementwise error bound on the costate function sequence is newly derived, and its convergence property is established. In the approximated function space, a uniform ultimate boundedness (UUB) condition is obtained for the weights of general multilayer NNs. It is further proved that, under gradient descent applied to the moving-target regression problem, the UUB bound gradually converges to a value that contains only the approximation reconstruction error. The proposed method is demonstrated on a continuous reactor control problem, with the aim of obtaining a control policy valid for multiple initial states, which justifies the need for a DNN structure in such cases.
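The "moving-target regression" the abstract refers to can be illustrated with a minimal sketch (not the authors' implementation): a small NN value approximator is trained by gradient descent against a bootstrapped Bellman target that itself depends on the current weights, for a toy control-affine system with quadratic stage cost. All names, dimensions, and hyperparameters below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class ValueNet:
    """Hypothetical two-layer NN approximating the value function V(x)."""
    def __init__(self, n_in, n_hidden):
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.W2 = rng.normal(0.0, 0.1, (1, n_hidden))

    def forward(self, x):
        h = np.tanh(self.W1 @ x)            # hidden activation
        return (self.W2 @ h).item(), h      # scalar value estimate

    def grad_step(self, x, target, lr=5e-3):
        """One gradient-descent step on the squared regression error."""
        v, h = self.forward(x)
        e = v - target                      # error against the moving target
        self.W2 -= lr * e * h[None, :]
        dh = (self.W2.flatten() * e) * (1.0 - h**2)
        self.W1 -= lr * np.outer(dh, x)
        return e

# Toy control-affine dynamics x' = A x + B u, quadratic stage cost.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q, R, gamma = np.eye(2), np.eye(1), 0.95

net = ValueNet(2, 16)
for _ in range(2000):
    x = rng.uniform(-1.0, 1.0, 2)
    u = rng.uniform(-1.0, 1.0, 1)           # exploratory control input
    x_next = A @ x + B @ u
    cost = x @ Q @ x + u @ R @ u
    v_next, _ = net.forward(x_next)
    target = cost + gamma * v_next           # target moves as weights update
    net.grad_step(x, target)

v0, _ = net.forward(np.zeros(2))             # value estimate at the origin
```

In the paper's setting the residual regression error that never vanishes is the NN reconstruction error; in this sketch it shows up as the steady-state error of `grad_step` against the drifting target.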




Updated: 2020-08-27