Optimal control for unknown mean-field discrete-time system based on Q-Learning
International Journal of Systems Science (IF 4.9), Pub Date: 2021-05-20, DOI: 10.1080/00207721.2021.1929554
Yingying Ge, Xikui Liu, Yan Li

Solving the optimal mean-field control problem usually requires complete system information. This paper discusses a Q-learning algorithm for the optimal control problem of an unknown discrete-time mean-field stochastic system. First, a suitable transformation turns the stochastic mean-field control problem into a deterministic one. Second, the H matrix is obtained from the Q-function, and the control strategy depends only on the H matrix, so solving for the H matrix is equivalent to solving the mean-field optimal control problem. The proposed Q-learning method iteratively solves for the H matrix and the gain matrix from the supplied system state information, without knowledge of the system parameters. Next, the sequence of control matrices obtained by Q-learning is proved to converge to the optimal control, which shows the theoretical feasibility of the method. Finally, two simulation cases verify the effectiveness of the Q-learning algorithm.
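The abstract does not give the mean-field transformation or the algorithm itself, so the following is only a minimal sketch of the general idea on a standard deterministic linear-quadratic problem, which is the kind of problem the abstract says the mean-field problem is reduced to. It assumes a Q-function of the form Q(x, u) = zᵀHz with z = [x; u], estimates H by least squares from sampled transitions, and updates the gain from the blocks of H. The matrices `A`, `B`, `Qc`, `Rc` and all function names are hypothetical; `A` and `B` are used only to generate data and are never read by the learner. This is not the authors' mean-field algorithm.

```python
import numpy as np


def quad_features(z):
    # Basis such that z' H z == quad_features(z) @ theta, where theta is the
    # upper triangle of the symmetric matrix H (off-diagonals counted twice).
    i, j = np.triu_indices(len(z))
    scale = np.where(i == j, 1.0, 2.0)
    return scale * z[i] * z[j]


def theta_to_H(theta, n):
    # Rebuild the symmetric H matrix from its upper-triangular entries.
    H = np.zeros((n, n))
    i, j = np.triu_indices(n)
    H[i, j] = theta
    H[j, i] = theta
    return H


def q_learning_lq(step, cost, n_x, n_u, K0, iters=20, samples=200, seed=0):
    # Policy iteration on the Q-function. `step` and `cost` are black boxes:
    # the learner only sees (x, u, cost, x_next) samples, never A or B.
    # K0 must be a stabilizing gain for the control law u = -K x.
    rng = np.random.default_rng(seed)
    K = K0.copy()
    for _ in range(iters):
        Phi, r = [], []
        for _ in range(samples):
            x = rng.standard_normal(n_x)
            u = rng.standard_normal(n_u)      # exploratory input
            x_next = step(x, u)
            u_next = -K @ x_next              # input the current policy would apply
            z = np.concatenate([x, u])
            z_next = np.concatenate([x_next, u_next])
            # Bellman equation: z' H z - z_next' H z_next = cost(x, u)
            Phi.append(quad_features(z) - quad_features(z_next))
            r.append(cost(x, u))
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(r), rcond=None)
        H = theta_to_H(theta, n_x + n_u)
        # Policy improvement: the new gain depends only on blocks of H.
        K = np.linalg.solve(H[n_x:, n_x:], H[n_x:, :n_x])
    return H, K


# Hypothetical example system (assumption, not taken from the paper).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Qc = np.eye(2)
Rc = np.array([[1.0]])

H, K = q_learning_lq(
    step=lambda x, u: A @ x + B @ u,
    cost=lambda x, u: x @ Qc @ x + u @ Rc @ u,
    n_x=2,
    n_u=1,
    K0=np.zeros((1, 2)),  # A is stable here, so the zero gain is stabilizing
)
print("learned gain K:", K)
```

In this sketch the learned gain can be checked against the gain obtained from the Riccati equation when `A` and `B` are known; the two should agree up to numerical error, which mirrors the convergence claim of the abstract in the simplest deterministic setting.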




Updated: 2021-05-20