Neurocomputing

Volume 425, 15 February 2021, Pages 149-159

Brief papers
Active disturbance rejection controller for multi-area interconnected power system based on reinforcement learning

https://doi.org/10.1016/j.neucom.2020.03.070

Abstract

In this paper, an Active Disturbance Rejection Controller (ADRC) tuned by the Q-learning algorithm of Reinforcement Learning (RL) is proposed for multi-area interconnected power systems. Excessive load changes can destabilize such systems. The ADRC is therefore adopted, for its strong disturbance-rejection performance, to keep the frequency within its rated range, while the Q-learning algorithm selects the controller parameters adaptively. Finally, simulation experiments on traditional and deregulated three-area interconnected power systems demonstrate the effectiveness of the proposed method and show that reinforcement learning can indeed solve the problem of controller parameter tuning.

Introduction

With the development of the electricity market, the demand for electrical energy keeps increasing, which has led to the emergence of more and more large-scale power grids and stronger links between grid areas. However, multi-area interconnected power systems may in turn operate unsafely. This is mainly because the tie-line power between two areas is easily affected by load disturbances, which can cause overload problems and lead to widespread power outages [1]. Therefore, when the system is subjected to load disturbances, the system frequency must be kept within a certain range and the tie-line exchanged power must be held at its planned value for the system to operate safely. This is the problem studied in load frequency control (LFC), part of generalized automatic generation control (AGC) [2].

The earliest controller for LFC was the traditional PID. Reference [3] applied a PID controller to a two-area interconnected power system, and the simulation results show that it can effectively control the load frequency. However, the PID controller struggles to balance settling time and overshoot. At present, there are two main types of control methods for LFC. One optimizes the parameters of the PID controller with bio-inspired algorithms such as particle swarm optimization (PSO) [4], [5], fuzzy control algorithms [6], [7], and genetic algorithms (GA) [8], [9]. The other relies on modern control theory, such as predictive control [10], [11], robust control [12], [13], and adaptive control [14]. However, without precise model information, the performance of these methods degrades greatly. Jingqing Han devoted more than a decade to control theory before Active Disturbance Rejection Control (ADRC) emerged [15], [16]. On this basis, Dr. Gao further developed Linear Active Disturbance Rejection Control (LADRC) and simplified the controller parameters [17], which made the industrialization of ADRC possible. Since then, ADRC has been applied and researched by more and more experts and scholars because it does not require model information of the system. ADRC has also demonstrated its superiority in many fields, such as underwater vehicles [18], ship course control [19], and quadrotor control [20].

The tuning and optimization of ADRC controller parameters has always been a key research topic, and currently many artificial-intelligence algorithms are used to train a set of parameters offline. However, in an actual system, uncertain factors may change the model parameters or structure, so that the original controller parameters can no longer ensure good operation of the system. It is therefore necessary to design an online, adaptive parameter-optimization method. In recent years, adaptive parameters have usually been obtained with PID controllers based on fuzzy control algorithms [21], [22]. However, the formulation of fuzzy rules depends on model information. Therefore, this paper proposes an ADRC controller based on Q-learning, which does not rely on a model and can adjust the parameters online.

Reinforcement Learning (RL) finds an optimal strategy through constant interaction between the agent and the environment, where the environment can be uncertain. When the agent 'communicates' with the environment through an action, the environment returns a reward to the agent, by which the action can be evaluated [23]. Q-learning is an off-policy RL algorithm first proposed by Watkins [24] in 1989. Because it is model-free and converges strongly, Q-learning is most often applied to robot path planning in complex environments [25], [26]. In addition, Q-learning has shown success in optimal tracking control [27], [28]. So far, few studies combine Q-learning with parameter optimization. Reference [7] presents a Fuzzy-Proportional-Integral-Derivative (Fuzzy-PID) controller for a three-area interconnected power system, optimizes the controller parameters with the mine blast algorithm (MBA), and verifies the method by simulation experiments. This article applies the LADRC controller to the LFC system studied in [7] and uses Q-learning to obtain the adaptive parameters of the LADRC. The simulation results, obtained in MATLAB, are compared with those reported in [7], and the comparison of the load frequency responses implies that Q-learning performs well in parameter adaptation.
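For reference, the standard tabular Q-learning update underlying this approach is (notation ours):

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]$$

where α is the learning rate and γ is the discount factor.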

The main novel contributions of this article are summarized as follows:

  • (1)

    A reduced-order LADRC controller is designed to reduce the impact of load frequency disturbances on traditional and deregulated power systems;

  • (2)

    The off-policy Q-learning algorithm of reinforcement learning is applied to obtain the adaptive parameters of the LADRC controller, where the model information can be unknown;

  • (3)

    To avoid unnecessary exploration of actions, the ε of the ε-greedy policy in the Q-learning algorithm is improved (a decaying schedule of the kind sketched below).
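The paper's exact ε schedule is not given in this excerpt; a common improvement of this kind decays ε over episodes so that exploration shrinks as the Q table converges. A minimal sketch, in which the decay law and all constants are illustrative assumptions rather than the authors' exact rule:

```python
import numpy as np

def epsilon_greedy(Q, state, n_actions, episode, rng,
                   eps_start=0.9, eps_min=0.05, decay=0.995):
    """Select an action with a decaying epsilon-greedy policy.

    The decay schedule is an illustrative assumption, not the authors'
    exact rule: epsilon shrinks each episode so that early episodes
    explore and later ones mostly exploit the learned Q table.
    """
    eps = max(eps_min, eps_start * decay ** episode)
    if rng.random() < eps:
        return int(rng.integers(n_actions))   # explore: random action
    return int(np.argmax(Q[state]))           # exploit: greedy action
```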

Multi-area interconnected power system

Interconnection between areas has many advantages, such as improving the safety and reliability of the power systems, but the coupling between regions also makes the system more complicated to control. For the LFC system, the control objectives are as follows (a standard signal combining them is given after the list).

  • (1)

    When the system suffers load disturbances, the frequency should be kept within a certain range.

  • (2)

    The tie-line exchanged power is maintained within a certain range between two areas.
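In the standard LFC formulation, these two objectives are combined into the area control error (ACE) of area i, which the controller drives to zero (notation ours):

$$\mathrm{ACE}_i = \Delta P_{\mathrm{tie},i} + B_i \, \Delta f_i$$

where ΔP_tie,i is the tie-line power deviation, Δf_i is the frequency deviation, and B_i is the frequency bias coefficient of area i.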

For coupling issues of multi-area interconnected …

LADRC controller

Recently, there have been many studies on observers for disturbance estimation, such as [32], [33], in which the system output is used to estimate unknown faults. Different from these methods, the LADRC controller, without knowing the model of the controlled plant, can use a linear extended state observer (LESO) to estimate the system disturbance as an extended state of the observer and then compensate for it, so as to suppress the influence of the disturbance on the system. On the premise of knowing the order of the …
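As a concrete reference, below is a minimal discrete-time sketch of a first-order LADRC with a second-order LESO under Gao's bandwidth parameterization (observer bandwidth ωo, controller bandwidth ωc). This is our simplification; the paper's reduced-order design is not reproduced in this excerpt:

```python
class FirstOrderLADRC:
    """Minimal first-order LADRC sketch (our simplification).

    A second-order LESO estimates the output (z1 ~ y) and the total
    disturbance (z2 ~ f); the control law cancels z2, with observer
    gains set by the bandwidth parameterization beta1 = 2*wo,
    beta2 = wo**2 and controller gain kp = wc.
    """

    def __init__(self, wo, wc, b0, dt):
        self.b1, self.b2 = 2.0 * wo, wo ** 2   # LESO gains
        self.kp = wc                            # controller gain
        self.b0, self.dt = b0, dt
        self.z1 = self.z2 = self.u = 0.0

    def update(self, y, r=0.0):
        e = y - self.z1                         # observer innovation
        # Euler integration of the LESO dynamics (uses previous u)
        self.z1 += self.dt * (self.z2 + self.b1 * e + self.b0 * self.u)
        self.z2 += self.dt * (self.b2 * e)
        # cancel the estimated total disturbance z2
        self.u = (self.kp * (r - self.z1) - self.z2) / self.b0
        return self.u
```

With this structure only ωo, ωc and the input gain b0 remain to be tuned, which is what makes a tabular Q-learning search over the gains tractable.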

LADRC controller based on reinforcement learning

The details of Q-learning are introduced in the appendix. The following is the process of designing the LADRC controller based on reinforcement learning.
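The full procedure is not reproduced in this excerpt; the sketch below, reusing the epsilon_greedy and FirstOrderLADRC helpers above, shows one plausible arrangement in which candidate gain pairs form the action set and the reward penalizes the integrated squared frequency deviation. The action grid, the toy surrogate plant, and the reward form are all illustrative assumptions, not the authors' exact design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Candidate LADRC gains (illustrative grid, not the paper's values)
actions = [(wo, wc) for wo in (5.0, 10.0, 20.0) for wc in (1.0, 2.0, 4.0)]
Q = np.zeros((1, len(actions)))   # one state: a bandit-style Q table
alpha = 0.1                        # learning rate

def simulate_area(wo, wc, tps=2.0, dt=0.01, t_end=10.0):
    """Toy one-area surrogate for the LFC loop (a hypothetical stand-in
    for the paper's three-area Simulink model): a first-order lag with
    time constant tps, hit by a 0.01 pu step load disturbance and
    closed through the LADRC sketched above. Returns the ISE."""
    ctrl = FirstOrderLADRC(wo, wc, b0=1.0, dt=dt)
    df, ise = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        u = ctrl.update(df, r=0.0)
        df += dt * (-df + u - 0.01) / tps   # frequency deviation dynamics
        ise += dt * df ** 2
    return ise

for episode in range(300):
    a = epsilon_greedy(Q, 0, len(actions), episode, rng)
    reward = -simulate_area(*actions[a])     # penalize squared deviation
    Q[0, a] += alpha * (reward - Q[0, a])    # episodic update, gamma = 0
```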

Simulink results

The experimental goal is to find, by Q-learning, the adaptive parameters of the ADRC controller applied to multi-area interconnected power systems. Reference [7] proposed a Fuzzy-PID controller tuned via the MBA to control a three-area interconnected power system. Based on this model, we run the simulation experiments on the MATLAB R2018a platform; the experimental results are presented and analyzed below.

Robustness analysis

Controller performance evaluation should consider not only the dynamic response of the system under disturbances but also the robustness of the controller under uncertain system parameters.

Taking the model in Section 5.1 as an example, it is assumed that the governor time constant Th, the reheat turbine time constants Tt and Tr, and the power system time constant Tps in each area are perturbed by ±20% and ±40% while the obtained Q table is kept unchanged. The performance of each area is evaluated …
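One way to organize such a sweep, reusing the toy surrogate from the sketches above (only Tps exists in that surrogate; in the full model Th, Tt and Tr would be varied the same way), is:

```python
# Keep the learned Q table fixed and only perturb the plant time constant.
wo_best, wc_best = actions[int(np.argmax(Q[0]))]
for pct in (-0.4, -0.2, 0.0, 0.2, 0.4):
    ise = simulate_area(wo_best, wc_best, tps=2.0 * (1.0 + pct))
    print(f"Tps {pct:+.0%}: ISE = {ise:.4g}")
```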

Conclusion

In power systems, designing appropriate controllers to reduce the impact of load disturbances on the system is a significant research problem. The LADRC controller is applied to this kind of system for its good anti-disturbance performance and its independence from the system model, but choosing its parameters is difficult. Reinforcement learning is a model-free method for sequential decision problems, which can be used to solve the problem of controller parameter tuning.

Declaration of Competing Interest

We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.

CRediT authorship contribution statement

Yuemin Zheng: Conceptualization, Writing - original draft, Software. Zengqiang Chen: Validation, Resources, Writing - review & editing. Zhaoyang Huang: Data curation, Writing - original draft. Mingwei Sun: Supervision, Validation. Qinglin Sun: Supervision, Writing - review & editing.

Acknowledgement

This work was funded by the National Natural Science Foundation of China (Grant Nos. 61973175 and 61973172) and the Key Technologies R&D Program of Tianjin (Grant No. 19JCZDJC32800).

References (37)

  • R. Shankar et al., Impact of energy storage system on load frequency control for diverse sources of interconnected power system in deregulated power environment, Electr. Power Energy Syst. (2016).

  • J. Han et al., Reduced-order observer based fault estimation and fault-tolerant control for switched stochastic systems with actuator and sensor faults, ISA Trans. (2019).

  • A. Sharifi et al., Load frequency control in interconnected power system using multi-objective PID controller, 2008 IEEE Conference on Soft Computing in Industrial Applications (SMCia/08) (2008).

  • B. Dubey et al., Optimization of PID controller parameters using PSO for two area load frequency control, IAES Int. J. Robot. Autom. (2019).

  • R. Singh et al., Comparison of automatic load frequency control in two area power systems using PSO algorithm based PID controller and conventional PID controller, J. Phys. (2019).

  • R.V. Santhi et al., Implementation of fuzzy-PID controller with demand response control to LFC model in real-time using LabVIEW, Int. J. Fuzzy Comput. Model. (2019).

  • A. Fathy et al., Optimal design of fuzzy-PID controller for deregulated LFC of multi-area power system via mine blast algorithm, Neural Comput. Appl. (2018).

  • J. Mudi et al., Multi-verse optimization algorithm for LFC of power system with imposed nonlinearities using three-degree-of-freedom PID controller, Iran. J. Sci. Technol. Trans. Electr. Eng. (2019).

Yue-min Zheng was born in 1996. She received the B.Sc. degree from Shijiazhuang Tiedao University, Shijiazhuang, China, in 2018 and is currently a graduate student at Nankai University, Tianjin, China. Her current research interests include Active Disturbance Rejection Control and reinforcement learning.

Zeng-Qiang Chen was born in 1964. He received the B.S., M.E. and Ph.D. degrees from Nankai University in 1987, 1990, and 1997, respectively. He is currently a professor of control theory and engineering at Nankai University and deputy director of the Institute of Robotics and Information Automation. His current research interests include intelligent predictive control, chaotic systems and complex dynamic networks, and multi-agent system control.

Zhao-yang Huang was born in 1995. He received the B.Sc. degree in automation from Nankai University, Tianjin, China, in 2018 and is currently a master's candidate at Nankai University, Tianjin, China. His current research interests are intelligent algorithms and intelligent control.

Ming-wei Sun was born in 1972. He received the Ph.D. degree from the Department of Computer and Systems Science, Nankai University, Tianjin, China, in 2000. From 2000 to 2008, he was a Flight Control Engineer with the Beijing Electro-mechanical Engineering Research Institute, Beijing, China. Since 2009, he has been with Nankai University, where he is currently a Professor. His research interests include flight control, guidance, model predictive control, active disturbance rejection control, and nonlinear optimization.

Qing-lin Sun received the B.E. and M.E. degrees in control theory and control engineering from Tianjin University, Tianjin, China, in 1985 and 1990, respectively, and the Ph.D. degree in control science and engineering from Nankai University, Tianjin, China, in 2003. He is currently a Professor in the Intelligence Predictive Adaptive Control Laboratory of Nankai University and associate dean of the College of Artificial Intelligence. His research interests include self-adaptive control, modeling and control of flexible spacecraft, and embedded control systems.
