Brief papers
Active disturbance rejection controller for multi-area interconnected power system based on reinforcement learning
Introduction
With the development of the electricity market, the demand for electrical energy keeps growing, which has led to the emergence of more and more large-scale power grids and stronger links between grid areas. However, multi-area interconnection may in turn make the system operate unsafely. This is mainly because the tie-line power between two areas is easily affected by load disturbances, which can cause overload problems and lead to widespread power outages [1]. Therefore, when the system is subjected to a load disturbance, for safe operation the system frequency must be kept within a certain range and the tie-line exchanged power must be held at its planned value. This is the problem studied in load frequency control (LFC) within generalized automatic generation control (AGC) [2]. The earliest LFC controller was the traditional PID. Reference [3] applied the PID controller to a two-area interconnected power system, and the simulation results show that it can effectively control the load frequency; however, the PID controller struggles to balance settling time against overshoot. At present, there are two main types of control methods for LFC. One is optimizing the parameters of a PID controller with bionic algorithms such as particle swarm optimization (PSO) [4], [5], fuzzy control algorithms [6], [7], and genetic algorithms (GA) [8], [9]. The other relies on modern control theory, such as predictive control [10], [11], robust control [12], [13], and adaptive control [14]. However, when precise model information is unavailable, the performance of these methods degrades considerably. Jingqing Han devoted more than a decade to this problem, which culminated in Active Disturbance Rejection Control (ADRC) [15], [16]. On this basis, Dr.
Gao further developed Linear Active Disturbance Rejection Control (LADRC) and simplified the controller parameters [17], which made the industrialization of ADRC possible. Since then, ADRC has been applied and studied by more and more experts and scholars for its characteristic of not requiring model information of the system, and it has demonstrated its superiority in many fields, such as underwater vehicles [18], ship course control [19] and quadrotor control [20]. The tuning and optimization of ADRC controller parameters has long been a key research topic, and currently many artificial intelligence algorithms are used to train a set of parameters offline. In an actual system, however, uncertain factors may change the model parameters or structure, so that the originally tuned controller parameters can no longer ensure good operation of the system. It is therefore necessary to design an online, adaptive parameter optimization method. In recent years, adaptive parameters have usually been obtained by PID controllers based on fuzzy control algorithms [21], [22], but the formulation of fuzzy rules depends on model information. Therefore, this paper proposes an ADRC controller based on Q-learning, which does not rely on models and can adjust parameters online.
Reinforcement Learning (RL) finds an optimal strategy through constant interaction between the agent and the environment, where the environment can be uncertain. When the agent 'communicates' with the environment through an action, the environment returns a reward to the agent, by which the action can be evaluated [23]. Q-learning is an off-policy RL algorithm, first proposed by Watkins [24] in 1989. Because it is model-free and has strong convergence properties, Q-learning is most often applied to robot path planning in complex environments [25], [26]; it has also shown success in optimal tracking control [27], [28]. To date, few studies combine Q-learning with parameter optimization. Reference [7] presents a Fuzzy-Proportional-Integral-Derivative (Fuzzy-PID) controller for a three-area interconnected power system, uses the mine blast algorithm (MBA) to optimize the controller parameters, and verifies the method by simulation. This article applies the LADRC controller to the LFC system studied in [7] and uses Q-learning to obtain the adaptive parameters of LADRC. The simulation results, obtained in MATLAB and compared with those reported in [7], show that Q-learning performs well in parameter adaptation.
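The core of Watkins' algorithm is the tabular value update, which can be sketched as follows; the state/action sizes, learning rate and discount factor here are illustrative choices, not values from the paper:

```python
import numpy as np

# Tabular Q-learning update (Watkins, 1989).  alpha is the learning
# rate and gamma the discount factor; both are illustrative values.
n_states, n_actions = 10, 4
alpha, gamma = 0.1, 0.9
Q = np.zeros((n_states, n_actions))

def q_update(s, a, r, s_next):
    """Off-policy update: bootstrap from the greedy action in s_next,
    regardless of which action the behavior policy will actually take."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])

q_update(s=0, a=1, r=1.0, s_next=2)
```

Bootstrapping from the greedy successor action is what makes the algorithm off-policy: the behavior policy used to gather experience need not be the greedy policy being learned.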
The main novel contributions of this article are summarized as follows:
- (1)
A reduced-order LADRC controller is designed to reduce the impact of load disturbances on the frequency of traditional and deregulated power systems;
- (2)
The off-policy reinforcement learning algorithm Q-learning is applied to obtain the adaptive parameters of the LADRC controller, even when the model information is unknown;
- (3)
To avoid unnecessary exploration of actions, the ε-greedy policy of the Q-learning algorithm is improved.
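The exact improvement to ε is not detailed in this snippet; one common variant with the same goal is to decay ε over episodes so the agent explores heavily at first and exploits later. A minimal sketch under that assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy(Q_row, episode, eps0=0.9, decay=0.99, eps_min=0.05):
    """Pick an action for one state: explore with probability eps,
    otherwise take the greedy action.  eps decays geometrically with
    the episode index (an assumed schedule, for illustration)."""
    eps = max(eps_min, eps0 * decay ** episode)
    if rng.random() < eps:
        return int(rng.integers(len(Q_row)))   # explore
    return int(np.argmax(Q_row))               # exploit

Q_row = np.array([0.1, 0.5, 0.2])
action = epsilon_greedy(Q_row, episode=500)    # late episode: mostly greedy
```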
Multi-area interconnected power system
Interconnection between areas has many advantages, such as improving the safety and reliability of the power system, but the coupling between regions also makes the system more complicated to control. For the LFC system, the control objectives are as follows.
- (1)
When the system suffers load disturbances, the frequency should be kept within a certain range.
- (2)
The tie-line exchanged power is maintained within a certain range between two areas.
For coupling issues of multi-area interconnected
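In standard AGC practice, the two objectives above are commonly folded into a single area control error (ACE) signal that the controller drives to zero; a minimal sketch, with an illustrative frequency bias value:

```python
def area_control_error(delta_ptie, delta_f, bias):
    """Standard AGC error signal: ACE_i = dP_tie,i + B_i * df_i.
    Driving ACE to zero satisfies both LFC objectives at once:
    zero tie-line deviation and zero frequency deviation."""
    return delta_ptie + bias * delta_f

# Example: 0.01 p.u. tie-line surplus together with a 0.02 Hz dip
# (bias B_i = 0.425 p.u./Hz is a typical textbook value, assumed here).
ace = area_control_error(delta_ptie=0.01, delta_f=-0.02, bias=0.425)
```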
LADRC controller
Recently, there have been many studies on observers for disturbance estimation, such as [32], [33], in which the system output is utilized to estimate unknown faults. Different from these methods, the LADRC controller, without knowing the model of the controlled plant, can use a LESO to estimate system disturbances, where the disturbance is an extended state of the observer, and then compensate for them so as to suppress their influence on the system. On the premise of knowing the order of the
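For a second-order plant y'' = f + b0·u with total disturbance f, the LESO treats f as a third, extended state and corrects all three estimates from the measured output. A sketch using Gao's bandwidth parameterization of the observer gains (the bandwidth and step size below are illustrative):

```python
import numpy as np

def leso_step(z, y, u, b0, wo, dt):
    """One Euler step of a third-order LESO for y'' = f + b0*u.
    z = [y_hat, ydot_hat, f_hat], where f_hat estimates the total
    disturbance.  Gains follow the bandwidth parameterization
    beta = (3*wo, 3*wo**2, wo**3) with observer bandwidth wo."""
    b1, b2, b3 = 3 * wo, 3 * wo**2, wo**3
    e = y - z[0]                          # output estimation error
    dz = np.array([z[1] + b1 * e,         # y_hat dynamics
                   z[2] + b0 * u + b2 * e,  # ydot_hat dynamics
                   b3 * e])               # f_hat dynamics
    return z + dt * dz

z = leso_step(np.zeros(3), y=1.0, u=0.0, b0=1.0, wo=10.0, dt=0.01)
```

The estimate f_hat is then canceled in the control law, typically u = (u0 − f_hat)/b0, which reduces the plant to a chain of integrators.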
LADRC controller based on reinforcement learning
The details of Q-learning are introduced in the appendix. The following describes the process of designing the LADRC controller based on reinforcement learning.
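The overall idea can be sketched as a tuning loop in which the agent selects a controller parameter, runs an episode, and is rewarded with the negative accumulated error. The candidate set, the toy first-order plant standing in for the closed LFC loop, and the reward shaping below are all illustrative assumptions, not the paper's exact design (and Python rather than the paper's MATLAB):

```python
import numpy as np

rng = np.random.default_rng(1)
candidates = [5.0, 10.0, 20.0]         # candidate bandwidths (actions)
Q = np.zeros(len(candidates))          # single-state Q-table
alpha, eps = 0.2, 0.1

def episode_cost(wo):
    """Toy stand-in for a closed-loop run: a first-order response whose
    speed grows with the chosen bandwidth.  A real evaluation would
    simulate the LFC loop and integrate |delta_f|."""
    y, cost, dt = 0.0, 0.0, 0.01
    for _ in range(200):
        y += dt * wo * (1.0 - y)       # step toward the setpoint
        cost += dt * abs(1.0 - y)      # accumulated tracking error
    return cost

for ep in range(100):
    a = int(rng.integers(len(candidates))) if rng.random() < eps \
        else int(np.argmax(Q))
    r = -episode_cost(candidates[a])   # reward = negative accumulated error
    Q[a] += alpha * (r - Q[a])         # single-state Q-learning update

best = candidates[int(np.argmax(Q))]
```

In this toy setting the fastest candidate accumulates the least error, so the learned Q-table ends up preferring it; in the actual LFC system the reward landscape is shaped by the load disturbance response rather than a bare step.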
Simulink results
The experimental objective is to find, via Q-learning, the adaptive parameters of the ADRC controller applied to the multi-area interconnected power system. Reference [7] proposed a Fuzzy-PID controller tuned via MBA to control a three-area interconnected power system. Based on this model, we carry out simulation experiments on the MATLAB R2018a platform; the experimental results are presented and analyzed below.
Robustness analysis
Controller performance evaluation should not only consider the dynamic response of the system under the influence of the disturbances, but also the robustness of the controller under uncertain system parameters.
Taking the model in Section 5.1 as an example, it is assumed that the governor time constant Th, the reheat turbine time constants Tt and Tr, and the power system time constant Tps in each area are perturbed by ±20% and ±40% with the obtained Q-table unchanged. The performance of each area is evaluated
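The perturbation grid described above can be sketched as follows; the nominal time-constant values are typical textbook LFC numbers, assumed here for illustration only:

```python
# Robustness test grid: each time constant is perturbed by the stated
# percentages while the learned Q-table stays fixed.  Nominal values
# are illustrative, not the paper's exact parameters.
nominal = {"Th": 0.08, "Tt": 0.3, "Tr": 10.0, "Tps": 20.0}
perturbations = [-0.4, -0.2, 0.0, 0.2, 0.4]      # -40% ... +40%

cases = [{name: tau * (1 + p) for name, tau in nominal.items()}
         for p in perturbations]
# Each case would then be simulated with the unchanged Q-table and its
# frequency response compared against the nominal run.
```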
Conclusion
In power systems, designing appropriate controllers to reduce the impact of load disturbances is a significant research problem. The LADRC controller is applied to this kind of system because of its good disturbance rejection and its independence from the system model, but it involves the difficulty of choosing parameters. Reinforcement learning is a model-free method that can solve sequential decision problems, which can be used to solve
Declaration of Competing Interest
We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.
CRediT authorship contribution statement
Yuemin Zheng: Conceptualization, Writing - original draft, Software. Zengqiang Chen: Validation, Resources, Writing - review & editing. Zhaoyang Huang: Data curation, Writing - original draft. Mingwei Sun: Supervision, Validation. Qinglin Sun: Supervision, Writing - review & editing.
Acknowledgement
This work was funded by the National Natural Science Foundation of China under Grant nos. 61973175 and 61973172, and the Key Technologies R&D Program of Tianjin under Grant no. 19JCZDJC32800.
Yue-min Zheng was born in 1996. She received the B.Sc. degree from Shijiazhuang Tiedao University, Shijiazhuang, China, in 2018, and she is currently a graduate student at Nankai University, Tianjin, China. Her current research interests include Active Disturbance Rejection Control and reinforcement learning.
References (37)
- et al., Robust analysis and design of load frequency controller for power systems, Electr. Power Syst. Res. (2009)
- et al., Load frequency controllers considering renewable energy integration in power system, Energy Rep. (2019)
- et al., Energy management of hybrid electric vehicles: a review of energy optimization of fuel cell hybrid power system based on genetic algorithm, Energy Convers. Manag. (2020)
- et al., Hierarchical model predictive control of wind farm with energy storage system for frequency regulation during black-start, Int. J. Electr. Power Energy Syst. (2020)
- et al., Robust distributed MPC for load frequency control of uncertain power systems, Control Eng. Pract. (2016)
- et al., Adaptive decentralized load frequency control of multi-area power systems, Int. J. Electr. Power Energy Syst. (2005)
- et al., Diving control of autonomous underwater vehicle based on improved active disturbance rejection control approach, Neurocomputing (2016)
- et al., Obstacle avoidance and active disturbance rejection control for a quadrotor, Neurocomputing (2016)
- et al., Path planning of multi-agent systems in unknown environment with neural kernel smoothing and reinforcement learning, Neurocomputing (2017)
- Tuning of PID load frequency controller for power systems, Energy Convers. Manag. (2009)
- Impact of energy storage system on load frequency control for diverse sources of interconnected power system in deregulated power environment, Electr. Power Energy Syst.
- Reduced-order observer based fault estimation and fault-tolerant control for switched stochastic systems with actuator and sensor faults, ISA Trans.
- Load frequency control in interconnected power system using multi-objective PID controller, 2018 IEEE Conference on Soft Computing in Industrial Applications (SMCia/08)
- Optimization of PID controller parameters using PSO for two area load frequency control, IAES Int. J. Robot. Autom.
- Comparison of automatic load frequency control in two area power systems using PSO algorithm based PID controller and conventional PID controller, J. Phys.
- Implementation of fuzzy-PID controller with demand response control to LFC model in real-time using LabVIEW, Int. J. Fuzzy Comput. Model.
- Optimal design of fuzzy-PID controller for deregulated LFC of multi-area power system via mine blast algorithm, Neural Comput. Appl.
- Multi-verse optimization algorithm for LFC of power system with imposed nonlinearities using three-degree-of-freedom PID controller, Iran. J. Sci. Technol. Trans. Electr. Eng.
Zeng-Qiang Chen was born in 1964. He received the B.S., M.E. and Ph.D. degrees from Nankai University, in 1987, 1990, and 1997, respectively. He is currently a professor of control theory and engineering of Nankai University, and deputy director of Institute of Robotics and Information Automation. His current research interests include intelligent predictive control, chaotic systems and complex dynamic network, and multi-agent system control.
Zhao-yang Huang was born in 1995. He received the B.Sc. degree in automation from the Nankai University, Tianjin, China, in 2018 and he is currently a master candidate of Nankai University, Tianjin, China. His current research interests are intelligent algorithms and intelligent control.
Ming-wei Sun was born in 1972. He received the Ph.D. degree from the Department of Computer and Systems Science, Nankai University, Tianjin, China, in 2000. From 2000 to 2008, he was a Flight Control Engineer with the Beijing Electro-mechanical Engineering Research Institute, Beijing, China. Since 2009, he has been with Nankai University, where he is currently a Professor. His research interests include flight control, guidance, model predictive control, active disturbance rejection control, and nonlinear optimization.
Qing-lin Sun received the B.E. and M.E. degrees in control theory and control engineering from Tianjin University, Tianjin, China, in 1985 and 1990, respectively, and the Ph.D. degree in control science and engineering from Nankai University, Tianjin, China, in 2003. He is currently a Professor in the Intelligence Predictive Adaptive Control Laboratory of Nankai University and associate dean of College of Artificial Intelligence. His research interests include self-adaptive control, modeling and control of flexible spacecraft, and embedded control systems.