Real-time energy purchase optimization for a storage-integrated photovoltaic system by deep reinforcement learning
Introduction
We are on the verge of dramatic technological and cultural changes, caused by the shift from coal to renewable energy sources. With the development of smart grids, new control methods are required to support power generation and storage so as to meet energy demand. However, control decision-making in smart grids is more difficult than in traditional power systems, because the operation of a high-renewables system is associated with more uncertainty (Zhang et al., 2018). One way to mitigate the unpredictability of renewables is the application of energy storage technologies in locally isolated microgrid areas. Still, the optimal operation of storage-integrated systems remains a challenge, given non-linear storage charging/discharging characteristics and uncertain operating conditions (Chauhan & Saini, 2014).
Even though a storage system can help manage the stochastic behavior of renewables, renewable energy may be insufficient for quite long periods, such as cloudy or winter days in the case of solar photovoltaic (PV) systems. A microgrid may therefore be unable to operate in stand-alone mode and must be supported by a grid energy supply to cover the critical load with conventional power. Real-time electricity prices are most likely to apply in such a case, as argued by many economists (Dufo-López, 2015). The objective of optimal storage operation is then to minimize the cost of energy purchased from the grid. The energy management processes under consideration are nonlinear, stochastic, and multi-period. The overall cost feedback is delayed relative to individual purchase decisions: buying too little now may force additional purchases later, and buying too much now may forgo savings later when prices drop. This creates challenges that can hardly be addressed with simple ad hoc control strategies (Iovine et al., 2019).
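The delayed cost feedback described above can be made concrete with a toy two-period example (our own illustration; the prices, demand figures, and the lossless-storage assumption are not taken from the article):

```python
# Toy two-period illustration of why purchase decisions have delayed cost
# feedback. Demand is 2 kWh per period; energy bought in period 1 can be
# stored (assumed losslessly here) for use in period 2.

prices = [0.10, 0.40]      # assumed grid prices per kWh in periods 1 and 2
demand = [2.0, 2.0]        # assumed load per period

# Myopic policy: buy exactly what each period needs, at that period's price.
myopic_cost = sum(p * d for p, d in zip(prices, demand))

# Anticipative policy: buy all 4 kWh in the cheap period and store the surplus.
anticipative_cost = prices[0] * sum(demand)

print(myopic_cost, anticipative_cost)  # 1.0 vs 0.4
```

The myopic policy pays 1.0 while the anticipative one pays 0.4, yet the two policies are indistinguishable until the second-period price arrives — exactly the kind of delayed evaluative feedback that reinforcement learning is designed to handle.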
The hybrid photovoltaic-electrical energy storage technology is the most popular installation in leading markets, as reviewed by Liu et al. (2019). Optimization of hybrid PV-storage systems has been extensively investigated to improve system performance (Arani et al., 2019). Ampatzis et al. (2017) explain how such systems can be used in a cluster managed by an aggregator to obtain demand response.
Numerous methods have been proposed to determine cost-minimizing real-time control of an integrated PV-storage system based on forecasts of load, renewables, and prices. These methods fall into the categories of dynamic programming (Li & Danzer, 2014), convex optimization (Wang et al., 2015), stochastic optimization (Conte et al., 2017), and the optimization of Lagrange multipliers (Nge et al., 2019). To reduce the dependency on forecasting accuracy, some researchers have proposed fuzzy logic controllers (Teo et al., 2018) or heuristic search methods, such as particle swarm optimization (Stoppato et al., 2014), refined by a genetic algorithm (Phan et al., 2018).
Reinforcement learning is an attractive paradigm for addressing stochastic optimal control problems. This approach, based on dynamic interactions and evaluative feedback, does not require forecasting models to be available. Some domain knowledge is usually still required, though, to properly design the learning control system, including its input and output representation as well as training information (Glavic et al., 2017).
Only a handful of articles use machine learning approaches to optimize energy storage control. One of the first studies employing adaptive dynamic programming, an approach closely related to reinforcement learning, is that of Wei et al. (2014), subsequently extended to account for PV generation (Wei et al., 2017). They determine the optimal battery charging/discharging/idle control law, which minimizes the total expense of power from the grid under the assumptions of a periodic residential load and electricity rate. The dynamic pricing demand response problem that takes the uncertainty of load demand into consideration was solved using reinforcement learning by Lu et al. (2018). Recently, Henri and Lu (2019) used a supervised machine learning approach to control several different energy storage devices.
Deep reinforcement learning, a combination of deep learning and reinforcement learning, has recently been gaining a lot of attention. However, its applications in power systems and smart grids are still scarce (Zhang et al., 2018). An application of deep Q-learning to real-time scheduling of energy-consuming resources was presented by Zhang et al. (2017). Lee and Choi (2019) apply the Q-learning algorithm to schedule the energy consumption of home appliances, including an energy storage system. To the best of our knowledge, no study has applied deep reinforcement learning to energy control and cost minimization for a complex, stochastic system such as the one considered in this article.
The core contribution of this work is the development of the architecture of a control system with an intelligent decision-making module using an advanced model-free learning algorithm. More specifically, the main achievements presented in the article are listed below.
- The application of deep reinforcement learning as a control method for a complex system with nonlinear, stochastic, and multi-period processes.
- A novel formulation of the energy storage control problem as making purchase decisions, which makes it possible to limit the size of the action space to a relatively small number of discrete actions and to reduce the dimensionality of the state space.
- Successful demonstrations in realistic simulation experiments, in which the proposed algorithm, with appropriately tuned parameters, exhibits good learning speed and stability, and achieves better control quality than human-designed heuristic rules in several different environment configurations.
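As a rough illustration of the purchase-decision formulation in the second contribution, the action space can be reduced to a few discrete purchase levels, with simple storage dynamics absorbing any surplus. The sketch below is our own minimal interpretation; the number of levels, the capacity, and the efficiency values are assumptions, not the article's actual model:

```python
# Illustrative sketch (not the authors' exact formulation): framing storage
# control as a discrete energy-purchase decision. Action i buys a fixed
# fraction of an assumed maximum purchase; the battery absorbs any surplus
# over the load, subject to capacity and a round-trip efficiency loss.

N_ACTIONS = 5            # assumed number of discrete purchase levels
MAX_PURCHASE_KWH = 10.0  # assumed upper bound on a single purchase

def purchase_amount(action: int) -> float:
    """Map a discrete action index (0..N_ACTIONS-1) to an energy quantity."""
    return action / (N_ACTIONS - 1) * MAX_PURCHASE_KWH

def step_storage(soc, purchased, pv_kwh, load_kwh,
                 capacity=20.0, eff=0.9):
    """One step of a simplified storage balance.

    A surplus charges the battery (with efficiency loss); a deficit
    discharges it, clipped at the available charge.
    """
    net = purchased + pv_kwh - load_kwh
    if net >= 0:
        soc = min(capacity, soc + eff * net)   # charge, losing (1 - eff)
        unserved = 0.0
    else:
        discharge = min(soc, -net / eff)       # energy drawn from battery
        soc -= discharge
        unserved = max(0.0, -net - eff * discharge)
    return soc, unserved
```

With only `N_ACTIONS` discrete purchase levels, the agent's output stays small regardless of how rich the environment state is, which is the dimensionality advantage the contribution refers to.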
The remainder of this article is organized as follows. The adopted problem formulation is presented in Section 2. Section 3 describes the proposed reinforcement learning approach to optimizing energy purchase decisions. The results of a realistic experimental case study are presented in Section 4. Section 5 summarizes the main findings of this work and outlines some promising continuation directions.
Section snippets
Problem formulation
The results presented in this article are based on a realistic simulation of a storage-integrated solar power system. The assumed system architecture matches standard microgrid infrastructure, and real insolation, energy consumption, and energy price data are used to perform the simulation.
Solution method
In the paradigm of reinforcement learning a learning agent learns to perform its task from interactions with its environment. At each time step it observes the current state of the environment and performs an action. Then it receives a reinforcement value, also called a reward, and a state transition takes place. State transitions and reinforcement values may, in general, be stochastic, and the agent knows neither their underlying distributions nor their expected values. The objective of the agent is to maximize the expected long-term cumulative reward.
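The interaction loop described here can be sketched with tabular Q-learning and ε-greedy exploration. This is a generic textbook sketch, not the article's deep Q-network; the environment interface (`reset`/`step`) and all hyperparameter values are illustrative assumptions:

```python
import random

def q_learning(env, n_actions, episodes=100,
               alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning over an env exposing reset() and step(action),
    where step returns (next_state, reward, done)."""
    q = {}  # maps (state, action) -> estimated long-term value
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if random.random() < epsilon:
                action = random.randrange(n_actions)
            else:
                action = max(range(n_actions),
                             key=lambda a: q.get((state, a), 0.0))
            next_state, reward, done = env.step(action)
            # Temporal-difference update toward the one-step target.
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in range(n_actions))
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
    return q
```

A deep variant replaces the `q` dictionary with a neural network approximating the value function, which is what makes large or continuous state spaces tractable.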
Case study
To verify the effectiveness and performance of the proposed Q-learning approach, several test cases and comparisons were studied.
Conclusions
The combination of reinforcement learning with deep neural networks has become one of the most promising areas in the field of artificial intelligence. Successful applications to practical real-time control problems are still hard to find, though. This article managed to overcome difficulties anticipated by the theory and observed in prior experimental work, such as insufficient learning speed and poor stability. We believe the achieved successful performance can be primarily attributed to the proposed problem formulation and the careful tuning of the learning algorithm.
CRediT authorship contribution statement
Waldemar Kolodziejczyk: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing - original draft, Writing - review & editing. Izabela Zoltowska: Supervision, Research methodology, Writing - original draft, Writing - review & editing. Pawel Cichosz: Supervision, Research methodology, Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (53)
- A review on integrated renewable energy system based power generation for stand-alone applications: Configurations, storage options, sizing methodologies and control. Renewable & Sustainable Energy Reviews (2014).
- Day-ahead planning and real-time control of integrated PV-storage systems by stochastic optimization. IFAC-PapersOnLine (2017).
- Optimisation of size and control of grid-connected storage under real time electricity pricing conditions. Applied Energy (2015).
- Reinforcement learning for electric power system decision and control: past considerations and perspectives. IFAC-PapersOnLine (2017).
- Power management for a DC MicroGrid integrating renewables and storages. Control Engineering Practice (2019).
- Optimal charge control strategies for stationary photovoltaic battery systems. Journal of Power Sources (2014).
- Overview on hybrid solar photovoltaic-electrical energy storage technologies for power supply to buildings. Energy Conversion and Management (2019).
- A dynamic pricing demand response algorithm for smart grid: reinforcement learning approach. Applied Energy (2018).
- A real-time energy management system for smart grid integrated photovoltaic generation with battery storage. Renewable Energy (2019).
- Determination of optimal battery utilization to minimize operating costs for a grid-connected building with renewable energy sources. Energy Conversion and Management (2018).
- A PSO (particle swarm optimization)-based model for the optimal management of a small PV (photovoltaic)-pump hydro energy storage in a rural dry area. Energy.
- State-of-Charge-based droop control for stand-alone AC supply systems with distributed energy storage. Energy Conversion and Management.
- Robust optimisation for deciding on real-time flexibility of storage-integrated photovoltaic units controlled by intelligent software agents. IET Renewable Power Generation.
- Strategy learning with multilayer connectionist representations.
- Review on energy storage systems control methods in microgrids. International Journal of Electrical Power & Energy Systems.
- Deep reinforcement learning: A brief survey. IEEE Signal Processing Magazine.
- Residual algorithms: Reinforcement learning with function approximation.
- On the computational economics of reinforcement learning.
- Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics.
- Dynamic programming.
- Generalization in reinforcement learning: Safely approximating the value function.
- Improving the exploration strategy in bandit algorithms.
- Truncating temporal differences: On the efficient implementation of TD(λ) for reinforcement learning. Journal of Artificial Intelligence Research.
- An analysis of experience replay in temporal difference learning. Cybernetics and Systems.
- Improving elevator performance using reinforcement learning.
- How to discount deep reinforcement learning: Towards new dynamic strategies.