Bi-level stochastic real-time pricing model in multi-energy generation system: A reinforcement learning approach
Introduction
Challenges such as continued growth in energy demand, increasing carbon emissions, and aging infrastructure are driving the traditional electricity system toward one that is more responsive, efficient, reliable, and economical. The smart grid is widely regarded as the next-generation architecture for electricity generation, transmission, and distribution, incorporating modern information and communication technology and smart metering infrastructure [1–3]. Demand side management (DSM) is one of the most important features of the smart grid; it mainly aims to reduce the peak-to-average ratio (PAR) and to balance power supply and demand [4–7]. Pricing is one of the most effective DSM tools because it encourages users to consume energy more carefully and wisely. Among pricing schemes, real-time pricing (RTP) is the most direct and efficient approach [8,9].
As a time-dependent pricing scheme, RTP can effectively guide users to adjust their inherent consumption patterns in response to varying electricity price signals [10–12]. It has a profound impact on user behavior and on the operation and management of the power grid. An efficient RTP scheme should account for both the supply and demand sides [13,14].
To ensure that both the supply and demand sides benefit, yielding a win-win outcome for smart grids, this work proposes a novel RTP strategy. In contrast to existing studies, this paper is the first to design an RTP strategy for a power system that integrates multi-energy generation on the supply side. Without loss of generality, small-scale distributed energy generation and power storage devices for users are also considered. The cost of stochastic generation and the demand of users differ from those of deterministic generation, which brings new challenges to pricing. To solve this problem, the concept of two electricity prices is proposed for different power sources. From the perspective of social fairness and carbon emission reduction, we formulate a bilevel stochastic model for RTP within a Markov decision process (MDP) framework. In the upper level, the multi-energy generation is managed on the supply side by the leader, namely the power market scheduling center (PMSC). The PMSC plays a dominant role in the grid and determines the electricity prices and the optimal amount of power to supply. In the lower level, each user, as a follower, independently decides on its optimal energy allocation and distributed energy production given the corresponding price information.
To solve this MDP model, given the difficulty of collecting exact information from all users in a centralized way and of obtaining the transition probabilities in practice, we use reinforcement learning (RL) to formulate a novel distributed online multi-agent learning algorithm. The main contributions of this paper are summarized as follows:
•A novel RTP strategy is proposed, for the first time, for a comprehensive multi-energy system that integrates stochastic generation on the supply side.
•Focusing on the interaction between supply and demand, a bilevel RTP model in the framework of MDP is formulated for the multi-energy system. In this model, because the power supply is uncertain, the cost function of stochastic generation is expressed in a piecewise linear form, which differs from that of thermal generation.
•RL is used to solve the MDP model adaptively without acquiring the transition probabilities. A distributed online multi-agent learning algorithm is proposed to obtain the optimal real-time prices through exploration and exploitation.
•Simulation results demonstrate that the proposed method and algorithm perform well in peak shaving and valley filling, and that both the supply and demand sides benefit.
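To make the idea of learning prices without transition probabilities concrete, the following is a minimal sketch of single-agent tabular Q-learning with epsilon-greedy exploration. It is not the paper's distributed multi-agent algorithm. Every name and number in it (`PRICES`, `BASE_DEMAND`, `P_PEAK_NEXT`, the linear demand response, and the cost breakpoints in `generation_cost`) is a hypothetical choice made purely for illustration.

```python
import random

# Hypothetical toy setup: none of these numbers come from the paper.
PRICES = [0.2, 0.5, 0.8]      # candidate real-time prices (the actions)
BASE_DEMAND = [4.0, 10.0]     # baseline demand in state 0 (off-peak) and 1 (peak)
P_PEAK_NEXT = [0.3, 0.7]      # chance the next slot is peak; hidden from the learner


def generation_cost(quantity):
    """Assumed convex piecewise-linear cost: cheap up to 3 units, steeper beyond."""
    if quantity <= 3.0:
        return 0.1 * quantity
    return 0.3 + 0.6 * (quantity - 3.0)


def reward(state, action):
    """Supplier profit for one slot under an assumed linear demand response."""
    price = PRICES[action]
    quantity = BASE_DEMAND[state] * (1.0 - price)
    return price * quantity - generation_cost(quantity)


def step(state, action, rng):
    """Environment step: the learner only observes the sampled outcome."""
    next_state = 1 if rng.random() < P_PEAK_NEXT[state] else 0
    return reward(state, action), next_state


def q_learning(steps=20000, alpha=0.05, gamma=0.5, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(PRICES) for _ in BASE_DEMAND]
    state = 0
    for _ in range(steps):
        # Exploration with probability epsilon, exploitation otherwise.
        if rng.random() < epsilon:
            action = rng.randrange(len(PRICES))
        else:
            action = max(range(len(PRICES)), key=lambda a: q[state][a])
        r, next_state = step(state, action, rng)
        # Temporal-difference update; P_PEAK_NEXT is never read by the learner.
        target = r + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


q_table = q_learning()
greedy_policy = [max(range(len(PRICES)), key=lambda a: row[a]) for row in q_table]
print([PRICES[a] for a in greedy_policy])  # learned price for each demand state
```

The learner touches the environment only through `step`, so the transition probabilities stay hidden from it, which is the property the third contribution relies on. In this toy the state transitions do not depend on the price, so the best price in each state is simply the one with the highest one-slot profit; a realistic model would not have that shortcut.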
Section snippets
Literature review
A number of studies have been devoted to the design of RTP mechanisms that incorporate the interests of both the supply and demand sides of the smart grid. Social welfare maximization is an effective method for RTP, which not only helps to improve social welfare but also aims to keep supply and demand in balance [15–18]. Samadi et al. [15] initially proposed a social welfare maximization model of RTP, and formulated a distributed algorithm to solve this model by using dual
System model
Consider a hierarchical smart grid with multi-energy generation, which is composed of a fossil-fuel based thermal power plant, a wind/photovoltaic (PV) plant, and multiple users. The users form a set, and the time cycle of the operation is divided into time slots. The plants are managed by the PMSC, which determines the electricity prices (selling and buying-back prices) and the optimal amount of power to supply. Users are equipped with energy storage systems
Problem description and mathematical formulation
In this section, from the perspective of social fairness and with a focus on the information exchange in the hierarchical grid with multi-energy generation, we formulate a bilevel RTP model in the framework of MDP, motivated by the Markov properties of power consumption, electricity prices, and renewable generation [35,37]. Each level of this model is defined by the actions and profit of the decision maker, the states of the system, and the transition probability.
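For readers less familiar with the MDP framework, each level can be read against the generic textbook form below. The symbols ($\mathcal{S}$, $\mathcal{A}$, $P$, $r$, $\gamma$, $Q^{*}$) are standard notation assumed here for illustration and are not the paper's own definitions; identifying the decision maker's profit with the reward $r$ is likewise an assumption.

```latex
% Generic MDP for one level (assumed notation, not the paper's):
\[
  \mathcal{M} = (\mathcal{S}, \mathcal{A}, P, r, \gamma), \qquad
  P(s' \mid s, a) = \Pr\{\, s_{t+1} = s' \mid s_t = s,\ a_t = a \,\}
\]
% Bellman optimality equation for the action-value function:
\[
  Q^{*}(s, a) = r(s, a)
    + \gamma \sum_{s' \in \mathcal{S}} P(s' \mid s, a)\,
      \max_{a' \in \mathcal{A}} Q^{*}(s', a')
\]
```

Evaluating the sum requires $P$ explicitly, which is why approaches that estimate $Q^{*}$ from sampled transitions are attractive when the transition probabilities cannot be obtained in practice.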
Algorithm design
Based on trends in the smart grid and carbon emissions trading, we construct a bilevel stochastic RTP model. This bilevel model is a discrete, nonconvex programming problem whose objective, Eq. (24), includes a nonsmooth function. Thus, classical optimization techniques may not be effective in solving it. Beyond that, techniques that use the MDP framework to solve stochastic models, such as approximate dynamic programming or direct strategy search, require knowledge of the transition
Simulation results
This section presents numerical simulation results to evaluate the performance of our RTP approach with the distributed online algorithm.
Conclusion
This paper designs an RTP strategy for a smart grid with a comprehensive multi-energy generation system which accommodates both the small-scale distributed generation on the demand side and the stochastic generation on the supply side. Focusing on the interaction between the power plants and users, a bilevel stochastic model for RTP in the framework of MDP is formulated. In the model, the classification of appliances, depreciation of storage capacity and carbon emission trading are also taken
Credit author statement
Li Zhang: Conceptualization, Methodology, Software, Validation, Formal analysis, Data curation, Writing - original draft, Visualization, Investigation. Yan Gao: Project administration, Funding acquisition, Conceptualization, Methodology, Formal analysis, Writing - original draft, Investigation. Hongbo Zhu: Conceptualization, Methodology, Formal analysis, Investigation. Li Tao: Conceptualization, Formal analysis, Data curation, Investigation.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 72071130), Social Science Foundation of Jiangsu (No. 19GLB022), and Natural Science Foundation of Huai'an (No. HABZ202019). This work is also financially supported by the open fund for Jiangsu Smart Factory Engineering Research Center (Huaiyin Institute of Technology).
References (51)
Closed loop elastic demand control by dynamic energy pricing in smart grids. Energy (2019)
Optimal pricing in time of use demand response by integrating with dynamic economic dispatch problem. Energy (2016)
Demand Side Management using a multi-criteria ε-constraint based exact approach. Expert Syst Appl (2018)
A real-time demand response market through a repeated incomplete-information game. Energy (2018)
Optimal real time cost-benefit based demand response with intermittent resources. Energy (2015)
Modified PSO algorithm for real-time energy management in grid-connected microgrids. Renew Energy (2019)
A novel multitype-users welfare equilibrium based real-time pricing in smart grid. Future Generat Comput Syst (2020)
Game-theory based dynamic pricing strategies for demand side management in smart grids. Energy (2017)
Real-time pricing considering different type of smart home appliances based on Markov decision process. Int J Electr Power Energy Syst (2019)
Optimal scheduling of the RIES considering time-based demand response programs with energy price. Energy (2018)