Abstract
This paper develops a deep reinforcement learning (DRL) framework for the intelligent operation of cascaded hydropower reservoirs considering inflow forecasts, addressing two key problems: large discrete action spaces and the uncertainty of inflow forecasts. A DRL framework is first developed based on a newly defined knowledge-sample form and a deep Q-network (DQN). An aggregation-disaggregation model is then used to reduce the multi-dimensional state and action spaces of the cascaded reservoirs. Three DRL models are subsequently developed to evaluate the performance of the newly defined decision value functions and the modified decision-action selection approach. The methodologies are tested on the Hun River cascaded hydropower reservoir system in China. The results show that the aggregation-disaggregation model effectively reduces the state and action dimensions, yielding a simpler model structure and higher learning efficiency. Bayesian theory in the decision-action selection approach helps address the uncertainty of inflow forecasts and reduces spillage during the wet season. The proposed DRL models outperform the comparison model (stochastic dynamic programming) in terms of annual hydropower generation and system reliability. This study suggests that DRL has the potential to be implemented in practice to derive optimal operation strategies.
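The core idea — learning a value function over discretized storage states and release actions for an aggregated reservoir — can be illustrated with a minimal sketch. This is a hypothetical, heavily simplified tabular Q-learning toy, not the paper's DQN: the reservoir dynamics, inflow, reward (generation benefit minus a spill penalty), and all parameter values below are invented for illustration only.

```python
# Toy sketch: tabular Q-learning for one *aggregated* reservoir.
# (The paper uses a deep Q-network with knowledge samples and an
# aggregation-disaggregation model; everything here is illustrative.)
import random

N_STORAGE = 5      # discretized storage levels (the "state")
N_RELEASE = 3      # discretized release decisions (the "action")
INFLOW = 1         # deterministic toy inflow per period
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2

def step(storage, release):
    """Toy dynamics: reward ~ generation (release) minus a spill penalty."""
    release = min(release, storage)            # cannot release more than stored
    new_storage = storage - release + INFLOW
    spill = max(0, new_storage - (N_STORAGE - 1))
    new_storage = min(new_storage, N_STORAGE - 1)
    reward = release - 2 * spill
    return new_storage, reward

def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * N_RELEASE for _ in range(N_STORAGE)]
    for _ in range(episodes):
        s = rng.randrange(N_STORAGE)
        for _ in range(24):                    # one episode = 24 periods
            if rng.random() < EPS:             # epsilon-greedy exploration
                a = rng.randrange(N_RELEASE)
            else:
                a = max(range(N_RELEASE), key=lambda x: Q[s][x])
            s2, r = step(s, a)
            # Standard Q-learning update toward the bootstrapped target
            Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = train()
# With a full reservoir, holding back water forces spills, so the learned
# policy should prefer releasing over doing nothing.
print(Q[N_STORAGE - 1])
```

A DQN replaces the table `Q` with a neural network so that continuous or high-dimensional states can be handled; the aggregation step keeps the action space small enough for such value-based methods.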
Acknowledgments
This research is supported by the National Natural Science Foundation of China (Grant Nos. 51609025, 51709108, and 51709177) and the Chongqing technology innovation and application demonstration project (cstc2018jscx-msybX0274). Special thanks to the Hun River cascade hydropower reservoirs development company, Ltd. for collecting the data (http://www.hydroshare.org/resource/93f1f580de88403a8c52d2b3238297eb).
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Xu, W., Zhang, X., Peng, A. et al. Deep Reinforcement Learning for Cascaded Hydropower Reservoirs Considering Inflow Forecasts. Water Resour Manage 34, 3003–3018 (2020). https://doi.org/10.1007/s11269-020-02600-w