Real-time energy purchase optimization for a storage-integrated photovoltaic system by deep reinforcement learning
Control Engineering Practice (IF 4.9) Pub Date: 2021-01-01, DOI: 10.1016/j.conengprac.2020.104598
Waldemar Kolodziejczyk , Izabela Zoltowska , Pawel Cichosz

Abstract The objective of this article is to minimize the cost of energy purchased on a real-time basis for a storage-integrated photovoltaic (PV) system installed in a microgrid. This is a complex task under non-linear storage charging/discharging characteristics and uncertain solar generation, demand, and market prices. It requires a proper tradeoff between storing too much and too little energy in the battery: in the former case, future excess PV energy is lost; in the latter, demand is exposed to future high electricity prices. We propose a reinforcement learning approach to deal with the non-stationary environment and non-linear storage characteristics. To make this approach applicable, a novel formulation of the decision problem is presented, which focuses on optimizing grid energy purchases rather than on direct storage control. This limits the complexity of the state and action spaces, making it possible to achieve satisfactory learning speed and avoid stability issues. The Q-learning algorithm, combined with a dense deep neural network for function representation, is then used to learn an optimal decision policy. The algorithm incorporates enhancements found by prior work to improve learning speed and stability, such as experience replay, a target network, and an increasing discount factor. Extensive simulations performed on real data confirm that our approach is effective and outperforms rule-based heuristics.
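The learning setup the abstract describes can be sketched as follows. This is a minimal illustrative toy, not the authors' actual model: the environment, state layout (hour, price, PV output, battery level), discretized purchase actions, and all constants are assumptions made for the sketch. It does, however, include the three named enhancements: experience replay, a target network, and a discount factor that increases over training.

```python
import random
from collections import deque

import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS, STATE_DIM, HIDDEN = 5, 4, 32   # purchase levels 0..100% of demand

def init_net():
    # one dense hidden layer, as a stand-in for the paper's dense deep network
    return {"W1": rng.normal(0, 0.1, (STATE_DIM, HIDDEN)), "b1": np.zeros(HIDDEN),
            "W2": rng.normal(0, 0.1, (HIDDEN, N_ACTIONS)), "b2": np.zeros(N_ACTIONS)}

def q_values(net, s):
    h = np.maximum(0.0, s @ net["W1"] + net["b1"])       # ReLU hidden layer
    return h @ net["W2"] + net["b2"]

def sgd_step(net, target_net, batch, gamma, lr=1e-2):
    for s, a, r, s2, done in batch:
        # TD target from the frozen target network (stability enhancement)
        y = r if done else r + gamma * np.max(q_values(target_net, s2))
        h = np.maximum(0.0, s @ net["W1"] + net["b1"])
        err = (h @ net["W2"] + net["b2"])[a] - y
        dh = err * net["W2"][:, a] * (h > 0)             # backprop through ReLU
        net["W2"][:, a] -= lr * err * h
        net["b2"][a]    -= lr * err
        net["W1"]       -= lr * np.outer(s, dh)
        net["b1"]       -= lr * dh

def toy_step(state, action):
    """Hypothetical microgrid hour with demand fixed at 1.0 energy unit."""
    hour, price, pv, batt = state
    purchase = action / (N_ACTIONS - 1)                  # fraction of demand bought
    surplus = pv + purchase - 1.0                        # + charges, - discharges
    new_batt = float(np.clip(batt + surplus, 0.0, 1.0))
    unmet = max(0.0, -surplus - batt)                    # demand the battery missed
    reward = -price * purchase - 5.0 * unmet             # purchase cost + shortage penalty
    next_state = np.array([(hour + 1 / 24) % 1.0,
                           rng.uniform(0.2, 1.0),        # next price (random toy values)
                           rng.uniform(0.0, 0.8),        # next PV output
                           new_batt])
    return next_state, reward

net, target_net = init_net(), init_net()
buffer = deque(maxlen=1000)                              # experience replay memory
state = np.array([0.0, 0.5, 0.3, 0.5])
for t in range(500):
    eps = max(0.05, 1.0 - t / 300)                       # decaying exploration rate
    gamma = 0.5 + 0.49 * min(1.0, t / 400)               # increasing discount factor
    a = int(rng.integers(N_ACTIONS)) if rng.random() < eps \
        else int(np.argmax(q_values(net, state)))
    nxt, r = toy_step(state, a)
    buffer.append((state, a, r, nxt, False))
    if len(buffer) >= 32:
        sgd_step(net, target_net, random.sample(list(buffer), 32), gamma)
    if t % 50 == 0:                                      # periodic target-network sync
        target_net = {k: v.copy() for k, v in net.items()}
    state = nxt

print(np.round(q_values(net, state), 3))                 # Q-values for the final state
```

The key design point mirrored from the abstract is that the action is the grid purchase quantity, not a direct battery charge/discharge command; the storage dynamics are folded into the environment, which keeps the action space small.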

Updated: 2021-01-01