Delayed Q-update: A novel credit assignment technique for deriving an optimal operation policy for the Grid-Connected Microgrid

arXiv - CS - Systems and Control. Pub Date: 2020-06-30, DOI: arxiv-2006.16659
Hyungjun Park, Daiki Min, Jong-hyun Ryu, Dong Gu Choi

A microgrid is a system that integrates distributed energy resources to supply electricity demand within defined electrical boundaries. This study proposes an approach for deriving a microgrid operation policy that supports sophisticated controls, built on a novel credit assignment technique, delayed Q-update. The technique addresses the delayed-effect property of the microgrid, which otherwise prevents learning agents from deriving a well-fitted policy under sophisticated controls. It tracks the history of the charging period and retroactively assigns an adjusted value to the energy storage system (ESS) charging control. As a result, the derived operation policy reflects the real effects of ESS operation, which supports the search for a near-optimal operation policy in a microgrid environment with sophisticated controls. To validate the technique, we simulate the operation policy on a real-world grid-connected microgrid system and demonstrate convergence to a near-optimal policy by comparing the performance measures of our policy with those of a benchmark policy and the optimal policy.
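
The abstract does not state the update rule, so the following is only a toy sketch of the general idea of tracking charging decisions and crediting them retroactively. Everything in it is an assumption made for illustration: the class name `DelayedCreditLearner`, the FIFO `pending` history, the one-unit-per-step energy model, and the simplified running-average update (no bootstrapping term) are hypothetical and are not taken from the paper.

```python
from collections import defaultdict, deque


class DelayedCreditLearner:
    """Toy tabular learner that defers the value update for 'charge' actions.

    Assumption: each charge stores one unit of energy and each discharge
    uses one unit, so charge decisions can be matched first-in, first-out.
    """

    def __init__(self, alpha=0.5):
        self.alpha = alpha              # learning rate (assumed value)
        self.q = defaultdict(float)     # (state, action) -> value estimate
        self.pending = deque()          # history of uncredited charge decisions

    def _update(self, state, action, target):
        # Simplified running-average update toward the target value.
        key = (state, action)
        self.q[key] += self.alpha * (target - self.q[key])

    def step(self, state, action, reward):
        if action == "charge":
            # The purchase cost is known now, but the benefit is not,
            # so record the decision instead of updating immediately.
            self.pending.append((state, reward))
            return
        if action == "discharge" and self.pending:
            # Retroactively credit the oldest charge decision with an
            # adjusted value: its own cost plus the saving realized now.
            charge_state, charge_reward = self.pending.popleft()
            self._update(charge_state, "charge", charge_reward + reward)
        # Every non-charge action is updated with its immediate reward.
        self._update(state, action, reward)


learner = DelayedCreditLearner(alpha=0.5)
learner.step("night", "charge", -2.0)    # pay 2.0 to store one unit
learner.step("peak", "discharge", 5.0)   # avoid a 5.0 purchase at peak
print(learner.q[("night", "charge")])    # 1.5
```

In this sketch the charge decision made at night is valued at the net effect of buying cheaply and using the energy at peak, rather than at its immediate cost alone. How the paper actually computes the adjusted value is not given in the abstract.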

Updated: 2020-10-21
