Reinforcement learning-based differential evolution for parameters extraction of photovoltaic models
Energy Reports (IF 4.7), Pub Date: 2021-02-08, DOI: 10.1016/j.egyr.2021.01.096
Zhenzhen Hu, Wenyin Gong, Shuijia Li

Accurately identifying and optimizing the parameters of photovoltaic (PV) models is a pressing problem, and many algorithms have been proposed for parameter extraction of different PV models. However, the performance of many optimization algorithms depends strongly on their own control parameters, and a single parameter setting is not suitable for every model. In recent years, reinforcement learning has achieved competitive results on problems where a strategy is learned to maximize returns through interaction with an environment. This paper therefore proposes a new algorithm that combines differential evolution with reinforcement learning. Specifically, during the iterative process the fitness function value is evaluated to determine the reward for each parameter-adjusting action, and the parameter values are adjusted through reinforcement learning to obtain the algorithm parameters best suited to the environment model. The performance of the proposed approach is verified by extracting the parameters of the single diode model, the double diode model, and PV modules. The simulation results (root mean square error) are 9.8602E−04 for the single diode model, 9.8248E−04 for the double diode model, 2.4251E−03 for the Photowatt-PWP201, 1.7298E−03 for the STM6-40/36, and 1.6601E−02 for the STP6-120/36. Together they show that the algorithm has better accuracy and robustness than other advanced algorithms.
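The abstract does not spell out the learning rule, the states, or the actions, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes tabular Q-learning, a two-state environment (the last generation improved the best fitness or it did not), three actions that scale the DE control parameters `F` and `CR`, and a reward of +1 or −1 derived from the change in best fitness. The `sphere` function is a toy stand-in objective; in a PV setting it would be replaced by the RMSE between measured and modelled current. All names and numeric settings here are hypothetical.

```python
import random


def sphere(x):
    """Toy stand-in objective (NOT a PV model): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def rl_de(objective, bounds, pop_size=20, generations=200, seed=0):
    """Minimize `objective` with DE/rand/1/bin while a Q-learning agent adapts F and CR."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Assumed action set: scale F and CR down, keep them, or scale them up.
    actions = [0.9, 1.0, 1.1]
    # Assumed two-state environment: 0 = last generation stagnated, 1 = it improved.
    q = [[0.0] * len(actions) for _ in range(2)]
    alpha, gamma, epsilon = 0.1, 0.9, 0.1  # assumed Q-learning settings
    F, CR = 0.5, 0.9                       # common DE defaults, not from the paper
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [objective(x) for x in pop]
    state, history = 0, [min(fit)]

    for _ in range(generations):
        # Epsilon-greedy choice of a parameter-adjusting action.
        if rng.random() < epsilon:
            a = rng.randrange(len(actions))
        else:
            a = max(range(len(actions)), key=lambda k: q[state][k])
        F = min(1.0, max(0.1, F * actions[a]))
        CR = min(1.0, max(0.1, CR * actions[a]))

        new_pop, new_fit = [], []
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            r1, r2, r3 = rng.sample(others, 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated component
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == j_rand or rng.random() < CR:
                    v = pop[r1][d] + F * (pop[r2][d] - pop[r3][d])
                    trial.append(min(hi, max(lo, v)))  # clip to the search bounds
                else:
                    trial.append(pop[i][d])
            f_trial = objective(trial)
            # Greedy one-to-one selection keeps the better of parent and trial.
            if f_trial <= fit[i]:
                new_pop.append(trial)
                new_fit.append(f_trial)
            else:
                new_pop.append(pop[i])
                new_fit.append(fit[i])

        # Reward comes from the fitness change caused by the chosen action.
        improved = min(new_fit) < min(fit)
        reward = 1.0 if improved else -1.0
        next_state = 1 if improved else 0
        # Standard Q-learning update.
        q[state][a] += alpha * (reward + gamma * max(q[next_state]) - q[state][a])
        pop, fit, state = new_pop, new_fit, next_state
        history.append(min(fit))

    best = min(range(pop_size), key=lambda i: fit[i])
    return pop[best], fit[best], history
```

Calling `rl_de(sphere, [(-5.0, 5.0)] * 3)` returns the best vector found, its objective value, and the best-so-far value per generation. Because selection is greedy, that history never increases. The reward and state definitions are deliberately crude; a finer-grained reward (for example, proportional to the relative fitness improvement) would be an equally plausible reading of the abstract.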

Updated: 2021-02-08