Robust and automatic data cleansing method for short-term load forecasting of distribution feeders
Applied Energy (IF 11.2), Pub Date: 2020-01-07, DOI: 10.1016/j.apenergy.2019.114405
Nathalie Huyghues-Beaufond, Simon Tindemans, Paola Falugi, Mingyang Sun, Goran Strbac

Distribution networks are undergoing fundamental changes at the medium-voltage level. To support a growing range of planning and control decisions, large numbers of short-term load forecasts are now needed. Data-driven modelling of medium-voltage feeders can be affected by (1) data quality issues, namely large gross errors and missing observations, and (2) structural breaks in the data caused by occasional network reconfiguration and load transfers. The present work investigates and reports on the effects of advanced data cleansing techniques on forecast accuracy. A hybrid framework to detect and remove outliers in large datasets is proposed. This automatic procedure combines the Tukey labelling rule with the binary segmentation algorithm to cleanse data more efficiently, and it is fast and easy to implement. Various approaches to missing value imputation are investigated, including unconditional mean, Hot Deck via k-nearest neighbour (k-NN), and Kalman smoothing. The automatic detection and removal of outliers is combined with each of these imputation methods to cleanse the time series of 342 medium-voltage feeders. A nested rolling-origin validation technique is used to evaluate the feed-forward deep neural network models. The proposed data cleansing framework efficiently removes outliers from the data and improves forecast accuracy. Hot Deck (k-NN) imputation is found to perform best in balancing the bias-variance trade-off for short-term forecasting.
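The cleansing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `tukey_outlier_mask` and `cleanse` are hypothetical, the structural breakpoints are assumed to be supplied by hand rather than found by binary segmentation, the fence multiplier `k=1.5` is the conventional default rather than a value taken from the paper, and unconditional mean imputation is used only because it is the simplest of the three methods listed.

```python
import numpy as np

def tukey_outlier_mask(values, k=1.5):
    """Flag points outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR].

    NaNs are ignored when computing the quartiles and are never flagged.
    """
    values = np.asarray(values, dtype=float)
    q1, q3 = np.nanpercentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

def cleanse(load, breakpoints=(), k=1.5):
    """Remove outliers segment by segment, then mean-impute the gaps.

    `breakpoints` are indices where the load level is assumed to shift
    (for example after a load transfer). Here they are given by the
    caller; the paper detects them automatically.
    """
    cleaned = np.asarray(load, dtype=float).copy()
    edges = [0, *breakpoints, len(cleaned)]
    for start, stop in zip(edges[:-1], edges[1:]):
        segment = cleaned[start:stop]  # a view, so edits modify `cleaned`
        segment[tukey_outlier_mask(segment, k)] = np.nan
        # Unconditional mean imputation within the segment.
        segment[np.isnan(segment)] = np.nanmean(segment)
    return cleaned

# Toy feeder series: a gross error (500), a missing value, and a level
# shift at index 6 standing in for a load transfer.
load = [10, 11, 10, 12, 11, 500, 50, 51, np.nan, 52, 50, 51]
cleaned = cleanse(load, breakpoints=[6])
```

Applying the fences within each segment matters because a level shift changes the quartiles: fences computed over the whole series would be far too wide to be useful for the low-load segment.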




Updated: 2020-01-08