Dimension reduction for NILM classification based on principle component analysis

https://doi.org/10.1016/j.epsr.2020.106459

Highlights

  • This paper proposes PCA as an efficient dimension reduction method for NILM.

  • The method can be used with any NILM classification technique, various datasets and sample-rates.

  • The method is tested using the public dataset AMPds and a private dataset.

  • The results show that the run time is reduced while the accuracy is preserved, i.e. there is minimal loss of information.

  • The suggested PCA method may be useful in applications where run time is an important factor, so that complex NILM algorithms with a high-dimensional solution space cannot be used.

Abstract

Non-intrusive load monitoring (NILM) techniques estimate the consumption of individual appliances in a household or facility, based on readings of a centralized meter. NILM techniques usually improve when various power features and additional power quality parameters are included. However, adding power features increases time complexity, which is a disadvantage for real-time operation. Therefore, in this work we offer a process based on principal component analysis (PCA) which reduces the dimension of NILM power features. The suggested method can be used with any NILM classification technique, and shows good performance in terms of standard measures and time complexity when tested on popular datasets.

Introduction

In recent decades, smart grid technologies have attracted worldwide attention [1]. A recent advancement is the use of monitoring systems based on smart meters, which may help consumers manage their energy expenses by estimating the power consumption of each device in the system. Open access to such information can encourage energy-saving behavior, improved fault detection, better demand forecasting, energy incentives, and more [2].

A straightforward method for measuring the power consumption of individual devices in a system is to use smart appliances that are capable of reporting their power consumption. However, this method is complex and expensive. A more efficient method for estimating the consumption of individual devices is Non-Intrusive Load Monitoring (NILM). NILM methods use a single meter, which measures the total power consumption of a system, to estimate the power consumption of individual appliances. Since NILM was first introduced by Hart [3], many techniques have been explored. Several of these are based on factorial hidden Markov models [4], machine learning algorithms [5], deep neural networks [6], [7], the cross-entropy method [8], time series distance [9] and an optimal classifier [10]. Additional approaches for NILM can be found in the surveys [11], [12], [13]. A recent development is the increasing availability of public data sets, such as REDD [14], AMPds [15] and PLAID [16], that can serve as a reference for NILM algorithms.

Several NILM algorithms use active power as the only feature. However, many smart meters measure additional features such as reactive power, power factor (PF), total harmonic distortion (THD) and apparent power. These signals may be used for improved classification. One popular feature is the active-reactive power signal, which was also used in the first NILM technique suggested by Hart [3]. This signal is used for classification based on several approaches, such as a finite-state machine based on transients [17], Mixed-Integer Linear Programming [18], the wavelet transform [19], ZIP modeling [20] and additive factorial hidden Markov models [21]. If more features are used, the classification of individual appliances can be improved. For example, classification based on current harmonics features is presented in [22], [23]. Other power features include active-reactive power and THD [24], active-reactive power and harmonics [25], active-reactive power, THD and harmonics [26], active-reactive-apparent power, PF and THD [27], and more [28]. Another important feature is the V-I trajectory, which uses voltage and current waveforms. Some works utilizing this feature are [29], [30], [31].

While additional features might improve the accuracy of NILM algorithms, adding features is not always the best approach due to increased time complexity. Moreover, there is no guarantee that additional features improve the classification accuracy, since too many features might lead to over-fitting. To overcome these problems, some works suggest a feature selection approach, in which the goal is to find a simple model with a minimal number of features [32]. The earliest work on feature selection in NILM is [33], which uses a neural network. Later, [34] uses the short-time Fourier transform and the wavelet transform to select the features. More recent works that study feature selection are [35] and [36]. In [35] a recursive feature elimination process is used to identify the most effective feature set based on the PLAID dataset [16]. This approach eliminates features using heuristic methods, and has relatively high complexity. Work [36] implements a forward selection algorithm that selects features by analyzing fast transients, based on four different datasets. In addition, [37], [38] convert correlated features into non-correlated ones, in order to simplify the classification process. The data conversion is based on the principal component analysis (PCA) transform, which is used in these works for new data representation.

Feature selection methods will always lose information since, by definition, several features are removed from the dataset. Therefore, in this work we propose to use the PCA method for efficient dimension reduction for NILM classification. The main idea is to use PCA to reduce the dimension of the power feature vector while preserving information. This leads to reduced time complexity, which is critical for real-time operation. Note that, in comparison to [37], [38], which used PCA for new data representation, we use it for dimension reduction. Additional advantages of the proposed approach are that the PCA is applied to raw data and operates independently of the NILM classification algorithm and the dataset. The sampling rate is irrelevant to the PCA, so this method can be used with smart meters with various sampling intervals, ranging from milliseconds to minutes. The proposed method is tested on two classification algorithms using the public dataset AMPds [15], and a private dataset that was collected in a typical kitchen using the SATEC PM135 smart meter.

The paper is organized as follows. Section 2 outlines properties of NILM features. Section 3 explains how PCA can be used for dimension reduction in NILM. Section 4 presents simulation results based on the AMPds public dataset. Section 5 shows experimental and simulation results based on a private dataset collected by the SATEC PM135 smart meter. Finally, Section 6 concludes the paper.

Section snippets

One state and multi-state formulation

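Consider a house or facility with $m$ appliances ($i = 1, 2, \ldots, m$), each with $d$ power features $\psi_{i,1}, \psi_{i,2}, \ldots, \psi_{i,d} > 0$. The aggregated power features, $\Psi$, measured during sample time $n$ at the entry point of the facility are

$$\Psi[n] = \begin{bmatrix} \Psi_1[n] \\ \Psi_2[n] \\ \vdots \\ \Psi_d[n] \end{bmatrix} = \sum_{i=1}^{m} x_i[n] \begin{bmatrix} \psi_{i,1} \\ \psi_{i,2} \\ \vdots \\ \psi_{i,d} \end{bmatrix} + \begin{bmatrix} \epsilon_1[n] \\ \epsilon_2[n] \\ \vdots \\ \epsilon_d[n] \end{bmatrix},$$

where $x_i[n] \in \{0, 1\}$ denotes the On/Off status of appliance $i$, and $\epsilon[n]$ represents background noise due to low-power appliances. It is also assumed that $\psi_{i,v} \neq \psi_{b,v}$ for at least one power feature $v$, for $i \neq b$. The purpose of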
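The aggregation model above can be illustrated with a short numerical sketch. This is not code from the paper: the function name `aggregate_features`, the appliance signatures, and the choice of Gaussian noise for the background term are all assumptions made for illustration only.

```python
import numpy as np

def aggregate_features(signatures, states, noise_std=0.0, seed=0):
    """Return Psi[n] = sum_i x_i[n] * psi_i + eps[n] for every sample n.

    signatures: (m, d) array, row i holds the d power features of appliance i
    states:     (N, m) array of 0/1 On/Off indicators x_i[n]
    noise_std:  std of the noise term eps[n] (Gaussian noise is an assumption)
    """
    signatures = np.asarray(signatures, dtype=float)
    states = np.asarray(states, dtype=float)
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, noise_std,
                       size=(states.shape[0], signatures.shape[1]))
    # Each row of the product sums the signatures of the appliances that are On.
    return states @ signatures + noise

# Hypothetical signatures: [active power (W), reactive power (var)], m = 3.
signatures = np.array([[1500.0,  50.0],
                       [ 120.0,  90.0],
                       [ 800.0, 300.0]])

# On/Off status of the three appliances at N = 3 sample times.
states = np.array([[1, 0, 0],
                   [1, 1, 0],
                   [0, 1, 1]])

psi = aggregate_features(signatures, states)  # noise_std=0: exact sums
print(psi)
```

With the noise switched off, each row of `psi` is simply the sum of the signatures of the appliances that are On at that sample, which is the quantity a NILM classifier has to decompose.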

Principal component analysis - background

Principal Component Analysis (PCA), also known as the Karhunen-Loeve transform, is a widely used technique in applications such as feature selection, lossy data compression and dimension reduction [43], [44], [45]. PCA reduces the complexity of high-dimensional data while retaining trends and patterns, by transforming the data into a lower dimension. PCA may be considered an unsupervised feature transformation method, which requires no labeled data, and is completely non-parametric. There are
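As a minimal sketch of textbook PCA (not the modified procedure of the paper, whose details are not given in this excerpt), the principal components can be obtained from the eigen-decomposition of the sample covariance matrix. The helper name `pca_fit` and the synthetic active, reactive and apparent power data are assumptions for illustration; feature scaling is deliberately omitted.

```python
import numpy as np

def pca_fit(X, k):
    """Return (mean, components, explained_ratio) for the top-k components.

    X: (N, d) array with one sample per row.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)      # (d, d) sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]     # indices of the k largest
    components = eigvecs[:, order]            # (d, k), unit-norm columns
    explained = eigvals[order] / eigvals.sum()
    return mean, components, explained

# Synthetic, strongly correlated power features (hypothetical values).
rng = np.random.default_rng(0)
P = rng.uniform(100.0, 2000.0, size=500)      # active power
Q = 0.3 * P + rng.normal(0.0, 5.0, size=500)  # reactive power
S = np.hypot(P, Q)                            # apparent power
X = np.column_stack([P, Q, S])

mean, components, explained = pca_fit(X, k=1)
Z = (X - mean) @ components                   # reduced data, shape (500, 1)
print(explained)
```

Because the three synthetic features are almost linearly dependent, a single component captures nearly all of the variance here; real meter data would typically need more components.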

Settings and test scenario

AMPds is an open data set proposed by SFU [15]. The data were collected from a house in Canada over one year. AMPds contains low sample-rate recordings (one sample every 60 s) for 21 sub-meters, and includes active, reactive and apparent powers, in addition to current and power factor signals. In these simulations 7 sub-meters were classified: six containing a single load, and one containing multiple loads. The selected power features are active power, reactive power, apparent power and

Settings and test scenario

This experiment was conducted in a private house over two days. The first day is used for the training process and the second day is used for the inference stage. The private dataset was collected in a typical kitchen using the SATEC PM135 smart meter, which is presented in Fig. 9. The data are sampled at a low sample rate (one sample every 10 s) from 7:00 to 19:00. The SATEC PM135 smart meter measures active power, reactive power, apparent power, current, power factor, harmonics and THD signals. In this experiment

Conclusions

This paper proposes PCA as an efficient dimension reduction method for NILM. The suggested method can be used with any NILM classification technique, and with various datasets and sample-rates. The training stage calculates the principal components based on past recorded data, and the inference stage uses the principal components to reduce the dimension of new samples. The proposed modified PCA procedure is tested using the public dataset AMPds and a private dataset that was collected
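The split between a training stage and an inference stage can be sketched as follows. This is a plain PCA illustration under stated assumptions, not the authors' modified procedure: the class `PcaReducer`, the helper `make_day`, and the synthetic two-day data are hypothetical.

```python
import numpy as np

class PcaReducer:
    """Plain PCA with a training stage (fit) and an inference stage (transform)."""

    def fit(self, X_train, k):
        X_train = np.asarray(X_train, dtype=float)
        self.mean_ = X_train.mean(axis=0)
        # Rows of Vt are principal directions, sorted by decreasing singular value.
        _, _, Vt = np.linalg.svd(X_train - self.mean_, full_matrices=False)
        self.components_ = Vt[:k].T               # (d, k)
        return self

    def transform(self, X_new):
        # Project new samples using the mean and components learned in training.
        return (np.asarray(X_new, dtype=float) - self.mean_) @ self.components_

    def inverse_transform(self, Z):
        # Map reduced samples back to the original feature space.
        return Z @ self.components_.T + self.mean_

def make_day(rng, n):
    """Synthetic [active, reactive, apparent] power samples (hypothetical)."""
    P = rng.uniform(100.0, 2000.0, size=n)
    Q = 0.3 * P + rng.normal(0.0, 5.0, size=n)
    return np.column_stack([P, Q, np.hypot(P, Q)])

rng = np.random.default_rng(1)
day_one = make_day(rng, 400)   # past recorded data, used for training
day_two = make_day(rng, 100)   # new samples, seen only at inference

reducer = PcaReducer().fit(day_one, k=1)
Z = reducer.transform(day_two)                    # shape (100, 1)

# Relative reconstruction error as a rough proxy for information loss.
residual = day_two - reducer.inverse_transform(Z)
rel_error = np.linalg.norm(residual) / np.linalg.norm(day_two - reducer.mean_)
print(Z.shape, rel_error)
```

Any classifier can then be trained and run on `Z` instead of the full feature vectors, which is where a run-time saving would come from; the size of that saving depends on the classifier and is not measured here.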

Declaration of Competing Interest

The authors whose names are listed immediately below certify that they have NO affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers’ bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in

Acknowledgment

Y. Levron was partly supported by Israel Science Foundation grant No. 2//7221. Y. Beck was partly supported by the Israeli Innovation Authority under Grant 60689. The authors also want to thank SATEC Ltd. for contributing their power quality monitors.

References (48)

  • M. Dash et al., "Feature selection for classification", Intell. Data Anal. (1997)
  • N. Sadeghianpourhamami et al., "Comprehensive feature selection for appliance classification in NILM", Energy Build. (2017)
  • G. Hripcsak et al., "Agreement, the f-measure, and reliability in information retrieval", J. Am. Med. Inform. Assoc. (2005)
  • M. Hashmi et al., "Survey of smart grid concepts, architectures, and technological demonstrations worldwide", IEEE PES Conference on Innovative Smart Grid Technologies Latin America (2011)
  • J. Froehlich et al., "Disaggregated end-use energy sensing for the smart grid", IEEE Pervasive Comput. (2011)
  • G.W. Hart, "Nonintrusive appliance load monitoring", Proc. IEEE (1992)
  • J.Z. Kolter et al., "Approximate inference in additive factorial HMMs with application to energy disaggregation", Artif. Intell. Stat. (2012)
  • J. Kelly et al., "Neural NILM: deep neural networks applied to energy disaggregation", ACM International Conference on Embedded Systems For Energy-Efficient Built Environments (2015)
  • R. Machlev et al., "Modified cross entropy method for classification events in NILM system", IEEE Trans. Smart Grid (2018)
  • A. Zoha et al., "Non-intrusive load monitoring approaches for disaggregated energy sensing: a survey", Sensors (2012)
  • M. Zeifman et al., "Nonintrusive appliance load monitoring: review and outlook", IEEE Trans. Consum. Electron. (2011)
  • A. Faustine, N.H. Mvungi, S. Kaijage, K. Michael, A survey on non-intrusive load monitoring methodies and techniques...
  • J.Z. Kolter et al., "REDD: A public data set for energy disaggregation research", SustKDD Workshop on Data Mining Applications in Sustainability (2011)
  • S. Makonin et al., "AMPds: a public dataset for load disaggregation and eco-feedback research", Electrical Power & Energy Conference (2013)

Cited by (29)

    • Non-intrusive multi-label load monitoring via transfer and contrastive learning architecture

      2023, International Journal of Electrical Power and Energy Systems
    • Selection of features from power theories to compose NILM datasets

      2022, Advanced Engineering Informatics
      Citation Excerpt :

      Features from standards and electrical studies assist in performing the load disaggregation and expanding functionalities to the meter, such as improvements in energy efficiency and possible malfunctions of electrical appliances [33]. Some works from the literature use Principal Component Analysis (PCA) [34,35], artificial neural networks, and deep learning for the automatic feature identification capable of distinguishing appliances [36–41], creating features by means of waveform differentiation between appliances. Although such researches present quite significant results, the extracted features do not necessarily correspond to the appliances’ physical phenomena.

    • Polyphase uncertainty analysis through virtual modelling technique

      2022, Mechanical Systems and Signal Processing
    • Comparison between artificial neural network and random forest for effective disaggregation of building cooling load

      2021, Case Studies in Thermal Engineering
      Citation Excerpt :

      In residential buildings, NILM is usually applied to identify the on/off states of household appliances such as washing machines, microwaves, refrigerators and space heating, which can help learn the occupant energy-use patterns [7–9]. In commercial buildings, NILM mainly concentrates on recognizing the energy use of air conditioning systems or their components [10–12] and the energy consumption of lighting systems [13] from the building-level energy consumption. Existing research about cooling loads is limited in the field of load disaggregation.

    • A non-intrusive load monitoring algorithm based on multiple features and decision fusion

      2021, Energy Reports
      Citation Excerpt :

      In general, identifying active load is performed by utilizing machine learning techniques to analyze the pattern of load features. Measurement data, such as current waveform data, power factor, active power is usually utilized due to easy acquisition [3,4]. Three alternative load signatures formations considering current harmonic amplitude, phase angles, harmonic current vectors are proposed to accurately classify electric appliances in [5,6].
