A nonparametric framework for water consumption data cleansing: an application to a smart water network in Naples (Italy)
Journal of Hydroinformatics (IF 2.2), Pub Date: 2020-07-01, DOI: 10.2166/hydro.2020.133
Roberta Padulano, Giuseppe Del Giudice

Remote monitoring and collection of water consumption has gained pivotal importance in the field of demand understanding, modelling and prediction. However, most of the analyses that can be performed on such databases could be jeopardized by inconsistencies due to technological or behavioural issues causing significant amounts of missing or anomalous values. In the present paper, a nonparametric, unsupervised approach is presented to investigate the reliability of a consumption database, applied to the dataset of a district metering area in Naples (Italy) and focused on the detection of suspicious amounts of zero or outlying data. Results showed that the methodology is effective in identifying criticalities both in terms of unreliable time series, namely time series having huge amounts of invalid data, and in terms of unreliable data, namely data values suspiciously different from some suitable central parameters, irrespective of the source causing the anomaly. As such, the proposed approach is suitable for large databases when no prior information is known about the underlying probability distribution of data, and it can also be coupled with other nonparametric, pattern-based methods in order to guarantee that the database to be analysed is homogeneous in terms of water uses.
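The abstract does not state which statistics the authors use, so the following is only an illustrative sketch of the two kinds of check it describes: flagging a time series with a suspicious share of zero readings, and flagging individual values that lie far from a robust central value. The median/MAD rule, the function names, and the thresholds `k=3.0` and `max_zero_share=0.5` are all assumptions chosen for illustration, not the paper's method.

```python
from statistics import median


def zero_fraction(series):
    """Share of readings in the series that are exactly zero."""
    return sum(1 for x in series if x == 0) / len(series)


def is_unreliable_series(series, max_zero_share=0.5):
    """Flag a whole series whose share of zeros exceeds a threshold.

    max_zero_share is a hypothetical cut-off, not a value from the paper.
    """
    return zero_fraction(series) > max_zero_share


def mad_outliers(series, k=3.0):
    """Indices of values farther than k scaled MADs from the median.

    Median and MAD are distribution-free, so no probability model
    for the data is assumed. k is a hypothetical tuning parameter.
    """
    med = median(series)
    mad = median([abs(x - med) for x in series])
    if mad == 0:
        # More than half of the values coincide; the rule is undefined.
        return []
    scale = 1.4826 * mad  # conventional consistency factor
    return [i for i, x in enumerate(series) if abs(x - med) > k * scale]


# Made-up hourly readings for one meter (not data from the study).
readings = [10, 12, 11, 13, 12, 11, 95, 0, 12, 10]
print(zero_fraction(readings))
print(is_unreliable_series(readings))
print(mad_outliers(readings))
```

A median-based rule is used here because a single extreme reading barely shifts the median, whereas it would inflate a mean and standard deviation and could mask itself.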




Updated: 2020-08-20