Sensor data quality: a systematic review
Journal of Big Data (IF 8.1), Pub Date: 2020-02-11, DOI: 10.1186/s40537-020-0285-1
Hui Yie Teh, Andreas W. Kempa-Liehr, Kevin I-Kai Wang

Sensor data quality plays a vital role in Internet of Things (IoT) applications, because these applications are rendered useless if the data quality is poor. This systematic review aims to provide an introduction and guide for researchers who are interested in quality-related issues of physical sensor data. It presents the process and results of the review, which aims to answer the following research questions: what are the different types of physical sensor data errors, how can those errors be quantified or detected, how can they be corrected, and in which domains are the solutions applied. Of 6970 publications retrieved from three databases (ACM Digital Library, IEEE Xplore and ScienceDirect) using a search string refined via topic modelling, 57 were selected and examined. The results show that the types of sensor data errors addressed by those papers are mostly missing data and faults, e.g. outliers, bias and drift. The most common solutions for error detection are based on principal component analysis (PCA) and artificial neural networks (ANN), which together account for about 40% of all error detection papers found in the study. Similarly, for fault correction, PCA and ANN are among the most common methods, along with Bayesian networks. Missing values, on the other hand, are mostly imputed using association rule mining. Other techniques include hybrid solutions that combine several data science methods to detect and correct the errors. The review finds that the methods proposed to solve physical sensor data errors cannot be directly compared, because evaluation processes are not uniform and many studies use datasets that are not publicly available. A Bayesian data analysis of the 57 selected publications also suggests that publications using publicly available datasets for method evaluation have higher citation rates.




Updated: 2020-04-21