A machine learning approach to quality control oceanographic data
Computers & Geosciences (IF 4.4), Pub Date: 2021-06-19, DOI: 10.1016/j.cageo.2021.104803
G.P. Castelão

Sensor errors are inevitable when measuring the ocean; thus, a reliable dataset of observations requires a quality control (QC) procedure capable of detecting spurious measurements. While manual QC by human experts minimizes errors, it is inefficient for large datasets and vulnerable to inconsistencies between different experts. Although automatic QC addresses some of these issues, traditional methods produce high rates of false positives. Here, I propose a machine learning approach to automatically QC oceanographic data, based on the Anomaly Detection technique. Multiple tests are combined into a single multidimensional criterion that learns the behavior of valid measurements and identifies bad samples as outliers. When applied to 13 years of hydrographic profiles, Anomaly Detection gave the best classification performance, reducing the error by at least 50%. The Anomaly Detection approach introduced here is implemented in the Python package CoTeDe, an open-source framework for quality control of oceanographic data.
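
The idea of combining several tests into one multidimensional criterion can be illustrated with a minimal sketch. This is not the paper's implementation and does not use CoTeDe's API. It assumes that each QC test yields one numeric feature per measurement, that the features of valid data are independent and roughly Gaussian, and that the threshold is set from a chosen quantile of the training scores. The function names, the synthetic data, and the 0.1% quantile are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_features(valid):
    """Estimate per-feature mean and standard deviation from valid samples.

    `valid` has shape (n_samples, n_features); each column stands for the
    output of one hypothetical QC test (for example a spike magnitude).
    """
    valid = np.asarray(valid, dtype=float)
    return valid.mean(axis=0), valid.std(axis=0, ddof=1)


def anomaly_score(samples, mean, std):
    """Sum of per-feature Gaussian log-densities (assumes independence).

    Lower scores mean the sample looks less like the valid training data.
    """
    z = (np.asarray(samples, dtype=float) - mean) / std
    log_pdf = -0.5 * z**2 - np.log(std * np.sqrt(2.0 * np.pi))
    return log_pdf.sum(axis=1)


def flag_outliers(samples, mean, std, threshold):
    """Return True for samples whose joint score falls below `threshold`."""
    return anomaly_score(samples, mean, std) < threshold


rng = np.random.default_rng(0)
# Synthetic stand-in for two test outputs computed on valid measurements.
train = rng.normal(loc=[0.0, 0.0], scale=[1.0, 0.5], size=(5000, 2))
mean, std = fit_features(train)

# Assumed policy: flag roughly 0.1% of the valid training samples.
threshold = np.quantile(anomaly_score(train, mean, std), 0.001)

candidates = np.array([
    [0.1, -0.2],   # ordinary in both features
    [3.5, 1.75],   # unusual in both features at the same time
    [8.0, 0.0],    # extreme in a single feature
])
flags = flag_outliers(candidates, mean, std, threshold)
print(flags)
```

Because the score is a single joint quantity, a measurement that is only moderately unusual in each test can still be flagged when the tests agree, which a set of separate per-test thresholds would not capture.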




Updated: 2021-07-04