Design of an expert distance metric for climate clustering: The case of rainfall in the Lesser Antilles
Computers & Geosciences (IF 4.2), Pub Date: 2020-12-01, DOI: 10.1016/j.cageo.2020.104612
Emmanuel Biabiany, Didier C. Bernard, Vincent Page, Hélène Paugam-Moisy

Abstract To expand our knowledge of the climate in the Lesser Antilles, we attempted to identify the spatio-temporal configurations of daily weather. We noticed certain pitfalls that can lead to poor results when using clustering algorithms, and we propose some steps towards a solution. These advances may be of interest for climate informatics, as well as for the many applications that cluster physical fields. We illustrate the pitfalls with a dataset of cumulative rainfall from NASA's Tropical Rainfall Measuring Mission for the period 2000 to 2014. The first pitfall is the lack of numerical evaluation of the clusters found by the algorithms, which prevents the algorithms from being compared. We used the silhouette index for this evaluation and to demonstrate the other problems. Second, algorithms like K-means cluster the points around their barycentre. For many physical fields, this barycentre is trivial, which may lead to poor performance. Third, the L2 norm used in conventional clustering methods, such as K-means and hierarchical agglomerative clustering, focuses on the exact location of features in the fields, which leads to poor evaluations of similarity between fields. We replaced it with a similarity measure called the expert distance (ED), which compares the histograms of four zones using the symmetrised Kullback–Leibler divergence. It integrates the properties of the observed physical parameter and climate knowledge. With these improvements, the results revealed five clusters with high index values. The algorithms now discriminate the daily scenarios favourably, thereby giving more physical meaning to the resulting clusters. The interpretation of these clusters as weather types is discussed.
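
To make the third point concrete, the following is a minimal sketch of a distance that compares per-zone histograms with the symmetrised Kullback–Leibler divergence. It is not the authors' implementation: the abstract does not say how the four zones are drawn, how the histograms are binned, or how the per-zone divergences are combined. Here the zones are assumed to be the four quadrants of the grid, the bins are assumed to be uniform up to a hypothetical `max_value`, a small `eps` is added to keep the divergence finite, and the four divergences are simply summed. The names `zone_histograms`, `symmetrised_kl` and `zone_distance` are illustrative.

```python
import numpy as np


def zone_histograms(field, n_bins=8, max_value=50.0, eps=1e-6):
    """Return one normalised histogram per zone, shape (4, n_bins).

    Assumption: the four zones are the quadrants of the 2-D grid.
    The paper's zones come from climate knowledge and may differ.
    """
    rows, cols = field.shape
    r, c = rows // 2, cols // 2
    zones = [field[:r, :c], field[:r, c:], field[r:, :c], field[r:, c:]]
    edges = np.linspace(0.0, max_value, n_bins + 1)
    hists = []
    for zone in zones:
        counts, _ = np.histogram(np.clip(zone, 0.0, max_value), bins=edges)
        probs = counts + eps  # assumed smoothing so empty bins keep KL finite
        hists.append(probs / probs.sum())
    return np.array(hists)


def symmetrised_kl(p, q):
    """KL(p || q) + KL(q || p), written as sum((p - q) * log(p / q))."""
    return float(np.sum((p - q) * np.log(p / q)))


def zone_distance(field_a, field_b, **kwargs):
    """Sum of symmetrised KL divergences over the four zones (assumed rule)."""
    hists_a = zone_histograms(field_a, **kwargs)
    hists_b = zone_histograms(field_b, **kwargs)
    return sum(symmetrised_kl(p, q) for p, q in zip(hists_a, hists_b))


# Two synthetic rainfall fields: the same rain patch, shifted by two cells
# but still inside the same (top-left) zone.
rain_a = np.zeros((8, 8))
rain_a[0:2, 0:2] = 20.0
rain_b = np.zeros((8, 8))
rain_b[2:4, 2:4] = 20.0

print("L2 distance:  ", np.linalg.norm(rain_a - rain_b))
print("zone distance:", zone_distance(rain_a, rain_b))
```

In this toy case the L2 norm treats the two fields as far apart because the wet cells do not overlap, while the histogram-based distance sees the same rainfall distribution in every zone. A matrix of such pairwise distances could then be passed to any clustering or cluster-evaluation routine that accepts precomputed distances.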

Updated: 2020-12-01