Applications of matrix factorization methods to climate data
Nonlinear Processes in Geophysics (IF 1.7) Pub Date: 2020-09-17, DOI: 10.5194/npg-27-453-2020
Dylan Harries, Terence J. O'Kane

An initial dimension reduction forms an integral part of many analyses in climate science. Different methods yield low-dimensional representations that are based on differing aspects of the data. Depending on the features of the data that are relevant for a given study, certain methods may be more suitable than others, for instance yielding bases that can be more easily identified with physically meaningful modes. To illustrate the distinction between particular methods and identify circumstances in which a given method might be preferred, in this paper we present a set of case studies comparing the results obtained using the traditional approaches of empirical orthogonal function analysis and k-means clustering with more recently introduced methods such as archetypal analysis and convex coding. For data such as global sea surface temperature anomalies, in which there is a clear, dominant mode of variability, all of the methods considered yield rather similar bases with which to represent the data while differing in reconstruction accuracy for a given basis size. However, in the absence of such a clear scale separation, as in the case of daily geopotential height anomalies, the extracted bases differ much more significantly between the methods. We highlight the importance in such cases of carefully considering the relevant features of interest and of choosing the method that best targets precisely those features so as to obtain more easily interpretable results.
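
The comparison of reconstruction accuracy at a fixed basis size can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic (one dominant spatial pattern plus noise, standing in for an anomaly field), the function names `eof_reconstruction`, `kmeans_reconstruction`, `nearest`, and `relative_error` are hypothetical, and only the two traditional methods are shown, using a truncated SVD for the EOFs and a plain Lloyd's iteration for k-means. Archetypal analysis and convex coding are omitted.

```python
import numpy as np

def eof_reconstruction(X, k):
    """Rank-k reconstruction of anomalies X (time x space) from the leading k EOFs."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)  # rows of Vt are the EOFs
    return (U[:, :k] * s[:k]) @ Vt[:k]

def nearest(X, centroids):
    """Index of the closest centroid (squared Euclidean distance) for each sample."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans_reconstruction(X, k, n_iter=50, seed=0):
    """Replace each sample by its nearest of k centroids (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct samples (an assumption of this sketch).
    centroids = X[rng.choice(X.shape[0], size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest(X, centroids)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids[nearest(X, centroids)]

def relative_error(X, X_hat):
    """Frobenius-norm reconstruction error relative to the norm of X."""
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)

# Synthetic anomalies: one dominant mode of variability plus white noise.
rng = np.random.default_rng(42)
n_time, n_space = 200, 30
pattern = np.sin(np.linspace(0.0, np.pi, n_space))
amplitude = rng.normal(scale=3.0, size=n_time)
X = np.outer(amplitude, pattern) + rng.normal(scale=0.5, size=(n_time, n_space))
X = X - X.mean(axis=0)  # remove the time mean to form anomalies

k = 3  # basis size shared by both methods
err_eof = relative_error(X, eof_reconstruction(X, k))
err_km = relative_error(X, kmeans_reconstruction(X, k))
print(f"EOF relative error:     {err_eof:.3f}")
print(f"k-means relative error: {err_km:.3f}")
```

For the same `k`, the EOF error cannot exceed the k-means error, because the k-means reconstruction is itself a matrix of rank at most `k` and the truncated SVD is the best rank-`k` approximation in the Frobenius norm. The trade-off is that each k-means centroid is an average of actual samples, which may make it easier to interpret than an EOF.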

Updated: 2020-09-18