Using the right tool for the job: the difference between unsupervised and supervised analyses of multivariate ecological data
Oecologia (IF 2.7) Pub Date: 2021-02-12, DOI: 10.1007/s00442-020-04848-w
Eric R. Scott, Elizabeth E. Crone

Ecologists often collect data with the aim of determining which of many variables are associated with a particular cause or consequence. Unsupervised analyses (e.g., principal components analysis, PCA) summarize variation in the data, without regard to the response. Supervised analyses (e.g., partial least squares, PLS) evaluate the variables to find the combination that best explains a causal relationship. These approaches are not interchangeable, especially when the variables most responsible for a causal relationship are not the greatest source of overall variation in the data, a situation that ecologists are likely to encounter. To illustrate the differences between unsupervised and supervised techniques, we analyze a published dataset using both PCA and PLS and compare the questions and answers associated with each method. We also use simulated datasets representing situations that further illustrate differences between unsupervised and supervised analyses. For simulated data with many correlated variables that were unrelated to the response, PLS was better than PCA at identifying which variables were associated with the response. There are many applications for both unsupervised and supervised approaches in ecology. However, PCA is currently overused, at least in part because supervised approaches, such as PLS, are less familiar.
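
The contrast described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' analysis or data. It is a minimal, standard-library-only illustration under assumed settings: eight correlated "nuisance" variables with large variance that are unrelated to the response, and two low-variance variables that drive it. The sample size, variances, variable layout, and all function names are hypothetical choices made for this example. The first PCA loading is found by power iteration, and the first PLS weight vector is taken as the normalized product of the centered predictors with the centered response, which is the first-component weight in single-response PLS.

```python
import math
import random


def simulate(n=1000, seed=1):
    """Simulate predictors X (n x 10) and a response y (assumed setup)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        nuisance = rng.gauss(0, 3)  # high-variance factor, unrelated to y
        signal = rng.gauss(0, 1)    # low-variance factor that drives y
        row = [nuisance + rng.gauss(0, 0.5) for _ in range(8)]  # columns 0-7
        row += [signal + rng.gauss(0, 0.5) for _ in range(2)]   # columns 8-9
        X.append(row)
        y.append(signal + rng.gauss(0, 0.5))
    return X, y


def center_columns(X):
    """Subtract each column mean (no variance scaling in this sketch)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def xt_times(X, u):
    """Return X^T u for a row-major matrix X and a length-n vector u."""
    p = len(X[0])
    return [sum(row[j] * ui for row, ui in zip(X, u)) for j in range(p)]


def pca_first_loading(X, iters=100):
    """First principal-component loading via power iteration on X^T X."""
    v = normalize([1.0] * len(X[0]))
    for _ in range(iters):
        scores = [sum(a * b for a, b in zip(row, v)) for row in X]
        v = normalize(xt_times(X, scores))
    return v


def pls_first_weight(X, y):
    """First PLS weight vector for one response: X^T y, normalized."""
    return normalize(xt_times(X, y))


def signal_share(v, signal_idx=(8, 9)):
    """Share of a unit vector's squared weight on the response-related columns."""
    return sum(v[j] ** 2 for j in signal_idx)


X, y = simulate()
Xc = center_columns(X)
y_mean = sum(y) / len(y)
yc = [a - y_mean for a in y]

pca_share = signal_share(pca_first_loading(Xc))
pls_share = signal_share(pls_first_weight(Xc, yc))
print(f"PCA component 1, weight on response-related variables: {pca_share:.3f}")
print(f"PLS component 1, weight on response-related variables: {pls_share:.3f}")
```

Under these assumed settings, the first principal component should load almost entirely on the nuisance block, because that block carries most of the variance, while the first PLS weight vector should concentrate on the two variables tied to the response. A real analysis would more likely use an established implementation, choose the number of components by cross-validation, and decide deliberately whether to scale variables.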




Updated: 2021-02-12