Integration and holistic analysis of multiple multidimensional soil data sets
Talanta ( IF 6.1 ) Pub Date : 2024-04-04 , DOI: 10.1016/j.talanta.2024.125954
Lisa I. Pilkington , William Kerner , Daniela Bertoldi , Roberto Larcher , Soon A. Lee , Matthew R. Goddard , Davide Albanese , Pietro Franceschi , Bruno Fedrizzi

Complex matrices such as soil have a range of measurable characteristics, so the data that describe them can be considered multidimensional. These characteristics can be strongly influenced by factors that introduce confounding effects and hinder analysis. Traditional statistical approaches lack the flexibility and granularity required to adequately evaluate such matrices, particularly those with large datasets of varying data types (i.e. quantitative non-compositional, quantitative compositional). We present a statistical workflow designed to effectively analyse complex, multidimensional systems, even in the presence of confounding variables. The methodology involves exploratory analysis to identify the presence of confounding variables, followed by data decomposition (including strategies for both compositional and non-compositional quantitative data) to minimise the influence of confounding factors such as sampling site or location. These data processing methods then allow common patterns in the data to be highlighted, including the identification of biomarkers and the determination of non-trivial associations between variables. We demonstrate the utility of this statistical workflow by jointly analysing the chemical composition and fungal biodiversity of New Zealand vineyard soils that have been managed with either organic low-input or conventional input approaches. By applying this pipeline, we were able to identify biomarkers that distinguish viticultural soils managed under the two approaches and to unearth links and associations between the chemical and metagenomic profiles. While soil is one example of a system that can require this type of statistical methodology, there is a range of biological and ecological systems that are challenging to analyse due to the complex interplay of global and local effects. Utilising our developed pipeline will greatly enhance the way that these systems can be studied and the quality and impact of the insight gained from their analysis.
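
The abstract does not say which decomposition the authors use, so the following is only an illustrative sketch of the general idea, not their pipeline. It assumes a centred log-ratio (CLR) transform for compositional data and simple within-site mean centring to reduce a site effect. The function names, the pseudocount and the toy values are all hypothetical.

```python
import math
from collections import defaultdict


def clr(row, pseudocount=0.5):
    """Centred log-ratio transform of one compositional sample.

    The pseudocount (an assumption, not from the paper) avoids log(0).
    """
    logs = [math.log(v + pseudocount) for v in row]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def centre_within_site(rows, sites):
    """Subtract each site's per-variable mean from the samples of that site."""
    groups = defaultdict(list)
    for row, site in zip(rows, sites):
        groups[site].append(row)
    # Per-site mean of every variable (column).
    means = {
        site: [sum(col) / len(col) for col in zip(*group)]
        for site, group in groups.items()
    }
    return [
        [v - m for v, m in zip(row, means[site])]
        for row, site in zip(rows, sites)
    ]


# Hypothetical toy data: 4 samples from 2 sites.
sites = ["A", "A", "B", "B"]
chemistry = [[5.0, 1.0], [7.0, 3.0], [50.0, 10.0], [54.0, 12.0]]  # non-compositional
taxa_counts = [[10, 0, 90], [20, 5, 75], [300, 40, 60], [250, 80, 70]]  # compositional

# Non-compositional data: centre directly within each site.
chem_adj = centre_within_site(chemistry, sites)

# Compositional data: CLR-transform first, then centre within each site.
taxa_adj = centre_within_site([clr(row) for row in taxa_counts], sites)
```

After centring, the large between-site difference in the toy chemistry values no longer dominates, and what remains is the variation within each site. In a real analysis that remaining variation is where a management effect (organic versus conventional) would be sought.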

Updated: 2024-04-04