Multivariable Association Discovery in Population-scale Meta-omics Studies
bioRxiv - Microbiology Pub Date : 2021-01-20 , DOI: 10.1101/2021.01.20.427420
Himel Mallick , Ali Rahnavard , Lauren J. McIver , Siyuan Ma , Yancong Zhang , Long H. Nguyen , Timothy L. Tickle , George Weingart , Boyu Ren , Emma H. Schwager , Suvo Chatterjee , Kelsey N. Thompson , Jeremy E. Wilkinson , Ayshwarya Subramanian , Yiren Lu , Levi Waldron , Joseph N. Paulson , Eric A. Franzosa , Hector Corrada Bravo , Curtis Huttenhower

It is challenging to associate features such as human health outcomes, diet, environmental conditions, or other metadata to microbial community measurements, due in part to their quantitative properties. Microbiome multi-omics are typically noisy, sparse (zero-inflated), high-dimensional, extremely non-normal, and often in the form of count or compositional measurements. Here we introduce an optimized combination of novel and established methodology to assess multivariable association of microbial community features with complex metadata in population-scale observational studies. Our approach, MaAsLin 2 (Microbiome Multivariable Associations with Linear Models), uses general linear models to accommodate a wide variety of modern epidemiological studies, including cross-sectional and longitudinal designs, as well as a variety of data types (e.g. counts and relative abundances) with or without covariates and repeated measurements. To construct this method, we conducted a large-scale evaluation of a broad range of scenarios under which straightforward identification of meta-omics associations can be challenging. These simulation studies reveal that MaAsLin 2’s linear model preserves statistical power in the presence of repeated measures and multiple covariates, while accounting for the nuances of meta-omics features and controlling false discovery. We also applied MaAsLin 2 to a microbial multi-omics dataset from the Integrative Human Microbiome (HMP2) project which, in addition to reproducing established results, revealed a unique, integrated landscape of inflammatory bowel disease (IBD) across multiple time points and omics profiles.
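
The abstract mentions controlling false discovery across many per-feature tests but does not say which procedure is used. As a minimal sketch, assuming a Benjamini-Hochberg adjustment (an assumption made here for illustration, not something stated above), the step from per-feature p-values to q-values could look like the following. The p-values and the 0.1 cutoff are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical p-values, one per microbial feature tested against a metadata variable.
p_values = [0.001, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(p_values)

# Hypothetical significance cutoff on the q-values (an assumption, not from the abstract).
significant = [q <= 0.1 for q in q_values]
```

In a full analysis the p-values would come from fitting one linear model per feature; that fitting step is omitted here.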

Updated: 2021-01-21