Fast computation of genome-metagenome interaction effects.
Algorithms for Molecular Biology ( IF 1.5 ) Pub Date : 2020-07-01 , DOI: 10.1186/s13015-020-00173-2
Florent Guinot 1 , Marie Szafranski 1, 2 , Julien Chiquet 3 , Anouk Zancarini 4 , Christine Le Signor 5 , Christophe Mougel 6 , Christophe Ambroise 1

Association studies have been widely used to search for associations between common genetic variants and a given phenotype. However, it is now generally accepted that genes and environment must be examined jointly when estimating phenotypic variance. In this work we consider two types of biological markers: genotypic markers, which characterize an observation in terms of inherited genetic information, and metagenomic markers, which are related to the environment. Both types of markers are available in their millions and can be used to characterize any observation uniquely. Our focus is on detecting interactions between groups of genetic and metagenomic markers in order to gain a better understanding of the complex relationship between environment and genome in the expression of a given phenotype. We propose a novel approach for efficiently detecting interactions between complementary datasets in a high-dimensional setting at reduced computational cost. The method, named SICOMORE, reduces the dimension of the search space by selecting a subset of supervariables in the two complementary datasets. These supervariables are given by a weighted group structure defined on sets of variables at different scales. A Lasso selection is then applied to each type of supervariable to obtain a subset of potential interactions, which are then explored via linear model testing. We compare SICOMORE with other approaches in simulations with varying sample sizes, noise levels, and numbers of true interactions. SICOMORE exhibits convincing results in terms of recall, as well as competitive performance with respect to running time. The method is also used to detect interactions between genomic markers in Medicago truncatula and metagenomic markers in its rhizosphere bacterial community. An R package is available [4], along with its documentation and associated scripts, allowing the reader to reproduce the results presented in the paper.
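
The three stages named in the abstract (compress variables into supervariables, select supervariables with a Lasso, test the remaining pairs for interaction in a linear model) can be illustrated with a small sketch. It is not the SICOMORE R package and makes several simplifying assumptions: the groups are taken as given at a single scale instead of being learned from a weighted multi-scale group structure, each supervariable is a plain group mean, the Lasso is a basic coordinate descent, and the test reports a t statistic with no p-value. All function names, the simulated data, and the penalty `lam=0.5` are hypothetical.

```python
import numpy as np

def supervariables(X, groups):
    # One supervariable per group: the mean of the group's columns (assumption).
    return np.column_stack([X[:, g].mean(axis=1) for g in groups])

def lasso_cd(X, y, lam, n_iter=100):
    # Coordinate-descent Lasso on standardized columns, minimizing
    # (1 / 2n) * ||y - Xb||^2 + lam * ||b||_1.
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    resid = y - y.mean()
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            resid += Xs[:, j] * beta[j]      # take column j out of the fit
            rho = Xs[:, j] @ resid / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0)  # soft threshold
            resid -= Xs[:, j] * beta[j]      # put the updated column back
    return beta

def interaction_t(y, g, m):
    # t statistic of the g*m coefficient in the model y ~ 1 + g + m + g*m.
    D = np.column_stack([np.ones_like(g), g, m, g * m])
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    resid = y - D @ coef
    sigma2 = resid @ resid / (len(y) - D.shape[1])
    cov = sigma2 * np.linalg.inv(D.T @ D)
    return coef[3] / np.sqrt(cov[3, 3])

rng = np.random.default_rng(0)
n = 500

def block_data(n_groups, size):
    # Columns within a group share a latent factor, so averaging them is sensible.
    latent = rng.normal(size=(n, n_groups))
    X = np.repeat(latent, size, axis=1) + 0.3 * rng.normal(size=(n, n_groups * size))
    groups = [list(range(k * size, (k + 1) * size)) for k in range(n_groups)]
    return X, groups

Xg, groups_g = block_data(4, 3)   # stand-in for genomic markers
Xm, groups_m = block_data(4, 2)   # stand-in for metagenomic markers
SG, SM = supervariables(Xg, groups_g), supervariables(Xm, groups_m)

# Simulated phenotype: main effects plus one true interaction (group 0 x group 1).
y = 2 * SG[:, 0] + 2 * SM[:, 1] + 2 * SG[:, 0] * SM[:, 1] + 0.5 * rng.normal(size=n)

# Stage 2: select supervariables separately in each dataset.
sel_g = np.flatnonzero(lasso_cd(SG, y, lam=0.5))
sel_m = np.flatnonzero(lasso_cd(SM, y, lam=0.5))

# Stage 3: test only the pairs that survived selection.
scores = {(i, j): interaction_t(y, SG[:, i], SM[:, j])
          for i in sel_g.tolist() for j in sel_m.tolist()}
best = max(scores, key=lambda pair: abs(scores[pair]))
print("tested pairs:", len(scores), "strongest pair:", best)
```

The point of the design is that only pairs of selected supervariables are tested, not every marker pair, which is where the reduction in computational cost described above comes from.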

Updated: 2020-07-01