Higher-order epistasis and phenotypic prediction
bioRxiv - Evolutionary Biology Pub Date : 2022-03-14 , DOI: 10.1101/2020.10.14.339804
Juannan Zhou , Mandy S. Wong , Wei-Chia Chen , Adrian R. Krainer , Justin B. Kinney , David M. McCandlish

Contemporary high-throughput mutagenesis experiments are providing an increasingly detailed view of the complex patterns of genetic interaction that occur between multiple mutations within a single protein or regulatory element. By simultaneously measuring the effects of thousands of combinations of mutations, these experiments have revealed that the genotype-phenotype relationship typically reflects genetic interactions not only between pairs of sites, but also higher-order interactions among larger numbers of sites. However, modeling and understanding these higher-order interactions remains challenging. Here, we present a method for reconstructing sequence-to-function mappings from partially observed data that can accommodate all orders of genetic interaction. The main idea is to make predictions for unobserved genotypes that match the type and extent of epistasis found in the observed data. This information on the type and extent of epistasis can be extracted by considering how phenotypic correlations change as a function of mutational distance, which is equivalent to estimating the fraction of phenotypic variance due to each order of genetic interaction (additive, pairwise, three-way, etc.). Using these estimated variance components, we then define an empirical Bayes prior that in expectation matches the observed pattern of epistasis, and reconstruct the genotype-phenotype mapping by conducting Gaussian process regression under this prior. To demonstrate the power of this approach, we present an application to the antibody-binding domain GB1 and also provide a detailed exploration of a dataset consisting of high-throughput measurements for the splicing efficiency of human pre-mRNA 5′ splice sites, for which we also validate our model predictions via additional low-throughput experiments.
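
To make the "phenotypic correlation as a function of mutational distance" statistic concrete, here is a minimal Python sketch that bins all pairs of observed genotypes by Hamming distance and averages the product of their mean-centered phenotypes. It is an illustration only, not the authors' estimator. The function names `hamming` and `correlation_by_distance` are hypothetical, and the sketch assumes equal-length sequences and a non-constant phenotype. It does not perform the conversion to variance components or the Gaussian process regression.

```python
import itertools

import numpy as np


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def correlation_by_distance(seqs, phenos):
    """Empirical phenotypic correlation between observed genotype pairs,
    binned by Hamming distance (hypothetical helper, illustration only).

    Returns an array rho of length L + 1, where rho[d] is the mean product
    of centered phenotypes over all observed pairs at distance d, divided
    by the phenotypic variance. Distances with no observed pair are NaN.
    """
    y = np.asarray(phenos, dtype=float)
    y = y - y.mean()                 # center the phenotypes
    var = np.mean(y ** 2)            # assumed non-zero
    L = len(seqs[0])

    sums = np.zeros(L + 1)
    counts = np.zeros(L + 1, dtype=int)

    # Distance 0: each genotype paired with itself.
    sums[0] = np.sum(y ** 2)
    counts[0] = len(seqs)

    # All unordered pairs of distinct observed genotypes.
    for i, j in itertools.combinations(range(len(seqs)), 2):
        d = hamming(seqs[i], seqs[j])
        sums[d] += y[i] * y[j]
        counts[d] += 1

    rho = np.full(L + 1, np.nan)
    seen = counts > 0
    rho[seen] = sums[seen] / counts[seen] / var
    return rho


# Toy data (made up for illustration): a complete, purely additive
# two-allele landscape on 4 sites.
effects = [1.0, 2.0, 3.0, 4.0]
seqs = ["".join(s) for s in itertools.product("AB", repeat=4)]
phenos = [sum(e if c == "A" else -e for e, c in zip(effects, s)) for s in seqs]
rho = correlation_by_distance(seqs, phenos)
```

On a complete, purely additive two-allele landscape such as this toy example, the correlation falls linearly with distance, from 1 at distance 0 to -1 at distance L. Departures from that linear decay in real data are the kind of signal the abstract refers to as evidence of epistasis.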

Updated: 2022-03-14