Multiple Linear Regression Allows Weighted Burden Analysis of Rare Coding Variants in an Ethnically Heterogeneous Population, Human Heredity


Human Heredity (IF 1.8), Pub Date: 2021-01-07, DOI: 10.1159/000512576
David Curtis 1,2


Weighted burden analysis has been used in exome-sequenced case-control studies to identify genes in which there is an excess of rare and/or functional variants associated with phenotype. Implementation in a ridge regression framework allows simultaneous analysis of all variants along with relevant covariates, such as population principal components. To apply the approach to a quantitative phenotype, a weighted burden score is derived for each subject and included in a linear regression analysis. The weighting scheme is adjusted to apply differential weights to rare and very rare variants, and the score is based on both the frequency and the predicted effect of each variant. When applied to an ethnically heterogeneous dataset consisting of 49,790 exome-sequenced UK Biobank subjects, with body mass index as the phenotype, the method produces a very inflated test statistic. However, this inflation is almost completely corrected by including 20 population principal components as covariates. With this correction, the top 30 genes include a few that are quite plausibly associated with the phenotype, including LYPLAL1 and NSDHL. This approach offers a way to carry out gene-based analyses of rare variants identified by exome sequencing in heterogeneous datasets without requiring that data from ethnic minority subjects be discarded. This research has been conducted using the UK Biobank Resource.
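
The general idea in the abstract (a per-subject weighted burden score entered into a linear regression alongside population principal components) can be illustrated with a minimal sketch. This is not the author's implementation: the weight values, the simulated data, the three principal components, and the names `burden_scores` and `score_t_statistic` are all assumptions made purely for illustration.

```python
import numpy as np

def burden_scores(genotypes, freq_weights, effect_weights):
    """Per-subject score: sum over variants of allele count x combined weight."""
    return genotypes @ (freq_weights * effect_weights)

def score_t_statistic(phenotype, scores, covariates):
    """Fit phenotype ~ intercept + score + covariates by ordinary least squares
    and return the t statistic of the burden-score coefficient."""
    n = len(phenotype)
    X = np.column_stack([np.ones(n), scores, covariates])
    beta, *_ = np.linalg.lstsq(X, phenotype, rcond=None)
    resid = phenotype - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov_beta[1, 1])

# Simulated data (hypothetical): the first principal component shifts both
# the variant frequencies and the phenotype, but the gene has no true effect.
rng = np.random.default_rng(0)
n_subjects, n_variants = 5000, 20
pcs = rng.normal(size=(n_subjects, 3))
allele_freq = np.where(pcs[:, 0] > 0, 0.015, 0.005)
genotypes = rng.binomial(2, allele_freq[:, None], size=(n_subjects, n_variants))
phenotype = 2.0 * pcs[:, 0] + rng.normal(size=n_subjects)

# Hypothetical weights: rarer variants and more damaging predicted effects
# receive larger weights. The numeric values are invented for this sketch.
maf = genotypes.mean(axis=0) / 2
freq_weights = np.where(maf < 0.01, 2.0, 1.0)
effect_weights = rng.choice([1.0, 5.0, 10.0], size=n_variants)

scores = burden_scores(genotypes, freq_weights, effect_weights)
t_unadjusted = score_t_statistic(phenotype, scores, np.empty((n_subjects, 0)))
t_adjusted = score_t_statistic(phenotype, scores, pcs)
print("t without principal components:", t_unadjusted)
print("t with principal components:   ", t_adjusted)
```

In this toy setting the unadjusted statistic is inflated purely by population structure, and adding the principal components as covariates removes that inflation, which mirrors the qualitative behaviour the abstract reports.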


Updated: 2021-01-07
