Hierarchical structural component model for pathway analysis of common variants.
BMC Medical Genomics (IF 2.7), Pub Date: 2020-02-24, DOI: 10.1186/s12920-019-0650-0
Nan Jiang 1, Sungyoung Lee 2, Taesung Park 1,3

BACKGROUND Genome-wide association studies (GWAS) have been widely used to identify phenotype-related genetic variants with statistical methods such as logistic and linear regression. However, the SNPs that GWAS identify at stringent significance thresholds explain only a small portion of the overall estimated genetic heritability. To address this 'missing heritability' issue, gene- and pathway-based analyses, which incorporate knowledge of biological mechanisms, have been applied in many GWAS. However, many of these methods neglect the correlation between genes and between pathways.
METHODS We constructed a hierarchical component model that considers correlations both between genes and between pathways. Based on this model, we propose a novel pathway analysis method for GWAS datasets, Hierarchical structural Component Model for Pathway analysis of Common vAriants (HisCoM-PCA). HisCoM-PCA first summarizes the common variants of each gene at the gene level, and then analyzes all pathways simultaneously, applying ridge-type penalization to both the gene and pathway effects on the phenotype. The statistical significance of the gene and pathway coefficients can be assessed by permutation tests.
RESULTS Using the simulation data set of Genetic Analysis Workshop 17 (GAW17), for both binary and continuous phenotypes, we showed that HisCoM-PCA controlled the type I error well and had higher empirical power than several other methods. In addition, we applied our method to a SNP chip dataset from KARE for four human physiologic traits: (1) type 2 diabetes; (2) hypertension; (3) systolic blood pressure; and (4) diastolic blood pressure. These results showed that HisCoM-PCA could successfully identify signal pathways with superior statistical and biological significance.
CONCLUSIONS Our approach has the advantage of providing an intuitive biological interpretation for associations between common variants and phenotypes, via pathway information, potentially addressing the missing heritability conundrum.
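
The workflow described under METHODS (gene-level summary, ridge-type penalization, permutation testing) can be illustrated with a minimal sketch. This is not the authors' implementation and it omits the hierarchical pathway layer entirely: it fits a single ridge regression on gene-level scores. The use of the first principal component as the gene summary, the penalty value `lam`, the simulated genotypes, and all function names are assumptions made for illustration only.

```python
import numpy as np

def first_pc(X):
    """Summarize one gene's SNP matrix (samples x SNPs) by its first
    principal component scores. Assumed summary; the abstract does not
    say which gene-level summary is used."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, 0] * S[0]

def ridge_fit(Z, y, lam):
    """Closed-form ridge coefficients: (Z'Z + lam*I)^-1 Z'y."""
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ y)

def permutation_pvalues(Z, y, lam, n_perm=200, seed=0):
    """Permutation p-value per coefficient: shuffle the phenotype, refit,
    and count how often |permuted coef| reaches |observed coef|."""
    rng = np.random.default_rng(seed)
    observed = np.abs(ridge_fit(Z, y, lam))
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        permuted = rng.permutation(y)
        exceed += np.abs(ridge_fit(Z, permuted, lam)) >= observed
    # Add-one correction keeps every p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)

# Hypothetical toy data: 3 genes with 4 SNPs each, genotypes coded 0/1/2.
rng = np.random.default_rng(42)
n = 200
genes = [rng.integers(0, 3, size=(n, 4)).astype(float) for _ in range(3)]

# One standardized score per gene.
Z = np.column_stack([first_pc(G) for G in genes])
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)

# Simulated continuous phenotype driven by the first gene only.
y = 0.8 * Z[:, 0] + rng.normal(size=n)
y = y - y.mean()

pvals = permutation_pvalues(Z, y, lam=1.0)
print(pvals)
```

In this toy setting only the first gene carries signal, so its permutation p-value should be small while the other two should not be systematically small. A binary phenotype would need a penalized logistic fit in place of the closed-form solution used here.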

Updated: 2020-04-22