MCC-SP: a powerful integration method for identification of causal pathways from genetic variants to complex disease.
BMC Genetics Pub Date : 2020-08-26 , DOI: 10.1186/s12863-020-00899-3
Yuchen Zhu 1 , Jiadong Ji 2 , Weiqiang Lin 1 , Mingzhuo Li 1 , Lu Liu 1 , Huanhuan Zhu 3, 4 , Fuzhong Xue 1 , Xiujun Li 1 , Xiang Zhou 3, 4 , Zhongshang Yuan 1

Genome-wide association studies (GWAS) have successfully identified genetic susceptibility variants for complex diseases. However, the mechanisms underlying these associations remain largely unknown. Most disease-associated genetic variants reside in noncoding regions, leading to the hypothesis that regulation of gene expression may be the primary biological mechanism. Current methods for characterizing how gene expression mediates the effect of a genetic variant on disease often analyze one gene at a time and ignore the network structure. The impact of a genetic variant can propagate to other genes along the links in the network, and then to the disease. There can be multiple pathways from the genetic variant to the disease, each with a chain structure: the first node is one specific SNP (single nucleotide polymorphism) and the last node is the disease outcome. One key but inadequately addressed question is how to measure the between-node connection strength and rank the effects of such chain-type pathways, which can provide statistical evidence for prioritizing pathways for potential drug development in a cost-effective manner. We first introduce the maximal correlation coefficient (MCC) to represent the between-node connection, and then integrate MCC with the K shortest paths algorithm to rank and identify the potential pathways from genetic variant to disease. A pathway importance score (PIS) is further provided to quantify the importance of each pathway. We term this method “MCC-SP”. Various simulations illustrate that MCC is a better measure of between-node connection strength than other quantities, including Pearson correlation, Spearman correlation, distance correlation, mutual information, and the maximal information coefficient.
Finally, we applied MCC-SP to a real dataset from the Religious Orders Study and the Memory and Aging Project, and detected two typical pathways from APOE genotype to Alzheimer’s disease (AD) through the expression of genes enriched in the Alzheimer’s disease pathway. MCC-SP shows powerful and robust performance in identifying the pathway(s) from a genetic variant to a disease. The source code of MCC-SP is freely available on GitHub ( https://github.com/zhuyuchen95/ADnet ).
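
The ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify how MCC values are turned into edge weights or how the PIS is computed. The sketch assumes that each directed edge carries a connection strength in (0, 1] (standing in for an MCC value), that a pathway's strength is the product of its edge strengths, and that each edge cost is therefore -log(strength), so the shortest path is the strongest one. The function name `k_strongest_paths`, the node labels, and the numbers are all hypothetical, and the search is a simple best-first enumeration of simple paths rather than any specific published K shortest paths routine.

```python
import heapq
import math


def k_strongest_paths(strength, source, target, k):
    """Return up to k simple paths from source to target, strongest first.

    strength maps a directed edge (u, v) to a connection strength in (0, 1].
    Scoring a path by the product of its edge strengths is an assumption of
    this sketch, not a definition taken from the paper.
    """
    # Convert strengths to additive, non-negative costs: -log(s) >= 0.
    adjacency = {}
    for (u, v), s in strength.items():
        if not 0.0 < s <= 1.0:
            raise ValueError(f"strength for edge {(u, v)} must be in (0, 1]")
        adjacency.setdefault(u, []).append((v, -math.log(s)))

    # Best-first search over partial paths. Because costs are non-negative,
    # complete paths are popped in non-decreasing order of total cost.
    heap = [(0.0, [source])]
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append((path, math.exp(-cost)))
            continue
        for neighbor, weight in adjacency.get(node, []):
            if neighbor not in path:  # keep paths simple (no repeated nodes)
                heapq.heappush(heap, (cost + weight, path + [neighbor]))
    return found


# Hypothetical toy network: one SNP, three genes, one disease node.
edges = {
    ("SNP", "G1"): 0.8,
    ("SNP", "G2"): 0.5,
    ("G1", "G3"): 0.5,
    ("G2", "G3"): 0.9,
    ("G1", "D"): 0.3,
    ("G2", "D"): 0.2,
    ("G3", "D"): 0.7,
}

for path, score in k_strongest_paths(edges, "SNP", "D", k=3):
    print(" -> ".join(path), round(score, 3))
```

Enumerating partial paths this way can be exponential in the worst case, so it suits only small illustrative networks. For larger graphs, a dedicated K shortest paths algorithm such as Yen's would be the more usual choice.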

Updated: 2020-08-26