Robust network-based analysis of the associations between (epi)genetic measurements
Journal of Multivariate Analysis ( IF 1.6 ) Pub Date : 2018-11-01 , DOI: 10.1016/j.jmva.2018.06.009
Cen Wu 1 , Qingzhao Zhang 2 , Yu Jiang 3 , Shuangge Ma 4

Modeling the associations between gene expression (GE) and copy number variation (CNV) has important biological implications and has been studied extensively. Such analysis is challenging because of the high data dimensionality, the lack of knowledge about which CNVs regulate a specific GE, the different behaviors of cis-acting and trans-acting CNVs, possible long-tailed distributions and contamination of GE measurements, and correlations among CNVs. Existing methods fail to address one or more of these challenges. In this study, a new method is developed to model GE-CNV associations more effectively. Specifically, for each GE, a partially linear model with a nonlinear cis-acting CNV effect is assumed. A robust loss function is adopted to accommodate long-tailed distributions and data contamination. Penalization is adopted to accommodate the high dimensionality and identify relevant CNVs. A network structure is introduced to accommodate the correlations among CNVs. The proposed method comprehensively accommodates multiple challenging characteristics of GE-CNV modeling and effectively overcomes the limitations of existing methods. We develop an effective computational algorithm and rigorously establish the consistency properties. Simulations show the superiority of the proposed method over alternatives. The TCGA (The Cancer Genome Atlas) data on the PCD (programmed cell death) pathway are analyzed; the proposed method shows improved prediction and stability and yields biologically plausible findings.
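To make the combination of ingredients concrete, the following is a minimal sketch of how a robust loss, a sparsity penalty, and a network penalty might be combined into one objective. The abstract does not state the specific loss or penalties, so everything here is an assumption chosen for illustration: the absolute-deviation loss, the lasso penalty, the signed Laplacian built from thresholded CNV correlations, and the names `correlation_laplacian`, `objective`, `lam_sparse`, and `lam_net` are all hypothetical. The nonlinear cis-acting term of the partially linear model is omitted for brevity.

```python
import numpy as np


def correlation_laplacian(X, threshold=0.5):
    """Signed graph Laplacian from thresholded correlations between columns of X.

    Assumption: two CNVs are connected when |correlation| >= threshold.
    """
    corr = np.corrcoef(X, rowvar=False)
    # Keep signed correlations that pass the threshold; drop the rest.
    A = np.where(np.abs(corr) >= threshold, corr, 0.0)
    np.fill_diagonal(A, 0.0)
    # Degrees use absolute weights so the Laplacian is positive semidefinite.
    D = np.diag(np.abs(A).sum(axis=1))
    return D - A


def objective(y, X, beta, L, lam_sparse, lam_net):
    """Illustrative objective: robust loss + sparsity + network smoothness."""
    residual = y - X @ beta
    robust_loss = np.mean(np.abs(residual))        # absolute-deviation loss (assumed)
    sparsity = lam_sparse * np.sum(np.abs(beta))   # lasso penalty (assumed)
    smoothness = lam_net * (beta @ L @ beta)       # Laplacian network penalty (assumed)
    return robust_loss + sparsity + smoothness


if __name__ == "__main__":
    # Toy data: columns 0 and 1 are perfectly correlated, column 2 is uncorrelated.
    X = np.array([[1.0, 2.0, 1.0],
                  [2.0, 4.0, -1.0],
                  [3.0, 6.0, -1.0],
                  [4.0, 8.0, 1.0]])
    y = X[:, 0].copy()
    L = correlation_laplacian(X)
    beta = np.array([1.0, 0.0, 0.0])
    print(objective(y, X, beta, L, lam_sparse=0.1, lam_net=0.5))
```

In this sketch the network penalty is small when connected CNVs have similar coefficients (after accounting for the sign of their correlation), which is one common way to encode correlation structure; the method in the paper may differ in its details.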

Updated: 2018-11-01