Information-incorporated Gaussian graphical model for gene expression data
Biometrics (IF 1.9), Pub Date: 2021-02-02, DOI: 10.1111/biom.13428
Huangdi Yi 1 , Qingzhao Zhang 2 , Cunjie Lin 3 , Shuangge Ma 1

In the analysis of gene expression data, network approaches take a system perspective and have played an essential role. Gaussian graphical models (GGMs) are popular for the network analysis of gene expression data. They investigate the conditional dependence between genes and “transform” the problem of estimating network structures into the sparse estimation of precision matrices. When there is a moderate to large number of genes, the number of parameters to be estimated may overwhelm the limited sample size, leading to unreliable estimation and selection. In this article, we propose incorporating information from previous studies (for example, those deposited at PubMed) to assist in estimating the network structure in the present data. It is recognized that such information can be partial, biased, or even wrong. A penalization-based estimation approach is developed, shown to have consistency properties, and realized using an effective computational algorithm. Simulation demonstrates its competitive performance under various information accuracy scenarios. The analysis of TCGA lung cancer prognostic genes leads to network structures different from those of the alternatives.
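
The link between conditional dependence and the precision matrix can be illustrated with a small sketch. It is not the penalized estimator proposed in the article, whose form the abstract does not give; it only shows, for a hypothetical three-gene chain with a made-up precision matrix `theta`, that a zero off-diagonal entry of the precision matrix means no edge (conditional independence), even though the corresponding covariance entry is nonzero. The helper names `invert` and `partial_correlation` are assumptions introduced for this illustration.

```python
from fractions import Fraction
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with exact fractions."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Swap a row with a nonzero pivot into place, then normalize it.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlation(theta, i, j):
    """Partial correlation of variables i and j given all the others."""
    return -float(theta[i][j]) / math.sqrt(float(theta[i][i]) * float(theta[j][j]))


# Hypothetical precision matrix for a chain of three genes: g0 - g1 - g2.
theta = [
    [2, -1, 0],
    [-1, 2, -1],
    [0, -1, 2],
]

# The implied covariance matrix is the inverse of the precision matrix.
sigma = invert(theta)

# Edges of the graph are the nonzero off-diagonal entries of theta.
edges = [(i, j) for i in range(3) for j in range(i + 1, 3) if theta[i][j] != 0]

print("covariance between g0 and g2:", sigma[0][2])
print("partial correlation of g0 and g2:", partial_correlation(theta, 0, 2))
print("edges:", edges)
```

Here g0 and g2 are marginally correlated through g1, yet have no direct edge. With p genes there are p(p-1)/2 such off-diagonal entries to estimate, which is why sparsity and, in the article, prior information are used when the sample size is limited.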

Updated: 2021-02-02