Identifying subset of genes that have influential impacts on cancer progression: a new approach to analyze cancer microarray data.
Functional & Integrative Genomics ( IF 2.9 ) Pub Date : 2008-05-20 , DOI: 10.1007/s10142-008-0084-9
Mingyu Shi, Shuangge Ma

Cancer is a complex genetic disease that results from defects in multiple genes. Microarray techniques make it possible to survey the whole genome and detect genes that influence the progression of cancer. Statistical analysis of cancer microarray data is challenging because gene expressions are high-dimensional and clustered. Here, a cluster is a set of genes with coordinated pathological functions and/or correlated expressions. In this article, we consider cancer studies in which a censored survival endpoint is measured along with microarray gene expressions. We propose a hybrid clustering approach that constructs gene clusters from both pathological pathway information retrieved from KEGG and statistical correlations of gene expressions. Cancer survival time is modeled as a linear function of gene expressions. We adopt the clustering threshold gradient directed regularization (CTGDR) method for simultaneous gene cluster selection, within-cluster gene selection, and predictive model building. Analysis of two lymphoma studies shows that the proposed approach (hybrid gene clustering, a linear regression model for survival, and cluster-regularized estimation with CTGDR) can effectively identify gene clusters, and genes within the selected clusters, that have satisfactory predictive power for censored cancer survival outcomes.
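
The two-level selection described in the abstract (first clusters, then genes within the selected clusters) can be illustrated with a small threshold-gradient sketch. This is not the authors' CTGDR implementation: it assumes an uncensored outcome and a plain squared-error loss, it omits any handling of censoring, and the function names, the use of the Euclidean norm as the cluster-level score, and the parameters `tau_cluster`, `tau_gene`, `step`, and `n_steps` are all hypothetical choices made for illustration.

```python
import math


def negative_gradient(X, y, beta):
    """Negative gradient of 0.5 * sum_i (y_i - x_i . beta)^2 with respect to beta."""
    residuals = [
        y_i - sum(x_ij * b_j for x_ij, b_j in zip(row, beta))
        for row, y_i in zip(X, y)
    ]
    return [
        sum(row[j] * r for row, r in zip(X, residuals))
        for j in range(len(beta))
    ]


def fit_ctgdr_sketch(X, y, clusters, tau_cluster, tau_gene, step, n_steps):
    """Illustrative two-level threshold gradient descent (assumed form, not the paper's).

    X        : list of samples, each a list of gene expression values
    y        : list of (uncensored) outcomes; censoring is NOT handled here
    clusters : list of non-empty lists of gene indices (assumed to be given)
    """
    beta = [0.0] * len(X[0])
    for _ in range(n_steps):
        g = negative_gradient(X, y, beta)
        # Cluster-level score: Euclidean norm of the gradient within each cluster.
        norms = [math.sqrt(sum(g[j] ** 2 for j in c)) for c in clusters]
        largest = max(norms)
        if largest == 0.0:
            break  # gradient is zero everywhere; nothing left to update
        for cluster, norm in zip(clusters, norms):
            # Level 1: skip clusters whose score is below the cluster threshold.
            if norm < tau_cluster * largest:
                continue
            top = max(abs(g[j]) for j in cluster)
            for j in cluster:
                # Level 2: within a selected cluster, update only genes whose
                # gradient is close enough to the largest one in that cluster.
                if abs(g[j]) >= tau_gene * top:
                    beta[j] += step * g[j]
    return beta
```

In this sketch, thresholds close to 1 update only the dominant cluster and gene at each step, which keeps most coefficients at exactly zero, while thresholds close to 0 approach ordinary gradient descent. Coefficients that are never updated stay at zero, which is how selection and model fitting happen in one pass.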

Updated: 2019-11-01