Integrative analysis of multiple types of genomic data using an accelerated failure time frailty model
Computational Statistics (IF 1.0) Pub Date: 2021-02-03, DOI: 10.1007/s00180-020-01060-5
Shirong Deng, Jie Chen, Huidong Shi

As high-throughput technologies develop rapidly, multiple types of genomic data are becoming available within and across studies. Using all types of genomic data to infer genetic information on disease susceptibility has become a challenging task in modern statistical research. In this work, we propose an integrative analysis of multiple types of genomic data, clinical covariates, and survival data under an accelerated failure time frailty model. The proposed approach addresses part of this complex problem by identifying relevant genomic features and using them to infer patients’ survival times. It uses weighted least squares with a sparse group LASSO penalty as the objective function, so that relevant features are estimated and selected simultaneously. Extensive simulation studies assess the performance of the method with two types of genomic data, DNA methylation and copy number variation, on 600 genes and three clinical covariates. The simulation results suggest that the method is promising. The method is applied to The Cancer Genome Atlas data on glioblastoma, a lethal brain cancer, and yields biologically interpretable results.
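
For readers unfamiliar with the penalty named above, a generic weighted least-squares objective with a sparse group LASSO penalty can be sketched as follows. This is an illustration of the general technique, not the authors' exact formulation: the abstract does not give the formula, so every symbol here is an assumption, and the frailty (random-effect) term of the model is omitted for brevity.

```latex
% Generic sketch only; all symbols are assumptions, not taken from the paper.
% Y_i        : observed (possibly censored) survival time of subject i
% x_i        : covariate vector (genomic features and clinical covariates)
% w_i        : nonnegative weights that account for censoring
% \beta_{(g)}: coefficients in group g (e.g., all features of one gene), of size p_g
\hat{\beta}
  = \arg\min_{\beta}\;
    \frac{1}{2} \sum_{i=1}^{n} w_i \left( \log Y_i - x_i^{\top} \beta \right)^{2}
    + \lambda_1 \sum_{g=1}^{G} \sqrt{p_g}\, \lVert \beta_{(g)} \rVert_2
    + \lambda_2 \lVert \beta \rVert_1
```

In this generic form, the group term with tuning parameter $\lambda_1$ can set an entire group of coefficients to zero, while the $\ell_1$ term with $\lambda_2$ adds sparsity within the groups that remain, which is how estimation and feature selection happen in one step.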




Updated: 2021-02-03