ParticleMDI: particle Monte Carlo methods for the cluster analysis of multiple datasets with applications to cancer subtype identification
Advances in Data Analysis and Classification (IF 1.4), Pub Date: 2020-06-12, DOI: 10.1007/s11634-020-00401-y
Nathan Cunningham, Jim E. Griffin, David L. Wild

We present a novel nonparametric Bayesian approach for performing cluster analysis in a context where observational units have data arising from multiple sources. Our approach uses a particle Gibbs sampler for inference in which cluster allocations are jointly updated using a conditional particle filter within a Gibbs sampler, improving the mixing of the MCMC chain. We develop several approaches to improving the computational performance of our algorithm. These methods can achieve greater than an order-of-magnitude improvement in performance at no cost to accuracy and can be applied more broadly to Bayesian inference for mixture models with a single dataset. We apply our algorithm to the discovery of risk cohorts amongst 243 patients presenting with kidney renal clear cell carcinoma, using samples from the Cancer Genome Atlas, for which there are gene expression, copy number variation, DNA methylation, protein expression and microRNA data. We identify 4 distinct consensus subtypes and show they are prognostic for survival rate (\(p < 0.0001\)).
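
To make the idea of jointly updating cluster allocations with a conditional particle filter more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single one-dimensional dataset, a Chinese-restaurant-process prior with concentration `alpha`, and a Gaussian likelihood with known variance `sigma2` and a conjugate `N(0, tau2)` prior on each cluster mean. All function names, parameters and default values are hypothetical choices made for illustration; the multi-dataset coupling and the computational speed-ups mentioned in the abstract are not reproduced, and the effective-sample-size resampling rule is simply one common heuristic.

```python
import numpy as np

def log_predictive(x, n, s, sigma2=1.0, tau2=10.0):
    """Log posterior-predictive density of x for clusters with counts n and
    sums s, under x ~ N(mu, sigma2) and mu ~ N(0, tau2) (assumed model)."""
    post_var = 1.0 / (1.0 / tau2 + n / sigma2)
    post_mean = post_var * s / sigma2
    var = post_var + sigma2
    return -0.5 * (np.log(2 * np.pi * var) + (x - post_mean) ** 2 / var)

def relabel(c):
    """Relabel allocations in order of first appearance (0, 1, 2, ...)."""
    mapping = {}
    return np.array([mapping.setdefault(int(k), len(mapping)) for k in c])

def conditional_particle_filter(x, c_ref, n_particles=20, alpha=1.0, rng=None):
    """One conditional-SMC update of all allocations given a reference c_ref."""
    rng = np.random.default_rng() if rng is None else rng
    N, M = len(x), n_particles
    c_ref = relabel(c_ref)
    alloc = np.zeros((M, N), dtype=int)   # partial allocations per particle
    counts = np.zeros((M, N))             # cluster sizes (at most N clusters)
    sums = np.zeros((M, N))               # per-cluster sums of observations
    n_clusters = np.zeros(M, dtype=int)
    logw = np.zeros(M)
    for i in range(N):
        w = np.exp(logw - logw.max())
        w /= w.sum()
        # Resample when the effective sample size is low. Particle 0 always
        # keeps the reference path; the other M - 1 are drawn from all M.
        if i > 0 and 1.0 / np.sum(w ** 2) < M / 2:
            idx = np.concatenate(([0], rng.choice(M, size=M - 1, p=w)))
            alloc, counts, sums = alloc[idx], counts[idx], sums[idx]
            n_clusters = n_clusters[idx]
            logw = np.zeros(M)
        for m in range(M):
            K = n_clusters[m]
            n = np.append(counts[m, :K], 0.0)        # last entry: new cluster
            s = np.append(sums[m, :K], 0.0)
            prior = np.append(counts[m, :K], alpha)  # unnormalised CRP weights
            logp = np.log(prior) + log_predictive(x[i], n, s)
            shift = logp.max()
            p = np.exp(logp - shift)
            # Proposing from the normalised p makes the incremental weight
            # equal to the normaliser, whichever cluster is chosen.
            logw[m] += shift + np.log(p.sum())
            k = c_ref[i] if m == 0 else rng.choice(K + 1, p=p / p.sum())
            alloc[m, i] = k
            counts[m, k] += 1
            sums[m, k] += x[i]
            n_clusters[m] = max(K, k + 1)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    return alloc[rng.choice(M, p=w)]

# Toy usage: repeated sweeps on two well-separated synthetic groups.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-5, 1, 15), rng.normal(5, 1, 15)])
c = np.zeros(len(x), dtype=int)
for _ in range(5):
    c = conditional_particle_filter(x, c, rng=rng)
```

In a full particle Gibbs sampler, each such sweep would alternate with updates of the remaining model parameters; the sketch holds them fixed so that only the joint allocation update is shown.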

Updated: 2020-06-12