Integrative subspace clustering by common and specific decomposition for applications on cancer subtype identification.
BMC Medical Genomics (IF 2.1), Pub Date: 2019-12-24, DOI: 10.1186/s12920-019-0633-1
Yin Guo 1 , Huiran Li 1 , Menglan Cai 1 , Limin Li 1

BACKGROUND: Recent high-throughput technologies have been applied to collect heterogeneous biomedical omics datasets. Computational analysis of multi-omics datasets could potentially reveal deep insights into a given disease. Most existing clustering methods for multi-omics data assume strong consistency among the different data sources, and thus may lose efficacy when the consistency is relatively weak. Furthermore, they cannot identify the conflicting parts of each view, which might be important in applications such as cancer subtype identification.

METHODS: In this work, we propose an integrative subspace clustering method (ISC) based on common and specific decomposition to identify clustering structures in multi-omics datasets. The main idea of ISC is that the original representations of the samples in each view can be reconstructed by concatenating a common part and a view-specific part that lie in orthogonal subspaces. The problem can be formulated as a matrix decomposition problem and solved efficiently by our proposed algorithm.

RESULTS: Experiments on simulated and text datasets show that our method outperforms other state-of-the-art methods. Our method is further evaluated by identifying cancer types using a colorectal dataset. Finally, we apply our method to cancer subtype identification for five cancers using TCGA datasets, and survival analysis shows that the subtypes we found are significantly better than those found by the compared methods.

CONCLUSION: We conclude that our ISC model can not only discover the weak common information across views but also identify the view-specific information.
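The decomposition idea in the METHODS paragraph can be illustrated with a small sketch. This is not the ISC algorithm from the paper, whose objective and solver are not given in the abstract. It is a simple SVD-based stand-in that shows what a common part and view-specific parts in mutually orthogonal subspaces can look like. The function names, the rank parameters `k_common` and `k_specific`, and the use of a truncated SVD are all assumptions made for illustration.

```python
import numpy as np


def common_specific_decomposition(views, k_common, k_specific):
    """Illustrative stand-in, NOT the paper's ISC solver.

    views: list of (n_samples, d_v) arrays whose rows are the same samples.
    Returns an orthonormal common basis and one orthonormal specific basis
    per view, each orthogonal to the common basis.
    """
    # Common part: leading left singular vectors of all views side by side.
    stacked = np.hstack(views)
    u, _, _ = np.linalg.svd(stacked, full_matrices=False)
    common = u[:, :k_common]

    specifics = []
    for x in views:
        # Remove what the common basis explains; the residual lies in the
        # orthogonal complement of the common subspace.
        residual = x - common @ (common.T @ x)
        ur, _, _ = np.linalg.svd(residual, full_matrices=False)
        specifics.append(ur[:, :k_specific])
    return common, specifics


def reconstruct(x, common, specific):
    """Project a view onto the concatenation [common, specific]."""
    basis = np.hstack([common, specific])
    return basis @ (basis.T @ x)
```

In this sketch, the rows of `common` give one shared low-dimensional representation per sample, which could then be passed to any standard clustering routine. The paper's method instead learns the two parts jointly through its own matrix decomposition.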

Updated: 2019-12-25