A novel deep mining model for effective knowledge discovery from omics data.
Artificial Intelligence in Medicine (IF 7.5), Pub Date: 2020-02-24, DOI: 10.1016/j.artmed.2020.101821
Abeer Alzubaidi 1, Jonathan Tepper 2, Ahmad Lotfi 1

Knowledge discovery from omics data has become a common goal of current approaches to personalised cancer medicine and to understanding cancer genotype and phenotype. However, high-throughput biomedical datasets are characterised by high dimensionality, relatively small sample sizes and low signal-to-noise ratios. Extracting and interpreting relevant knowledge from such complex datasets therefore remains a significant challenge for the fields of machine learning and data mining. In this paper, we exploit recent advances in deep learning to mitigate these limitations by automatically capturing enough of the meaningful abstractions latent in the available biological samples. Our proposed deep feature learning model is based on a set of non-linear sparse Auto-Encoders that are deliberately constructed in an under-complete manner to detect a small proportion of molecules that can recover a large proportion of the variation underlying the data. However, since multiple projections are applied to the input signals, it is hard to interpret which phenotypes were responsible for deriving such predictions. Therefore, we also introduce a novel weight interpretation technique that helps to deconstruct the internal state of such deep learning models to reveal the key determinants underlying their latent representations. The outcomes of our experiment provide strong evidence that the proposed deep mining model is able to discover robust biomarkers that are positively and negatively associated with cancers of interest. Since our deep mining model is problem-independent and data-driven, it provides further potential for this research to extend beyond its cognate disciplines.
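
The abstract does not spell out how the weight interpretation technique works, so the following is only a minimal sketch of one simple heuristic in the same spirit, not the authors' method. It assumes the encoder weights are available as plain matrices, ignores the non-linear activations (a linearisation that is an assumption here), and chains the layer projections to score how strongly each input molecule feeds a chosen latent unit. The names `matmul`, `rank_features`, the toy weights and the gene labels are all hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank_features(weights, names, latent_unit=0):
    """Rank input features by their linearised influence on one latent unit.

    weights: encoder weight matrices, each shaped (inputs x outputs),
             ordered from the input layer to the bottleneck.
    names:   one label per input feature.
    Assumption: activations are ignored, so this is only a rough proxy.
    """
    combined = weights[0]
    for w in weights[1:]:
        combined = matmul(combined, w)  # chain the projections layer by layer
    scores = [(name, row[latent_unit]) for name, row in zip(names, combined)]
    # Largest magnitude first; the sign hints at positive or negative association.
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy encoder: 4 molecules -> 2 hidden units -> 1 latent unit.
w1 = [[0.9, 0.0], [0.1, 0.2], [0.0, -0.8], [0.05, 0.05]]
w2 = [[1.0], [1.0]]
genes = ["gene_a", "gene_b", "gene_c", "gene_d"]
ranking = rank_features([w1, w2], genes)

for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

In this toy setting the features with the largest absolute scores would be the candidate "key determinants", and the sign separates positive from negative association. A faithful reproduction would need the procedure described in the full paper.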




Updated: 2020-02-24