PIMKL: Pathway-Induced Multiple Kernel Learning.
npj Systems Biology and Applications (IF 3.5), Pub Date: 2019-03-05, DOI: 10.1038/s41540-019-0086-3
Matteo Manica 1,2, Joris Cadow 1,2, Roland Mathis 1, María Rodríguez Martínez 1

Reliable identification of molecular biomarkers is essential for accurate patient stratification. While state-of-the-art machine learning approaches for sample classification continue to push boundaries in terms of performance, most of these methods cannot integrate different data types and generalize poorly, which limits their application in a clinical setting. Furthermore, many methods behave as black boxes, and we have very little understanding of the mechanisms that lead to a prediction. While opacity in machine behavior might not be a problem in deterministic domains, in health care, explaining the molecular factors and phenotypes that drive the classification is crucial to building trust in the predictive system. We propose Pathway-Induced Multiple Kernel Learning (PIMKL), a methodology that reliably classifies samples and can also help gain insight into the molecular mechanisms that underlie the classification. PIMKL exploits prior knowledge in the form of a molecular interaction network and annotated gene sets by optimizing a mixture of pathway-induced kernels with a Multiple Kernel Learning (MKL) algorithm, an approach that has demonstrated excellent performance in different machine learning applications. After optimizing the combination of kernels to predict a specific phenotype, the model provides a stable molecular signature that can be interpreted in light of the ingested prior knowledge and used in transfer learning tasks.
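
The idea of mixing pathway-induced kernels can be illustrated with a minimal sketch. The abstract does not specify the form of the kernels or the MKL solver, so everything concrete here is an assumption: each pathway kernel is taken to be the quadratic form of the normalized graph Laplacian of the pathway's subnetwork, and the kernel weights come from a simple kernel-target alignment heuristic that stands in for a real MKL algorithm. The toy network, the gene sets `pathway_a` and `pathway_b`, and all function names are hypothetical.

```python
import numpy as np


def normalized_laplacian(adjacency):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    degree = adjacency.sum(axis=1)
    inv_sqrt = np.zeros_like(degree, dtype=float)
    nonzero = degree > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degree[nonzero])
    return np.eye(len(degree)) - inv_sqrt[:, None] * adjacency * inv_sqrt[None, :]


def pathway_kernel(X, adjacency, gene_idx):
    """Assumed kernel form: k(x, y) = x_P^T L_P y_P on the pathway subnetwork."""
    sub = adjacency[np.ix_(gene_idx, gene_idx)]
    X_p = X[:, gene_idx]
    return X_p @ normalized_laplacian(sub) @ X_p.T


def alignment_weights(kernels, y):
    """Kernel-target alignment heuristic (a stand-in, not an MKL solver)."""
    target = np.outer(y, y)
    scores = np.array([
        np.sum(K * target) / (np.linalg.norm(K) * np.linalg.norm(target) + 1e-12)
        for K in kernels
    ])
    scores = np.clip(scores, 0.0, None)
    if scores.sum() == 0.0:
        return np.full(len(kernels), 1.0 / len(kernels))
    return scores / scores.sum()


# Toy data: 6 genes on a chain network, two hypothetical gene sets.
rng = np.random.default_rng(0)
n_samples, n_genes = 40, 6
A = np.zeros((n_genes, n_genes))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:
    A[i, j] = A[j, i] = 1.0
pathways = {"pathway_a": [0, 1, 2], "pathway_b": [3, 4, 5]}

y = np.repeat([-1.0, 1.0], n_samples // 2)
X = rng.normal(size=(n_samples, n_genes))
X[:, 0] += 2.0 * y  # class signal placed only on pathway_a genes
X[:, 1] -= 2.0 * y

kernels = [pathway_kernel(X, A, idx) for idx in pathways.values()]
weights = alignment_weights(kernels, y)
K_combined = sum(w * K for w, K in zip(weights, kernels))

for name, w in zip(pathways, weights):
    print(f"{name}: weight {w:.3f}")
```

In this sketch the per-pathway weights play the role of the molecular signature: a pathway whose kernel aligns with the labels receives a larger weight. The combined matrix `K_combined` could then be passed to any classifier that accepts a precomputed kernel.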

Updated: 2019-03-05