Interpretation of machine learning models using Shapley values: application to compound potency and multi-target activity predictions



Journal of Computer-Aided Molecular Design (IF 3.0), Pub Date: 2020-05-02, DOI: 10.1007/s10822-020-00314-0
Raquel Rodríguez-Pérez, Jürgen Bajorath


Difficulties in interpreting machine learning (ML) models and their predictions limit the practical applicability of, and confidence in, ML in pharmaceutical research. There is a need for model-agnostic approaches that aid in the interpretation of ML models regardless of their complexity and that are also applicable to deep neural network (DNN) architectures and model ensembles. To these ends, the SHapley Additive exPlanations (SHAP) methodology has recently been introduced. The SHAP approach enables the identification and prioritization of features that determine compound classification and activity prediction using any ML model. Herein, we further extend the evaluation of the SHAP methodology by investigating a variant for the exact calculation of Shapley values for decision tree methods and systematically comparing this variant with the model-independent SHAP method in compound activity and potency value predictions. Moreover, new applications of the SHAP analysis approach are presented, including the interpretation of DNN models for the generation of multi-target activity profiles and of ensemble regression models for potency prediction.
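To make the idea of Shapley-value feature attribution concrete, the following is a minimal sketch that computes exact Shapley values for a single prediction by enumerating all feature coalitions. It is not the authors' code and does not use the SHAP library. The function `shapley_values`, the `toy_model`, and the three binary "fingerprint bit" features are hypothetical, and replacing absent features with a fixed baseline value is a simplifying assumption of this sketch (SHAP itself works with expectations over background data).

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance via brute-force enumeration.

    Assumption of this sketch: a feature that is absent from a coalition
    is replaced by its baseline value. Cost grows as 2^n in the number of
    features, so this is only feasible for very small n.
    """
    n = len(x)

    def value(coalition):
        # Build an input where only the coalition's features take their real values.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical model: one additive feature plus an interaction of two others.
    return 2.0 * z[0] + 1.0 * z[1] * z[2]


x = [1, 1, 1]          # instance to explain (all three bits set)
baseline = [0, 0, 0]   # reference input (no bits set)
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

In this toy case the interaction term is shared equally between the two interacting features, and the attributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that SHAP relies on. The exponential cost of this enumeration is the reason approximate model-independent estimators and tree-specific exact algorithms, such as those compared in the abstract, are used in practice.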


Updated: 2020-05-02
