Approximating XGBoost with an interpretable decision tree
Information Sciences (IF 8.1), Pub Date: 2021-05-27, DOI: 10.1016/j.ins.2021.05.055
Omer Sagi, Lior Rokach

The growing use of machine-learning models in critical domains has underscored the need for interpretable models. In areas such as healthcare and finance, the model consumer must understand the rationale behind a model's output in order to use it when making a decision. For this reason, black-box models cannot be used in these scenarios, regardless of their high predictive performance. Decision forests, and in particular Gradient Boosting Decision Trees (GBDT), are examples of such black-box models. GBDT models are considered state of the art in many classification challenges, as reflected by the fact that the majority of Kaggle's recent winners used a GBDT method (such as XGBoost) as part of their solution. Despite their superior predictive performance, however, they cannot be used in tasks that require transparency. This paper presents a novel method for transforming a decision forest of any kind into an interpretable decision tree. The method extends the tool set available to machine-learning practitioners who want to exploit the interpretability of decision trees without significantly impairing the predictive performance gained by GBDT models like XGBoost. We show in an empirical evaluation that in some cases the generated tree is able to approximate the predictive performance of an XGBoost model while enabling better transparency of the outputs.
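The abstract does not describe the transformation itself, so the following is only a toy illustration of the general idea that an additive forest can be collapsed into one tree-like structure. It is not the authors' algorithm. It assumes a hypothetical forest of one-feature decision stumps (`STUMPS`, `forest_score`, `flatten`, and `tree_score` are all names invented here) and shows that their summed output is a piecewise-constant function that a single sorted set of thresholds can reproduce exactly.

```python
from bisect import bisect_right

# Hypothetical toy forest (an assumption, not taken from the paper).
# Each stump is (threshold, left_value, right_value) on a single feature x:
# it contributes left_value when x < threshold, otherwise right_value.
STUMPS = [(2.0, -1.0, 1.0), (5.0, 0.5, -0.5), (3.5, -0.25, 0.75)]


def forest_score(x, stumps=STUMPS):
    """Additive ensemble output: the sum of every stump's contribution."""
    return sum(left if x < t else right for t, left, right in stumps)


def flatten(stumps=STUMPS):
    """Collapse the stumps into sorted cut points and one score per interval."""
    cuts = sorted({t for t, _, _ in stumps})
    # One representative point inside each interval between consecutive cuts,
    # plus one below the smallest cut and one above the largest.
    reps = (
        [cuts[0] - 1.0]
        + [(a + b) / 2 for a, b in zip(cuts, cuts[1:])]
        + [cuts[-1] + 1.0]
    )
    leaves = [forest_score(r, stumps) for r in reps]
    return cuts, leaves


def tree_score(x, cuts, leaves):
    """Single-structure prediction: binary search over the sorted cuts.

    bisect_right counts the cuts that are <= x, which matches the stump
    convention that x >= threshold goes to the right branch.
    """
    return leaves[bisect_right(cuts, x)]


if __name__ == "__main__":
    cuts, leaves = flatten()
    bounds = [float("-inf")] + cuts + [float("inf")]
    for lo, hi, value in zip(bounds, bounds[1:], leaves):
        print(f"{lo} <= x < {hi}: score {value}")
```

With a single feature the flattened structure has only one more leaf than there are distinct thresholds. With many features the number of regions can grow combinatorially, which is presumably one reason a practical method has to approximate the forest rather than reproduce it exactly.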




Updated: 2021-06-08