Simpler is better: Lifting interpretability-performance trade-off via automated feature engineering
Decision Support Systems (IF 7.5), Pub Date: 2021-03-26, DOI: 10.1016/j.dss.2021.113556
Alicja Gosiewska, Anna Kozak, Przemyslaw Biecek

Machine learning has proved capable of generating useful predictive models that can and should support decision makers in many areas. The availability of AutoML tools makes it possible to quickly create an effective but complex predictive model. However, the complexity of such models is often a major obstacle in applications, especially for high-stakes decisions. There is a growing number of examples where the use of black boxes leads to decisions that are harmful, unfair, or simply wrong. In this paper, we show that complex models can very often be simplified without compromising their performance, while gaining much-needed transparency.

We propose a framework that uses elastic black boxes as supervisor models to create simpler, less opaque, yet still accurate and interpretable glass box models. The new models are built on newly engineered features extracted with the help of a supervisor model. We support the analysis with a large-scale benchmark on several tabular data sets from the OpenML database. There are three main results of this paper: 1) we show that extracting information from complex models may improve the performance of simpler models, 2) we question a common myth that complex predictive models outperform simpler predictive models, 3) we present a real-life application of the proposed method.
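
The idea of letting a supervisor model guide feature engineering for a glass box model can be illustrated with a small sketch. The abstract does not specify how features are extracted, so everything below is an assumption made for illustration: the `supervisor` function is a hypothetical stand-in for a trained black box, the data are synthetic, and the extraction step (a partial dependence profile followed by a crude largest-jump changepoint) is only one plausible choice. The glass box here is a one-variable least-squares line, fitted once on the raw feature and once on the supervisor-derived binary feature.

```python
import random


def supervisor(x1, x2):
    # Hypothetical stand-in for a trained black-box model (assumption):
    # a step in x1 plus a mild linear effect of x2.
    return (2.0 if x1 > 0.6 else 0.0) + 0.5 * x2


def partial_dependence(model, rows, grid):
    # Average prediction over the data with x1 forced to each grid value.
    return [sum(model(g, x2) for _, x2 in rows) / len(rows) for g in grid]


def largest_jump_threshold(grid, profile):
    # Crude changepoint: midpoint of the grid interval with the largest change.
    jumps = [abs(profile[i + 1] - profile[i]) for i in range(len(grid) - 1)]
    i = jumps.index(max(jumps))
    return (grid[i] + grid[i + 1]) / 2


def fit_line(xs, ys):
    # Closed-form least squares for y = slope * x + intercept.
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(xs, ys, slope, intercept):
    # In-sample mean squared error of the fitted line.
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data (assumption): targets follow the supervisor plus noise.
random.seed(0)
rows = [(random.random(), random.random()) for _ in range(200)]
y = [supervisor(x1, x2) + random.gauss(0, 0.1) for x1, x2 in rows]

# Step 1: ask the supervisor how its prediction depends on x1.
grid = [i / 100 for i in range(101)]
profile = partial_dependence(supervisor, rows, grid)

# Step 2: turn that profile into a new, interpretable binary feature.
threshold = largest_jump_threshold(grid, profile)
raw = [x1 for x1, _ in rows]
binned = [1.0 if x1 > threshold else 0.0 for x1 in raw]

# Step 3: fit the same simple model on the raw and the engineered feature.
mse_raw = mse(raw, y, *fit_line(raw, y))
mse_binned = mse(binned, y, *fit_line(binned, y))
print(f"threshold={threshold:.3f} mse_raw={mse_raw:.3f} mse_binned={mse_binned:.3f}")
```

In this toy setting the simple model stays a single readable rule (is `x1` above the threshold or not) while borrowing the nonlinearity that the supervisor captured. The errors are computed in-sample, so the comparison is indicative only and says nothing about the benchmark results reported in the paper.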




Updated: 2021-03-26