Can metafeatures help improve explanations of prediction models when using behavioral and textual data?
Machine Learning (IF 4.3), Pub Date: 2021-06-08, DOI: 10.1007/s10994-021-05981-0
Yanou Ramon, David Martens, Theodoros Evgeniou, Stiene Praet

Machine learning models built on behavioral and textual data can result in highly accurate prediction models, but are often very difficult to interpret. Linear models require investigating thousands of coefficients, while the opaqueness of nonlinear models makes things worse. Rule-extraction techniques have been proposed to combine the desired predictive accuracy of complex “black-box” models with global explainability. However, rule-extraction in the context of high-dimensional, sparse data, where many features are relevant to the predictions, can be challenging, as replacing the black-box model by many rules leaves the user again with an incomprehensible explanation. To address this problem, we develop and test a rule-extraction methodology based on higher-level, less-sparse “metafeatures”. We empirically validate the quality of the explanation rules in terms of fidelity, stability, and accuracy over a collection of data sets, and benchmark their performance against rules extracted using the fine-grained behavioral and textual features. A key finding of our analysis is that metafeatures-based explanations are better at mimicking the behavior of the black-box prediction model, as measured by the fidelity of explanations.
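As a rough illustration of the two ideas named in the abstract, the sketch below maps sparse fine-grained features to coarser metafeatures and then scores a single explanation rule by its fidelity, taken here to be the share of instances on which the rule agrees with the black-box prediction. Everything in it is an assumption made for illustration: the feature names, the grouping, the black-box outputs, and the rule are hypothetical, and the paper's exact procedure and metric definitions may differ.

```python
from typing import Dict, List, Set


def to_metafeatures(active: Set[str], groups: Dict[str, str]) -> Set[str]:
    """Map the fine-grained features active for one instance to their metafeatures."""
    return {groups[f] for f in active if f in groups}


def fidelity(black_box_preds: List[int], rule_preds: List[int]) -> float:
    """Fraction of instances on which the rules reproduce the black-box prediction."""
    if len(black_box_preds) != len(rule_preds):
        raise ValueError("prediction lists must have the same length")
    if not black_box_preds:
        raise ValueError("at least one instance is required")
    agree = sum(b == r for b, r in zip(black_box_preds, rule_preds))
    return agree / len(black_box_preds)


# Hypothetical grouping of fine-grained behavioral features into metafeatures.
groups = {
    "page_a": "sports",
    "page_b": "sports",
    "page_c": "cooking",
    "page_d": "cooking",
}

# Hypothetical instances: each is the sparse set of features that are active.
instances = [{"page_a"}, {"page_b", "page_c"}, {"page_d"}, set()]

# Hypothetical black-box outputs for those instances (not from a real model).
black_box_preds = [1, 1, 0, 1]

# One metafeature-level rule: predict 1 if the instance has any "sports" feature.
rule_preds = [
    1 if "sports" in to_metafeatures(x, groups) else 0 for x in instances
]

print(fidelity(black_box_preds, rule_preds))
```

The rule here is written over a single metafeature rather than over the individual pages, which is the kind of compression the abstract argues keeps explanations readable when many fine-grained features matter.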




Updated: 2021-06-09