Learning credible DNNs via incorporating prior knowledge and model local explanation
Knowledge and Information Systems (IF 2.7), Pub Date: 2020-10-21, DOI: 10.1007/s10115-020-01517-5
Mengnan Du, Ninghao Liu, Fan Yang, Xia Hu

Recent studies have shown that state-of-the-art DNNs are not always credible, despite their impressive performance on the hold-out test sets of a variety of tasks. These models tend to exploit dataset shortcuts to make predictions rather than learn the underlying task. This lack of credibility can lead to poor generalization, adversarial vulnerability, and algorithmic discrimination in DNN models. In this paper, we propose CREX to develop more credible DNNs. The high-level idea of CREX is to encourage DNN models to focus on the evidence that actually matters for the task at hand and to avoid overfitting to data-dependent shortcuts. Specifically, during DNN training, CREX directly regularizes the local explanation with expert rationales, i.e., a subset of features highlighted by domain experts as justifications for predictions, to enforce alignment between local explanations and rationales. Even when rationales are not available, CREX can still be useful by requiring the generated explanations to be sparse. In addition, CREX is widely applicable to different network architectures, including CNNs, LSTMs, and attention models. Experimental results on several text classification datasets demonstrate that CREX can increase the credibility of DNNs. Comprehensive analysis further shows three meaningful improvements from CREX: (1) it significantly increases DNN accuracy on new and previously unseen data beyond the test set, (2) it enhances the fairness of DNNs as measured by the equality of opportunity metric and reduces the models' discrimination against certain demographic groups, and (3) it improves the robustness of DNN models against adversarial attacks. These experimental results highlight the advantages of the increased credibility provided by CREX.
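
To make the regularization idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear logistic model whose local explanation is the per-feature contribution w_i * x_i; the abstract does not specify the explanation method, so this choice is a stand-in. The function names, the binary `rationale_mask`, and the weight `lam` are all hypothetical. With a rationale, the penalty is the attribution mass that falls outside the expert-highlighted features; without one, it falls back to an L1 sparsity penalty on the explanation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def attributions(weights, x):
    # Assumed local explanation: contribution w_i * x_i of each feature
    # to the logit of a linear model (a stand-in, not the paper's method).
    return [w * xi for w, xi in zip(weights, x)]


def explanation_penalty(attr, rationale_mask=None):
    # No rationale available: penalize total attribution mass (L1 sparsity).
    if rationale_mask is None:
        return sum(abs(a) for a in attr)
    # Rationale available: penalize only attribution on features the
    # expert did NOT highlight (mask value 0).
    return sum(abs(a) for a, m in zip(attr, rationale_mask) if m == 0)


def regularized_loss(weights, bias, x, y, rationale_mask=None, lam=0.1):
    # Binary cross-entropy plus lam times the explanation penalty.
    attr = attributions(weights, x)
    p = sigmoid(sum(attr) + bias)
    cross_entropy = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return cross_entropy + lam * explanation_penalty(attr, rationale_mask)


# Hypothetical example: three features, the expert highlights only the first.
weights = [2.0, -1.0, 0.5]
x = [1.0, 1.0, 1.0]
mask = [1, 0, 0]

loss_with_rationale = regularized_loss(weights, 0.0, x, 1, rationale_mask=mask)
loss_sparse_only = regularized_loss(weights, 0.0, x, 1)
```

In this toy setting the rationale-based penalty ignores attribution on the highlighted feature, so it is smaller than the sparsity-only penalty for the same weights. In a real DNN the attributions would come from an explanation method applied to the network, and the combined loss would be minimized by gradient descent.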




Updated: 2020-10-21