Enhancing Model Robustness By Incorporating Adversarial Knowledge Into Semantic Representation
arXiv - CS - Computation and Language. Pub Date: 2021-02-23, DOI: arxiv-2102.11584
Jinfeng Li, Tianyu Du, Xiangyu Liu, Rong Zhang, Hui Xue, Shouling Ji

Although deep neural networks (DNNs) have achieved enormous success in many domains such as natural language processing (NLP), they have also been proven vulnerable to maliciously generated adversarial examples. This inherent vulnerability threatens various real-world DNN-based applications. To strengthen model robustness, several countermeasures have been proposed in the English NLP domain and have obtained satisfactory performance. However, due to the unique language properties of Chinese, it is not trivial to extend existing defenses to the Chinese domain. Therefore, we propose AdvGraph, a novel defense that enhances the robustness of Chinese NLP models by incorporating adversarial knowledge into the semantic representation of the input. Extensive experiments on two real-world tasks show that AdvGraph outperforms previous work: (i) effective - it significantly strengthens model robustness, even under the adaptive attack setting, without any negative impact on model performance over legitimate input; (ii) generic - its key component, the representation of connotative adversarial knowledge, is task-agnostic and can be reused in any Chinese NLP model without retraining; and (iii) efficient - it is a lightweight defense with sub-linear computational complexity, which guarantees the efficiency required in practical scenarios.
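The abstract does not spell out how the adversarial knowledge is represented or fused with the input, so the following is only a minimal illustrative sketch, not the authors' implementation. It assumes the knowledge takes the form of a graph linking visually or phonetically confusable Chinese characters, whose neighbourhood-averaged vectors are concatenated with an ordinary semantic embedding; the `adv_graph` contents, `EMB_DIM`, and fusion-by-concatenation are all hypothetical choices made for this demo.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM = 8  # hypothetical embedding width, chosen only for the demo

# Toy adversarial-knowledge graph: each character is linked to variants an
# attacker might substitute for it (visually or phonetically similar forms).
adv_graph = {
    "账": ["帐", "賬"],
    "骗": ["騙", "片"],
    "微": ["徽", "薇"],
}

vocab = sorted({c for k, vs in adv_graph.items() for c in [k, *vs]})
base = {c: rng.normal(size=EMB_DIM) for c in vocab}      # per-character vectors for the knowledge subspace
semantic = {c: rng.normal(size=EMB_DIM) for c in vocab}  # stand-in for a task/semantic embedding table

# Symmetrise the graph: if an attacker can swap a -> b, treat b -> a as confusable too.
neighbours = {c: set() for c in vocab}
for c, variants in adv_graph.items():
    for v in variants:
        neighbours[c].add(v)
        neighbours[v].add(c)

def adversarial_embedding(char: str) -> np.ndarray:
    """Average a character's vector with those of its confusable neighbours,
    so adversarial variants land close together in this subspace."""
    vecs = [base[char]] + [base[n] for n in neighbours.get(char, ())]
    return np.mean(vecs, axis=0)

def fused_representation(char: str) -> np.ndarray:
    """Concatenate the ordinary semantic embedding with the adversarial-knowledge
    embedding; a downstream classifier would consume this fused vector."""
    return np.concatenate([semantic[char], adversarial_embedding(char)])

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The confusable pair 账/帐 tends to be far closer in the adversarial-knowledge
# subspace than under the independently drawn semantic embeddings.
print("adv-subspace similarity  账/帐:", round(cosine(adversarial_embedding("账"), adversarial_embedding("帐")), 3))
print("semantic-only similarity 账/帐:", round(cosine(semantic["账"], semantic["帐"]), 3))
print("fused vector size:", fused_representation("账").shape)
```

Because the knowledge subspace depends only on the confusion graph and not on any task labels, the same adversarial-knowledge vectors could in principle be reused across tasks, which is the spirit of the "generic" claim above.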

Updated: 2021-02-24