Channel-wise Gated Res2Net: Towards Robust Detection of Synthetic Speech Attacks,arXiv - CS - Sound


arXiv - CS - Sound, Pub Date: 2021-07-19, DOI: arxiv-2107.08803
Xu Li, Xixin Wu, Hui Lu, Xunying Liu, Helen Meng

Existing approaches to anti-spoofing in automatic speaker verification (ASV) still generalize poorly to unseen attacks. The Res2Net approach designs a residual-like connection between feature groups within one block, which increases the possible receptive fields and improves the system's detection generalizability. However, this residual-like connection is a direct addition between feature groups, with no channel-wise priority. We argue that information across channels may not contribute equally to spoofing cues, and that the less relevant channels should be suppressed before they are added onto the next feature group, so that the system generalizes better to unseen attacks. This argument motivates the current work, which presents a novel channel-wise gated Res2Net (CG-Res2Net) that modifies Res2Net to enable a channel-wise gating mechanism in the connection between feature groups. The gating mechanism dynamically selects channel-wise features based on the input, suppressing the less relevant channels and enhancing detection generalizability. Three gating mechanisms with different structures are proposed and integrated into Res2Net. Experiments on the ASVspoof 2019 logical access (LA) corpus show that the proposed CG-Res2Net significantly outperforms Res2Net on both the overall LA evaluation set and individual difficult unseen attacks. It also outperforms other state-of-the-art single systems, demonstrating the effectiveness of the method.
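
The gated connection described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify any of the three gate structures, so the gate used here (a sigmoid applied to an affine function of each channel's mean) is an assumption, as are the names `channel_gate`, `gated_res2net_block`, `gate_params` and `transform`. The per-group convolution is replaced by an identity placeholder, and the gated addition is applied to every group after the first for simplicity.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_gate(group, weights, biases):
    """Return one gate value in (0, 1) per channel.

    Assumed gate form: sigmoid of an affine map of each channel's mean.
    """
    means = [sum(ch) / len(ch) for ch in group]
    return [sigmoid(w * m + b) for w, m, b in zip(weights, means, biases)]


def gated_res2net_block(groups, gate_params, transform=lambda g: g):
    """Chain feature groups with a channel-wise gated residual-like connection.

    groups:      list of feature groups; each group is a list of channels,
                 each channel a list of floats.
    gate_params: one (weights, biases) pair per connection between groups.
    transform:   stand-in for the per-group convolution (identity by default).
    """
    outputs = [groups[0]]  # the first group passes through unchanged
    for i in range(1, len(groups)):
        prev = outputs[-1]
        weights, biases = gate_params[i - 1]
        gate = channel_gate(prev, weights, biases)
        # Scale each channel of the previous output by its gate value,
        # then add it onto the matching channel of the current group.
        mixed = [
            [x + a * p for x, p in zip(cur_ch, prev_ch)]
            for cur_ch, prev_ch, a in zip(groups[i], prev, gate)
        ]
        outputs.append(transform(mixed))
    return outputs


if __name__ == "__main__":
    groups = [
        [[1.0, 1.0], [2.0, 2.0]],  # group 1: two channels, two frames each
        [[0.0, 0.0], [0.0, 0.0]],  # group 2
    ]
    # Zero weights and biases give a gate of 0.5 on every channel.
    neutral = [([0.0, 0.0], [0.0, 0.0])]
    print(gated_res2net_block(groups, neutral)[1])  # [[0.5, 0.5], [1.0, 1.0]]
```

With a strongly negative bias on one channel, its gate approaches zero and that channel contributes almost nothing to the next group, which is the suppression behaviour the abstract argues for. Plain Res2Net corresponds to fixing every gate value at 1.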


Updated: 2021-07-20
