Replay spoofing countermeasure using autoencoder and siamese networks on ASVspoof 2019 challenge, Computer Speech & Language



Replay spoofing countermeasure using autoencoder and siamese networks on ASVspoof 2019 challenge
Computer Speech & Language (IF 4.3), Pub Date: 2020-05-15, DOI: 10.1016/j.csl.2020.101105
Mohammad Adiban, Hossein Sameti, Saeedreza Shehnepoor

Automatic Speaker Verification (ASV) authenticates individuals by analyzing their speech signals. Various spoofing approaches can deceive ASV systems (ASVs), either by imitating a voice or by reconstructing its features. Attackers target ASVs using four general techniques: impersonation, speech synthesis, voice conversion, and replay. Replay is considered a common and high-potential spoofing tool because replay attacks are easily accessible and require no technical expertise from the adversary. In this study, we introduce a novel replay spoofing countermeasure for ASVs. We feed Constant Q Cepstral Coefficient (CQCC) features into an autoencoder to obtain more informative features and to exploit the noise information in spoofed utterances for discrimination. Finally, different configurations of the Siamese network are used for classification, for the first time in this context. Experiments on the ASVspoof 2019 challenge dataset, with Equal Error Rate (EER) and Tandem Detection Cost Function (t-DCF) as evaluation metrics, show that the proposed system improves on the baseline by 10.73% in EER and 0.2344 in t-DCF.
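
For readers unfamiliar with the EER metric named in the abstract, the following is a minimal sketch of how an equal error rate can be estimated from countermeasure scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the convention that higher scores mean "more likely bona fide", and the toy scores are all assumptions made for illustration.

```python
import numpy as np

def equal_error_rate(bonafide_scores, spoof_scores):
    """Estimate the EER and the threshold at which it occurs.

    Assumption: a higher score means 'more likely bona fide'.
    """
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    # Candidate thresholds: every distinct observed score, in ascending order.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))
    # False rejection rate: bona fide trials scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # The EER is the operating point where the two error rates are closest.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]

# Toy scores (hypothetical, not taken from the paper).
eer, threshold = equal_error_rate([0.9, 0.6, 0.4, 0.8], [0.1, 0.5, 0.7, 0.2])
print(f"EER = {eer:.2%} at threshold {threshold}")
```

In this toy case one of four bona fide trials is rejected and one of four spoofed trials is accepted at the chosen threshold, so both error rates equal 25%. The t-DCF metric additionally weights countermeasure errors by their effect on the downstream ASV system and is not covered by this sketch.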


Updated: 2020-05-15
