Replay anti-spoofing countermeasure based on data augmentation with post selection
Computer Speech & Language (IF 4.3), Pub Date: 2020-05-16, DOI: 10.1016/j.csl.2020.101115
Yuanjun Zhao, Roberto Togneri, Victor Sreeram

Automatic Speaker Verification (ASV) systems are widely used for speaker authentication in biometric security, especially in e-business scenarios. However, vulnerabilities in ASV technology have been discovered and have generated considerable interest in designing anti-spoofing countermeasures. Replay attacks pose a serious threat because they are difficult to detect and easy to mount with readily accessible devices. This paper proposes an efficient replay anti-spoofing countermeasure based on data augmentation with post selection. An auxiliary classifier generative adversarial network (AC-GAN) is adopted to generate additional speech samples with diverse variants. To select high-quality generated samples and avoid the bias caused by subjective human perception, a convolutional neural network (CNN) based post-filter is also proposed. By integrating data augmentation and post selection, over-fitting and lack of generalization can be significantly alleviated with extra informative training data. The proposed countermeasure is evaluated on the ASVspoof 2017 Version 2.0 database. Experimental results measured by equal error rate (EER) indicate a promising improvement on the development and evaluation subsets. This provides motivation for novel audio data augmentation and encourages future research on the selection of generated samples for speaker spoofing detection.
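
The abstract reports results as equal error rates (EERs). As a rough illustration of that metric only, and not the authors' evaluation code, the sketch below estimates the EER from two lists of detection scores. It assumes that a higher score means "more likely genuine"; the function name `compute_eer` and the threshold sweep over observed scores are choices made for this example.

```python
import numpy as np


def compute_eer(genuine_scores, spoof_scores):
    """Estimate the equal error rate and the threshold where it occurs.

    Assumption for this sketch: higher scores mean "more likely genuine".
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Candidate thresholds: every distinct observed score, in ascending order.
    thresholds = np.unique(np.concatenate([genuine, spoof]))

    # False rejection rate: genuine trials scored below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # The EER is taken where the two error rates are closest to each other.
    idx = int(np.argmin(np.abs(far - frr)))
    eer = (far[idx] + frr[idx]) / 2.0
    return eer, thresholds[idx]


if __name__ == "__main__":
    # Toy scores, invented for illustration; not data from the paper.
    eer, threshold = compute_eer([0.6, 0.4, 0.8, 0.9], [0.5, 0.1, 0.2, 0.3])
    print(f"EER = {eer:.2%} at threshold {threshold:.2f}")
```

A lower EER means the countermeasure separates genuine and replayed speech better. With few trials the estimate is coarse, because the error rates can only change at observed score values.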




Updated: 2020-05-16