Detection of speech playback attacks using robust harmonic trajectories
Computer Speech & Language (IF 4.3), Pub Date: 2020-07-16, DOI: 10.1016/j.csl.2020.101133
Wei Shang, Maryhelen Stevenson

In this paper, a new feature set is proposed for use in a playback attack detector (PAD) that safeguards a system protected by a passphrase and speaker verification, where the system can be accessed remotely from an arbitrary location over an arbitrary telecommunication channel. The new feature set, termed VoicedTracks, is a time-frequency map of the most robust harmonic trajectories in an utterance and serves as an audio fingerprint that can uniquely identify an utterance despite a moderate amount of noise and channel distortion. Experimental results are obtained using a specially designed in-house database; the impact of various noise types and SNR levels is further investigated using a publicly available database. An analysis of playback scores across several combinations of telecommunication channel types, playback devices and additive noise demonstrates the robustness of the feature set to channel distortion and additive noise, making it suitable for use in a copy-detection based PAD (cd-PAD) designed for applications such as telephone banking. Relative to other cd-PADs, the proposed approach was better able to defend against playback attacks when telephone channels were involved. An analysis of its performance across the replay configurations used in the ASVspoof 2017 V2 evaluation set suggests that the proposed cd-PAD is highly effective in detecting those playback attacks that are most likely to spoof the speaker verification system.
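
The copy-detection idea in the abstract (fingerprint each utterance as a time-frequency map, then flag a new utterance that matches a stored one too closely) can be illustrated with a minimal sketch. This is not the paper's VoicedTracks feature: the abstract does not give its construction, so the sketch assumes a much cruder stand-in that marks the strongest spectral bins in each frame. The names `binary_peak_map`, `playback_score` and `is_playback`, the Jaccard similarity measure, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def binary_peak_map(signal, frame_len=512, hop=256, peaks_per_frame=5):
    """Binary time-frequency map marking the strongest bins of each frame.

    Hypothetical stand-in for a harmonic-trajectory map; it does NOT
    reproduce the VoicedTracks feature set described in the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    tf_map = np.zeros((n_frames, frame_len // 2 + 1), dtype=bool)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        magnitude = np.abs(np.fft.rfft(frame))
        # Keep only the indices of the largest-magnitude bins.
        strongest = np.argsort(magnitude)[-peaks_per_frame:]
        tf_map[i, strongest] = True
    return tf_map

def playback_score(map_a, map_b):
    """Jaccard overlap of two binary maps, in [0, 1] (assumed similarity measure)."""
    n = min(len(map_a), len(map_b))  # compare only the frames both maps share
    a, b = map_a[:n], map_b[:n]
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def is_playback(new_map, stored_maps, threshold=0.5):
    """Flag the utterance if it matches any stored fingerprint too closely.

    The threshold is an illustrative value, not one taken from the paper.
    """
    return any(playback_score(new_map, m) >= threshold for m in stored_maps)
```

In a detector of this kind, fingerprints of previously accepted utterances would be stored, and a new access attempt would be rejected when `is_playback` returns `True`. A practical system would also need time alignment between utterances and a feature that survives telephone-channel distortion, which is the gap the paper's feature set is meant to address.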




Updated: 2020-08-12