Voice spoofing detection corpus for single and multi-order audio replays
Computer Speech & Language (IF 4.3), Pub Date: 2020-07-16, DOI: 10.1016/j.csl.2020.101132
Roland Baumann, Khalid Mahmood Malik, Ali Javed, Andersen Ball, Brandon Kujawa, Hafiz Malik

The evolution of modern voice-controlled devices (VCDs) has revolutionized the Internet of Things (IoT) and has advanced smart homes, personalization, and home automation through voice commands. These VCDs can be exploited in IoT-driven environments to mount various spoofing attacks, including chained replay attacks (i.e., multi-order replay attacks). Existing datasets such as ASVspoof 2017, ASVspoof 2019, and ReMASC contain only first-order replay recordings (i.e., replayed once); therefore, they cannot be used to evaluate anti-spoofing algorithms intended to detect multi-order replay attacks. Additionally, large-scale datasets such as ASVspoof 2017 and ASVspoof 2019 do not capture the characteristics of microphone arrays, which are an essential component of modern VCDs. There is therefore a need for a diverse replay spoofing detection corpus that contains multi-order replay recordings alongside bona fide voice samples. This paper presents a novel voice spoofing detection corpus (VSDC) for evaluating multi-order replay anti-spoofing methods. The proposed VSDC consists of first-order (replayed once) and second-order (replayed twice) replay samples alongside bona fide audio recordings. The corpus was designed to be diverse in terms of environments, recording and playback devices, speakers, configurations, and replay scenarios. More specifically, we used 35 microphones, 25 different recording configurations, and 60 different playback configurations for first- and second-order replays to generate a total of 14,050 samples belonging to 19 speakers. The proposed VSDC can also be used to evaluate speaker verification systems in terms of independent speaker verification. To the best of our knowledge, this is the first publicly available replay spoofing detection corpus comprising first- and second-order replay samples. Experimental results indicate the effectiveness of the proposed VSDC for evaluating anti-spoofing methods under multi-order replay attacks and diverse conditions.




Updated: 2020-07-24