A Light-weight Replay Detection Framework for Voice Controlled IoT Devices, IEEE Journal of Selected Topics in Signal Processing

A Light-weight Replay Detection Framework for Voice Controlled IoT Devices
IEEE Journal of Selected Topics in Signal Processing (IF 8.7) Pub Date: 2020-08-01, DOI: 10.1109/jstsp.2020.2999828
Khalid Mahmood Malik, Ali Javed, Hafiz Malik, Aun Irtaza

The growing number of voice-controlled devices (VCDs), such as Google Home and Amazon Alexa, has led to the automation of home appliances, smart gadgets, and next-generation vehicles. However, VCDs and voice-activated services such as chatbots are vulnerable to audio replay attacks. Our vulnerability analysis of VCDs shows that these replays can be exploited in multi-hop scenarios to maliciously access devices and nodes attached to the Internet of Things. To protect these VCDs and voice-activated services, there is an urgent need for reliable and computationally efficient solutions that detect replay attacks. This paper models replay attacks as a nonlinear process that introduces higher-order harmonic distortions. To detect these harmonic distortions, we propose acoustic ternary patterns-gammatone cepstral coefficient (ATP-GTCC) features, which are capable of capturing the distortions caused by replay attacks. An error-correcting output codes (ECOC) model is used to train a multi-class SVM classifier on the proposed ATP-GTCC feature space, and the classifier is tested on voice replay attack detection. The performance of the proposed framework is evaluated on the ASVspoof 2019 dataset and on our own voice spoofing detection corpus (VSDC), which consists of bona fide, first-order replay (replayed once), and second-order replay (replayed twice) audio recordings. Experimental results show that the proposed audio replay detection framework reliably detects both first-order and second-order replay attacks and can be used in resource-constrained devices.
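
The modelling claim in the abstract, that replay behaves as a nonlinear process which adds higher-order harmonics, can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the memoryless polynomial channel `y = x + a2*x^2 + a3*x^3`, its coefficients, the 200 Hz test tone, and all function names. It is not the authors' distortion model and it does not compute ATP-GTCC features.

```python
import math


def tone(freq_hz, sample_rate, n_samples):
    """Unit-amplitude sine tone."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def nonlinear_channel(x, a2=0.2, a3=0.1):
    """Hypothetical replay channel: y = x + a2*x^2 + a3*x^3 (assumed, not from the paper)."""
    return [s + a2 * s ** 2 + a3 * s ** 3 for s in x]


def magnitude_at(x, freq_hz, sample_rate):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    n = len(x)
    re = sum(s * math.cos(2 * math.pi * freq_hz * k / sample_rate)
             for k, s in enumerate(x))
    im = sum(s * math.sin(2 * math.pi * freq_hz * k / sample_rate)
             for k, s in enumerate(x))
    return 2 * math.hypot(re, im) / n


SAMPLE_RATE = 8000   # Hz
F0 = 200             # Hz, fundamental of the test tone
N = 800              # 20 full periods of F0, so harmonic bins do not leak

bona_fide = tone(F0, SAMPLE_RATE, N)
first_order = nonlinear_channel(bona_fide)      # stand-in for "replayed once"
second_order = nonlinear_channel(first_order)   # stand-in for "replayed twice"

for name, signal in [("bona fide", bona_fide),
                     ("first-order", first_order),
                     ("second-order", second_order)]:
    # Amplitudes at the fundamental, second harmonic, and third harmonic.
    levels = [magnitude_at(signal, k * F0, SAMPLE_RATE) for k in (1, 2, 3)]
    print(name, ["%.4f" % v for v in levels])
```

For a unit-amplitude sine, one pass through this toy channel produces second and third harmonics with amplitudes `a2/2` and `a3/4`, while the clean tone has none. Passing the output through the same channel again raises the second-harmonic level further, which loosely mirrors why first-order and second-order replays might be separable from bona fide speech by distortion-sensitive features.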

Updated: 2020-08-01
