An adaptive transmission line cochlear model based front-end for replay attack detection, Speech Communication


Speech Communication (IF 2.4), Pub Date: 2021-06-29, DOI: 10.1016/j.specom.2021.06.004
Tharshini Gunendradasan, Eliathamby Ambikairajah, Julien Epps, Vidhyasaharan Sethu, Haizhou Li

The cochlea is a remarkable spectrum analyser with desirable properties, including sharp frequency tuning and level-dependent compression. This paper investigates the potential advantages of incorporating these characteristics in a speech processing front-end and develops a framework for an active transmission line cochlear model employing adaptive notch and resonant filters. The proposed model reproduces the observed asymmetric auditory filter shape, with its sharp high-frequency roll-off, as well as level-dependent nonlinear dynamic range compression. Experimental analysis demonstrates that the sharp frequency tuning and dynamic range compression of the proposed model lead to an enhanced spectral representation compared with other spectral analysis methods. The proposed model was employed in the front-end of replay spoofing attack detection systems. Experiments on the ASVspoof 2017 version 2.0 and ASVspoof 2019 databases demonstrate that it outperforms linear and nonlinear level-dependent parallel filter bank auditory models and classical spectro-temporal front-ends. On the evaluation sets, the proposed model yields relative improvements of 45.6%, 51.9% and 60.8% over the CQCC baseline of ASVspoof 2017 version 2.0 and the CQCC and LFCC baselines of ASVspoof 2019, respectively.
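To illustrate why a cascade of notch and resonant filters can give an asymmetric filter shape with a sharp high-frequency roll-off, the following is a minimal sketch and not the authors' implementation. It assumes a simple topology that the abstract does not specify: sections are ordered from high to low centre frequency, the signal travels along a chain of notch filters, and each section's output is tapped through a resonant (band-pass) filter. The centre frequencies, Q values, sampling rate and function names are all hypothetical, and the level-dependent adaptation described in the abstract is omitted.

```python
import cmath
import math

def biquad(kind, f0, q, fs):
    """Return (b, a) coefficients of a second-order notch or resonant section."""
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a = (1 + alpha, -2 * math.cos(w0), 1 - alpha)
    if kind == "notch":
        b = (1.0, -2 * math.cos(w0), 1.0)   # zero on the unit circle at f0
    elif kind == "resonant":
        b = (alpha, 0.0, -alpha)            # band-pass with unity gain at f0
    else:
        raise ValueError(kind)
    return b, a

def response(coeffs, f, fs):
    """Complex frequency response of one biquad at frequency f (Hz)."""
    b, a = coeffs
    z1 = cmath.exp(-2j * math.pi * f / fs)  # z^-1 evaluated on the unit circle
    return (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)

def tap_response(centres, k, f, fs, q_notch=2.0, q_res=4.0):
    """Response at tap k: notches of sections 0..k-1, then resonator k.

    Assumed topology (hypothetical): only the notches of the preceding,
    higher-frequency sections shape the signal that reaches tap k.
    """
    h = 1.0 + 0j
    for fc in centres[:k]:
        h *= response(biquad("notch", fc, q_notch, fs), f, fs)
    return h * response(biquad("resonant", centres[k], q_res, fs), f, fs)

FS = 16000.0                                 # illustrative sampling rate
CENTRES = [4000.0, 3000.0, 2000.0, 1000.0]   # illustrative, high to low

if __name__ == "__main__":
    # Magnitude at tap 2 (centre 2000 Hz), probed 1 kHz below, at, and above centre.
    for f in (1000.0, 2000.0, 3000.0):
        print(f, abs(tap_response(CENTRES, 2, f, FS)))
```

In this sketch the notches of the preceding sections suppress energy above a tap's centre frequency, while frequencies below it are attenuated only by the resonator's own skirt, so the response falls off much more steeply on the high-frequency side.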


Updated: 2021-07-08
