Combination of Amplitude and Frequency Modulation Features for Presentation Attack Detection,Journal of Signal Processing Systems


Combination of Amplitude and Frequency Modulation Features for Presentation Attack Detection
Journal of Signal Processing Systems (IF 1.6), Pub Date: 2020-04-15, DOI: 10.1007/s11265-020-01532-3
Madhu R. Kamble, Hemant A. Patil

In this paper, we propose a combination of Amplitude Modulation and Frequency Modulation (AM-FM) features for the replay Spoof Speech Detection (SSD) task. The AM components are known to be affected by noise (in this case, noise introduced by the replay mechanism). In particular, we exploit this damage to the AM component, together with the corresponding Instantaneous Frequency (IF), for the SSD task. Thus, the novelty of the proposed Amplitude Weighted Frequency Cepstral Coefficients (AWFCC) feature set lies in using frequency components along with squared weighted amplitude components that are degraded by replay noise. The AWFCC feature set contains information from both the AM and FM components and hence provides discriminative information in the spectral characteristics. The experiments were performed on the publicly available ASVspoof 2017 challenge version 1.0 and 2.0 databases. We compared the results of the proposed feature set with other state-of-the-art feature sets, namely Constant Q Cepstral Coefficients (CQCC), Linear Frequency Cepstral Coefficients (LFCC), and Mel Frequency Cepstral Coefficients (MFCC), using a simple Gaussian Mixture Model (GMM) classifier. On its own, the AWFCC feature set obtained a lower % Equal Error Rate (EER) than the other feature sets on both the version 1.0 and 2.0 databases. Furthermore, we used score-level fusion to capture possible complementary information from two feature sets and reduce the % EER further. The score-level fusion of the CQCC and AWFCC feature sets gave 5.75 % and 10.42 % EER on the development and evaluation sets, respectively, of the ASVspoof 2017 version 2.0 database. Moreover, for the evaluation set, we also studied the performance of the proposed feature set on different Replay Configurations (RC), namely, acoustic environments, playback devices, and recording devices. For all levels of threat conditions (i.e., low, medium, and high) to the Automatic Speaker Verification (ASV) system, the proposed feature set performed better than the existing state-of-the-art feature sets.
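
The abstract reports results as % EER and uses score-level fusion of two feature sets. As a rough illustration only, the sketch below shows one common way to do both: a weighted sum of per-utterance scores from two systems, and an EER estimate obtained by sweeping a decision threshold. The function names, the fusion weight `alpha`, and the convention that a higher score means "more likely genuine" are assumptions made for this example; the abstract does not state the fusion rule or weights the authors used, and the toy scores are made up and are not results from the paper.

```python
from typing import List, Sequence


def fuse_scores(scores_a: Sequence[float],
                scores_b: Sequence[float],
                alpha: float = 0.5) -> List[float]:
    """Weighted-sum score-level fusion of two systems.

    alpha is a hypothetical fusion weight (not taken from the paper);
    scores_a and scores_b must be aligned utterance by utterance.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(scores_a, scores_b)]


def equal_error_rate(genuine: Sequence[float], spoof: Sequence[float]) -> float:
    """Approximate EER in percent.

    Assumes a higher score means 'more likely genuine'. The threshold is
    swept over all observed scores, and the operating point where the false
    acceptance and false rejection rates are closest is reported.
    """
    thresholds = sorted(set(genuine) | set(spoof))
    thresholds.append(thresholds[-1] + 1.0)  # a threshold that rejects everything
    best_gap, eer = float("inf"), 100.0
    for t in thresholds:
        far = sum(s >= t for s in spoof) / len(spoof)      # spoof accepted
        frr = sum(g < t for g in genuine) / len(genuine)   # genuine rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, 100.0 * (far + frr) / 2.0
    return eer


if __name__ == "__main__":
    # Made-up scores for two hypothetical systems (e.g. one per feature set).
    gen_a, spf_a = [2.0, 1.5, 0.2, 1.0], [-1.0, 0.5, -0.5, -2.0]
    gen_b, spf_b = [1.0, 0.3, 1.8, 0.9], [0.6, -1.0, -0.8, -0.4]
    print("system A EER (%):", equal_error_rate(gen_a, spf_a))
    print("system B EER (%):", equal_error_rate(gen_b, spf_b))
    print("fused EER (%):", equal_error_rate(fuse_scores(gen_a, gen_b),
                                              fuse_scores(spf_a, spf_b)))
```

In this toy setup each system misranks a different utterance, so the fused scores separate the two classes better than either system alone, which is the kind of complementarity that score-level fusion is meant to exploit. A real evaluation would use many more trials and typically interpolates between threshold points.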


Updated: 2020-04-18
