Pairwise Discriminative Neural PLDA for Speaker Verification,arXiv - CS - Sound


Pairwise Discriminative Neural PLDA for Speaker Verification
arXiv - CS - Sound, Pub Date: 2020-01-20, DOI: arxiv-2001.07034
Shreyas Ramoji, Prashant Krishnan V, Prachi Singh, Sriram Ganapathy

The state-of-the-art approach to speaker verification extracts discriminative embeddings such as x-vectors and scores them with a generative back-end based on probabilistic linear discriminant analysis (PLDA). In this paper, we propose a pairwise neural discriminative model for speaker verification that operates on a pair of speaker embeddings, such as x-vectors or i-vectors, and outputs a score that can be interpreted as a scaled log-likelihood ratio. We construct a differentiable cost function that approximates the speaker verification loss, namely the minimum detection cost. The pre-processing steps of linear discriminant analysis (LDA), unit-length normalization, and within-class covariance normalization are all modeled as layers of the neural model, so the speaker verification cost can be back-propagated through these layers during training. We also explore regularization techniques to prevent overfitting, which is a major concern when using discriminative back-end models for verification tasks. Experiments are performed on the NIST SRE 2018 development and evaluation datasets. We observe average relative improvements of 8% in the CMN2 condition and 30% in the VAST condition over the PLDA baseline system.
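To illustrate what a differentiable approximation of the detection cost can look like, the sketch below replaces the hard threshold counts in the miss and false-alarm rates with sigmoids. The abstract does not specify the form of the approximation, so the sigmoid smoothing, the function names, and the parameters `alpha` (sharpness), `beta` (false-alarm weight), and `threshold` are assumptions made here for illustration only, not the authors' implementation.

```python
import numpy as np


def _sigmoid(z):
    # Numerically stable logistic function (identical to 1 / (1 + exp(-z))).
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def hard_detection_cost(scores, labels, threshold, beta=1.0):
    """Non-differentiable cost: P_miss + beta * P_fa at a fixed threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    p_miss = np.mean(scores[labels] < threshold)   # target trials rejected
    p_fa = np.mean(scores[~labels] >= threshold)   # non-target trials accepted
    return p_miss + beta * p_fa


def soft_detection_cost(scores, labels, threshold, beta=1.0, alpha=10.0):
    """Smoothed cost: each hard count is replaced by a sigmoid.

    As alpha grows, the sigmoid approaches a step function and the
    soft cost approaches the hard cost (hypothetical parameterization).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    p_miss = np.mean(_sigmoid(alpha * (threshold - scores[labels])))
    p_fa = np.mean(_sigmoid(alpha * (scores[~labels] - threshold)))
    return p_miss + beta * p_fa


if __name__ == "__main__":
    # Toy trial list: 1 = same speaker (target), 0 = different speaker.
    scores = [2.0, -1.0, 1.0, -2.0]
    labels = [1, 1, 0, 0]
    print(hard_detection_cost(scores, labels, threshold=0.0))
    print(soft_detection_cost(scores, labels, threshold=0.0, alpha=5.0))
```

Because the soft cost is a smooth function of the scores, its gradient can flow back through whatever layers produced them, which is the property the abstract relies on for training the pre-processing steps jointly with the scoring model.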


Updated: 2020-02-10
