aDCF Loss Function for Deep Metric Learning in End-to-End Text-Dependent Speaker Verification Systems


IEEE/ACM Transactions on Audio, Speech, and Language Processing (IF 4.1), Pub Date: 2022-01-25, DOI: 10.1109/taslp.2022.3145307
Victoria Mingote, Antonio Miguel, Dayana Ribas, Alfonso Ortega, Eduardo Lleida

Metric learning approaches are now widely used to train Speaker Verification (SV) systems based on Deep Neural Networks (DNNs), because their loss functions are more consistent with the evaluation process than traditional identification losses. However, these methods do not take the performance measure into account and can incur a high computational cost, for example through the need for careful pair or triplet data selection. This paper proposes the approximated Detection Cost Function (aDCF) loss, a loss function based on the measure of decision errors in SV systems, namely the False Rejection Rate (FRR) and the False Acceptance Rate (FAR). With the aDCF loss as the training objective, the end-to-end system learns to minimize decision errors. Furthermore, we replace the typical linear layer used as the last layer of the DNN with a cosine distance layer, which reduces the mismatch between the metric used during training and the metric used during evaluation. The aDCF loss function was evaluated on the RSR2015-Part I and RSR2015-Part II datasets for text-dependent speaker verification. The system trained with the aDCF loss outperforms all the state-of-the-art loss functions compared in this paper on both parts of the database.
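
The idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption: the hard error counts behind FAR and FRR are relaxed with a scaled sigmoid so they become smooth, and the loss is a weighted sum of the two soft rates computed on cosine scores. The names `adcf_loss`, `cosine_scores`, `threshold`, `alpha`, `gamma`, and `beta` are hypothetical, and the exact formulation in the paper may differ.

```python
import math


def sigmoid(x, alpha):
    """Scaled logistic function; a larger alpha is closer to a hard step."""
    z = alpha * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def cosine_scores(embedding, class_vectors):
    """Cosine similarity between one embedding and each class vector.

    Stands in for a cosine distance last layer (an illustrative assumption).
    """
    e_norm = math.sqrt(sum(v * v for v in embedding))
    scores = []
    for w in class_vectors:
        w_norm = math.sqrt(sum(v * v for v in w))
        dot = sum(a * b for a, b in zip(embedding, w))
        scores.append(dot / (e_norm * w_norm))
    return scores


def adcf_loss(score_rows, labels, threshold=0.5, alpha=10.0, gamma=0.5, beta=0.5):
    """Weighted sum of soft FAR and soft FRR (hypothetical parameterization).

    score_rows[i][k] is the score of example i against class k;
    labels[i] is the index of the true class of example i.
    Assumes at least two classes, so non-target scores exist.
    """
    soft_miss, soft_fa, n_tar, n_non = 0.0, 0.0, 0, 0
    for scores, label in zip(score_rows, labels):
        for k, s in enumerate(scores):
            if k == label:
                # Target score below the threshold counts as a soft rejection.
                soft_miss += sigmoid(threshold - s, alpha)
                n_tar += 1
            else:
                # Non-target score above the threshold counts as a soft acceptance.
                soft_fa += sigmoid(s - threshold, alpha)
                n_non += 1
    frr = soft_miss / n_tar
    far = soft_fa / n_non
    return gamma * far + beta * frr


# Toy usage with made-up 2-D embeddings and two speaker classes.
class_vectors = [[1.0, 0.0], [0.0, 1.0]]
embeddings = [[0.9, 0.1], [0.2, 0.8]]
labels = [0, 1]
rows = [cosine_scores(e, class_vectors) for e in embeddings]
loss = adcf_loss(rows, labels)
```

This pure-Python version only evaluates the loss; in actual training the same expression would be written in an automatic-differentiation framework so that gradients flow back through the scores.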


Updated: 2022-01-25
