Evaluating the Performance of Speaker Recognition Solutions in E-Commerce Applications
Sensors ( IF 3.4 ) Pub Date : 2021-09-17 , DOI: 10.3390/s21186231
Olja Krčadinac 1 , Uroš Šošević 1 , Dušan Starčević 1

Two important tasks in many e-commerce applications are verifying the identity of the user accessing the system and determining the level of rights that the user has for accessing and manipulating the system’s resources. The performance of these tasks depends directly on how certainly the user's identity can be established. The main research focus of this paper is a user identity verification approach based on voice recognition techniques. The paper presents research results on the use of open-source speaker recognition technologies in e-commerce applications, with an emphasis on evaluating the performance of the algorithms they use. Four open-source speaker recognition solutions (SPEAR, MARF, ALIZE, and HTK) were evaluated under mismatched conditions between the training and recognition phases. In practice, mismatched conditions arise from varying lengths of spoken sentences, different types of recording devices, and the use of different languages in the training and recognition phases. All tests in this research were performed under laboratory conditions using a specially designed framework for multimodal biometrics. The obtained results are consistent with the findings of recent research showing that i-vectors and solutions based on probabilistic linear discriminant analysis (PLDA) continue to be the dominant speaker recognition approaches for text-independent tasks.

Updated: 2021-09-17