Improving Fairness in Speaker Recognition
arXiv - CS - Sound Pub Date : 2021-04-29 , DOI: arxiv-2104.14067
Gianni Fenu, Giacomo Medda, Mirko Marras, Giacomo Meloni

The human voice conveys unique characteristics of an individual, making voice biometrics a key technology for verifying identities across various industries. Despite the impressive progress of speaker recognition systems in terms of accuracy, a number of ethical and legal concerns have been raised, specifically relating to the fairness of such systems. In this paper, we aim to explore the disparity in performance achieved by state-of-the-art deep speaker recognition systems when different groups of individuals characterized by a common sensitive attribute (e.g., gender) are considered. To mitigate the unfairness we uncovered by means of an exploratory study, we investigate whether balancing the representation of the different groups of individuals in the training set can lead to a more equal treatment of these demographic groups. Experiments on two state-of-the-art neural architectures and a large-scale public dataset show that models trained with demographically balanced training sets exhibit fairer behavior across different groups, while remaining accurate. Our study is expected to provide a solid basis for instilling beyond-accuracy objectives (e.g., fairness) in speaker recognition.
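The balancing idea described above — giving each demographic group equal representation in the training set — can be sketched in a few lines. This is a minimal illustration, not the authors' exact procedure: `balance_by_group` is a hypothetical helper that downsamples every group to the size of the smallest one, which is one common way to build a demographically balanced subset.

```python
import random
from collections import defaultdict

def balance_by_group(samples, group_key, seed=0):
    """Downsample each group to the size of the smallest one,
    producing a group-balanced training subset.
    (Hypothetical helper; the paper's exact procedure may differ.)"""
    groups = defaultdict(list)
    for s in samples:
        groups[s[group_key]].append(s)
    n = min(len(members) for members in groups.values())
    rng = random.Random(seed)
    balanced = []
    for members in groups.values():
        balanced.extend(rng.sample(members, n))
    rng.shuffle(balanced)
    return balanced

# Example: an unbalanced pool of utterances tagged with speaker gender.
pool = (
    [{"utt": f"u{i}", "gender": "f"} for i in range(30)]
    + [{"utt": f"u{i + 30}", "gender": "m"} for i in range(70)]
)
subset = balance_by_group(pool, "gender")
```

Here the minority group ("f", 30 utterances) sets the per-group quota, so the balanced subset contains 30 female and 30 male utterances. Upsampling the minority group or reweighting the loss are alternative strategies with the same goal.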

Updated: 2021-04-30