Private Speech Characterization with Secure Multiparty Computation
arXiv - CS - Sound, Pub Date: 2020-07-01, DOI: arxiv-2007.00253
Kyle Bittner, Martine De Cock, Rafael Dowsley

Deep learning for audio signal processing, including the classification of human voice signals, is a rich application area of machine learning. Legitimate use cases include voice authentication, gunfire detection, and emotion recognition. While automated human speech classification has clear advantages, application developers who process unprotected audio signals can gain knowledge beyond the professed scope. In this paper we propose the first privacy-preserving solution for deep learning-based audio classification that is provably secure. Our approach, which is based on Secure Multiparty Computation, allows a speech signal of one party (Alice) to be classified with a deep neural network of another party (Bob) without Bob ever seeing Alice's speech signal in unencrypted form. As threat models, we consider both passive security, i.e. with semi-honest parties who follow the instructions of the cryptographic protocols, and active security, i.e. with malicious parties who may deviate from the protocols. We evaluate the efficiency-security-accuracy trade-off of the proposed solution in a use case for privacy-preserving emotion detection from speech with a convolutional neural network. In the semi-honest case we can classify a speech signal in under 0.3 sec; in the malicious case it takes $\sim$1.6 sec. In both cases there is no leakage of information, and we achieve the same classification accuracy as when the computations are done on unencrypted data.
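To make the idea of computing on data that no single party sees more concrete, here is a minimal sketch of one common building block of Secure Multiparty Computation: additive secret sharing with Beaver-triple multiplication, used to evaluate a dot product such as one neuron's weighted sum. The abstract does not say which protocols the authors use, so everything here is an assumption for illustration: the two-party setting, the ring of integers modulo 2^64, the trusted dealer that hands out triples, and all function names (`share`, `reconstruct`, `beaver_triple`, `secure_multiply`, `secure_dot`) are hypothetical. The sketch covers only the passive (semi-honest) setting and runs both parties in one process.

```python
import secrets

MODULUS = 2**64  # assumed ring size, chosen only for illustration


def share(value, modulus=MODULUS):
    """Split value into two additive shares that sum to value mod modulus."""
    r = secrets.randbelow(modulus)
    return r, (value - r) % modulus


def reconstruct(s0, s1, modulus=MODULUS):
    """Recombine two additive shares into the underlying value."""
    return (s0 + s1) % modulus


def beaver_triple(modulus=MODULUS):
    """Assumed trusted dealer: returns shares of random a, b and of c = a*b."""
    a = secrets.randbelow(modulus)
    b = secrets.randbelow(modulus)
    c = (a * b) % modulus
    return share(a, modulus), share(b, modulus), share(c, modulus)


def secure_multiply(x_sh, y_sh, modulus=MODULUS):
    """Multiply two shared values; returns shares of x*y."""
    (a0, a1), (b0, b1), (c0, c1) = beaver_triple(modulus)
    # Each party publishes its local share of x - a and y - b.
    # The opened values d and e are uniformly random, so they hide x and y.
    d = (x_sh[0] - a0 + x_sh[1] - a1) % modulus
    e = (y_sh[0] - b0 + y_sh[1] - b1) % modulus
    # x*y = c + d*b + e*a + d*e; only one party adds the public d*e term.
    z0 = (c0 + d * b0 + e * a0 + d * e) % modulus
    z1 = (c1 + d * b1 + e * a1) % modulus
    return z0, z1


def secure_dot(xs_sh, ws_sh, modulus=MODULUS):
    """Dot product of two shared vectors; additions are local to each party."""
    acc0 = acc1 = 0
    for x_sh, w_sh in zip(xs_sh, ws_sh):
        z0, z1 = secure_multiply(x_sh, w_sh, modulus)
        acc0 = (acc0 + z0) % modulus
        acc1 = (acc1 + z1) % modulus
    return acc0, acc1


if __name__ == "__main__":
    # Alice's (toy) audio features and Bob's (toy) weights, both secret-shared.
    features = [share(v) for v in [3, 1, 4]]
    weights = [share(v) for v in [2, 7, 1]]
    print(reconstruct(*secure_dot(features, weights)))  # 3*2 + 1*7 + 4*1 = 17
```

A full classifier would also need fixed-point encoding of real-valued inputs and secure protocols for the non-linear layers, and active security would require additional checks on top of this; none of that is shown here.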

Updated: 2020-07-02