SecureNLP: A System for Multi-Party Privacy-Preserving Natural Language Processing, IEEE Transactions on Information Forensics and Security



SecureNLP: A System for Multi-Party Privacy-Preserving Natural Language Processing
IEEE Transactions on Information Forensics and Security (IF 6.3), Pub Date: 2020-05-25, DOI: 10.1109/tifs.2020.2997134
Qi Feng, Debiao He, Zhe Liu, Huaqun Wang, Kim-Kwang Raymond Choo

Natural language processing (NLP) allows a computer program to understand human language as it is spoken, and has been deployed in a growing number of applications, such as machine translation, sentiment analysis, and electronic voice assistants. While information obtained from different sources can enhance the accuracy of NLP models, there are also privacy implications in the collection of such massive data. Thus, in this paper, we design a privacy-preserving system, SecureNLP, focusing on the instance of a recurrent neural network (RNN)-based sequence-to-sequence model with attention for neural machine translation. Specifically, for non-linear functions such as sigmoid and tanh, we design two efficient distributed protocols using secure multi-party computation (MPC), which are used to carry out the respective tasks in SecureNLP. We also prove the security of these two protocols (i.e., privacy-preserving long short-term memory network PrivLSTM, and privacy-preserving sequence-to-sequence transformation PrivSEQ2SEQ) in the semi-honest adversary model, in the sense that any honest-but-curious adversary cannot learn anything else from the messages they receive from other parties. The proposed system is implemented in C++ and Python, and the findings from the evaluation demonstrate the utility of the protocols in cross-domain NLP.
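
The abstract does not spell out the protocols themselves, so the following is only a minimal sketch of the general idea behind MPC on non-linear activations, not the paper's construction. It assumes two-party additive secret sharing over the ring Z_{2^64} with 16 fractional bits of fixed-point precision (both parameters, and all function names, are illustrative assumptions). It shows that linear operations can be performed locally on shares, and that, by the identity tanh(x) = 2·sigmoid(2x) − 1, shares of a sigmoid output can be turned into shares of tanh without further interaction. The secure sigmoid itself is replaced here by a plaintext stand-in.

```python
import math
import secrets
from typing import List

MODULUS = 2 ** 64   # ring Z_{2^64}; an assumption for this sketch
FRAC_BITS = 16      # fixed-point fractional bits; also an assumption


def encode(x: float) -> int:
    """Map a real number to a fixed-point ring element."""
    return round(x * (1 << FRAC_BITS)) % MODULUS


def decode(v: int) -> float:
    """Map a ring element back to a real number (upper half = negatives)."""
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / (1 << FRAC_BITS)


def share(v: int, n_parties: int = 2) -> List[int]:
    """Split v into additive shares that sum to v modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((v - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares: List[int]) -> int:
    """Recombine all shares; only done when the result may be revealed."""
    return sum(shares) % MODULUS


def scale_shares(shares: List[int], k: int) -> List[int]:
    """Multiply a shared value by a public integer k (local operation)."""
    return [(s * k) % MODULUS for s in shares]


def add_public(shares: List[int], c: int) -> List[int]:
    """Add a public constant c; only the first party adjusts its share."""
    return [(shares[0] + c) % MODULUS] + shares[1:]


def tanh_from_sigmoid_shares(sig_2x_shares: List[int]) -> List[int]:
    """Given shares of sigmoid(2x), derive shares of tanh(x) locally,
    using tanh(x) = 2 * sigmoid(2x) - 1."""
    doubled = scale_shares(sig_2x_shares, 2)
    return add_public(doubled, -encode(1.0))


def demo_tanh(x: float) -> float:
    # Plaintext stand-in for a secure sigmoid protocol (hypothetical here):
    # in a real system the parties would obtain these shares via MPC.
    sig_2x = 1.0 / (1.0 + math.exp(-2.0 * x))
    sig_shares = share(encode(sig_2x))
    return decode(reconstruct(tanh_from_sigmoid_shares(sig_shares)))
```

Calling `demo_tanh(0.75)` should agree with `math.tanh(0.75)` up to fixed-point rounding error. Each individual share is uniformly random in the ring, which is the intuition behind the semi-honest guarantee that a party learns nothing from the shares it holds.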

Updated: 2024-08-22
