Fast Privacy-Preserving Text Classification based on Secure Multiparty Computation

arXiv - CS - Cryptography and Security. Pub Date: 2021-01-18, DOI: arxiv-2101.07365
Amanda Resende, Davis Railsback, Rafael Dowsley, Anderson C. A. Nascimento, Diego F. Aranha

We propose a privacy-preserving Naive Bayes classifier and apply it to the problem of private text classification. In this setting, one party (Alice) holds a text message, while another party (Bob) holds a classifier. At the end of the protocol, Alice learns only the result of the classifier applied to her text input, and Bob learns nothing. Our solution is based on Secure Multiparty Computation (SMC). Our Rust implementation provides a fast and secure solution for the classification of unstructured text. The solution is generic and can be used in any scenario in which a Naive Bayes classifier can be employed. Applied to spam detection, it classifies an SMS as spam or ham in less than 340 ms when the dictionary of Bob's model includes all words (n = 5200) and Alice's SMS has at most m = 160 unigrams. With n = 369 and m = 8 (the average of a spam SMS in the database), our solution takes only 21 ms.
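
For orientation, the function that such a protocol evaluates can be sketched in the clear. The Python below is a minimal, non-private illustration of Naive Bayes scoring over unigrams. It is not the authors' Rust implementation or their SMC protocol. The helper names (`log_probs_from_counts`, `classify`), the Laplace smoothing, the whitespace tokenizer, and the toy word counts are all assumptions made for this example. In the private setting, Alice would contribute only the message, Bob only the model tables, and the protocol would reveal just the returned label to Alice.

```python
import math


def log_probs_from_counts(word_counts, dictionary, alpha=1.0):
    """Laplace-smoothed log P(word | class) for every word in the dictionary."""
    total = sum(word_counts.get(w, 0) for w in dictionary) + alpha * len(dictionary)
    return {w: math.log((word_counts.get(w, 0) + alpha) / total) for w in dictionary}


def classify(message, log_priors, log_likelihoods):
    """Return the class with the highest Naive Bayes log-score."""
    unigrams = message.lower().split()  # Alice's input: at most m unigrams
    scores = {}
    for label, log_prior in log_priors.items():
        table = log_likelihoods[label]  # Bob's input: n dictionary entries per class
        # Words outside Bob's dictionary contribute nothing to the score.
        scores[label] = log_prior + sum(table[w] for w in unigrams if w in table)
    return max(scores, key=scores.get)


# Toy model with made-up counts (hypothetical, not data from the paper).
dictionary = ["free", "prize", "win", "meeting", "lunch", "tomorrow"]
spam_counts = {"free": 8, "prize": 6, "win": 5, "meeting": 1}
ham_counts = {"meeting": 7, "lunch": 6, "tomorrow": 5, "free": 1}

log_likelihoods = {
    "spam": log_probs_from_counts(spam_counts, dictionary),
    "ham": log_probs_from_counts(ham_counts, dictionary),
}
log_priors = {"spam": math.log(0.5), "ham": math.log(0.5)}

print(classify("Win a FREE prize", log_priors, log_likelihoods))
print(classify("lunch meeting tomorrow", log_priors, log_likelihoods))
```

Working in log space turns the product of per-word probabilities into a sum, which also tends to suit secure computation because additions are cheap in most SMC frameworks. Whether the paper's protocol uses exactly this scoring variant is not stated in the abstract.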

Updated: 2021-01-20
