Oblivious Sampling with Applications to Two-Party k-Means Clustering, Journal of Cryptology

Journal of Cryptology (IF 2.3), Pub Date: 2020-05-12, DOI: 10.1007/s00145-020-09349-w
Paul Bunn, Rafail Ostrovsky

The k-means clustering problem is one of the most explored problems in data mining. With the advent of protocols that have proven to be successful in performing single database clustering, the focus has shifted in recent years to the question of how to extend the single database protocols to a multiple database setting. To date, there have been numerous attempts to create specific multiparty k-means clustering protocols that protect the privacy of each database, but according to the standard cryptographic definitions of “privacy-protection”, so far all such attempts have fallen short of providing adequate privacy. In this paper, we describe a Two-Party k-Means Clustering Protocol that guarantees privacy against an honest-but-curious adversary, and is more efficient than utilizing a general multiparty “compiler” to achieve the same task. In particular, a main contribution of our result is a way to efficiently compute multiple iterations of k-means clustering without revealing the intermediate values. To achieve this, we describe a technique for performing two-party division securely and also introduce a novel technique allowing two parties to securely sample uniformly at random from an unknown domain size. The resulting Division Protocol and Random Value Protocol are of use to any protocol that requires the secure computation of a quotient or random sampling. Our techniques can be realized based on the existence of any semantically secure homomorphic encryption scheme. For concreteness, we describe our protocol based on the Paillier homomorphic encryption scheme (see Paillier, in: Advances in Cryptology, EUROCRYPT’99 Proceedings, LNCS 1592, pp 223–238, 1999). We will also demonstrate that our protocol is efficient in terms of communication, remaining competitive with existing protocols (such as Jagannathan and Wright, in: KDD’05, pp 593–599, 2005) that fail to protect privacy.
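The abstract builds on the additive homomorphism of Paillier encryption. As a rough illustration only, and not the authors' protocol, the following toy Python sketch shows that property: multiplying ciphertexts adds the plaintexts, and raising a ciphertext to a power multiplies the plaintext by a constant. The function names, the tiny primes 10007 and 10009, and the simplification g = n + 1 are assumptions made for this example; the parameters are far too small to be secure.

```python
import math
import secrets

def keygen(p, q):
    """Toy Paillier key generation from two distinct primes (illustrative only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                      # common simplification (assumption of this sketch)
    mu = pow(lam, -1, n)           # with g = n+1, L(g^lam mod n^2) = lam mod n
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n under the public key."""
    n, g = pub
    n2 = n * n
    while True:                    # pick random r in Z_n^*
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    """Recover the plaintext using L(u) = (u - 1) / n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add(pub, c1, c2):
    """Homomorphic addition: Enc(a) * Enc(b) decrypts to a + b (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)

def scalar_mul(pub, c, k):
    """Homomorphic scalar multiplication: Enc(a)^k decrypts to k * a (mod n)."""
    n, _ = pub
    return pow(c, k, n * n)

# Hypothetical toy parameters; real deployments use primes of 1024+ bits.
PUB, PRIV = keygen(10007, 10009)
```

For instance, one party could encrypt a local value, the other could fold its own encrypted value in with `add(PUB, c1, c2)`, and only the key holder could decrypt the combined result. How the paper's Division Protocol and Random Value Protocol build on such operations is not specified in the abstract and is not modeled here.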


Updated: 2020-05-12
