Differentially Private Fréchet Mean on the Manifold of Symmetric Positive Definite (SPD) Matrices
arXiv - MATH - Statistics Theory, Pub Date: 2022-08-08, DOI: arxiv-2208.04245
Saiteja Utpala, Praneeth Vepakomma, Nina Miolane

Differential privacy has become crucial in the real-world deployment of statistical and machine learning algorithms with rigorous privacy guarantees. The earliest statistical queries for which differential privacy mechanisms were developed were for the release of the sample mean. In Geometric Statistics, the sample Fréchet mean represents one of the most fundamental statistical summaries, as it generalizes the sample mean for data belonging to nonlinear manifolds. In that spirit, the only geometric statistical query for which a differential privacy mechanism has been developed, so far, is the release of the sample Fréchet mean: the Riemannian Laplace mechanism was recently proposed to privatize the Fréchet mean on complete Riemannian manifolds. In many fields, the manifold of Symmetric Positive Definite (SPD) matrices is used to model data spaces, including in medical imaging, where privacy requirements are key. We propose a novel, simple and fast mechanism, the Tangent Gaussian mechanism, to compute a differentially private Fréchet mean on the SPD manifold endowed with the log-Euclidean Riemannian metric. We show that our new mechanism obtains quadratic utility improvement in terms of data dimension over the current and only available baseline. Our mechanism is also simpler in practice, as it does not require any expensive Markov Chain Monte Carlo (MCMC) sampling, and is computationally faster by multiple orders of magnitude, as confirmed by extensive experiments.
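
The abstract does not spell out the mechanism, so the following is only a minimal sketch of the general idea it names, not the authors' algorithm. It relies on one standard fact: under the log-Euclidean metric, the Fréchet mean of SPD matrices has the closed form exp(mean of log X_i). The sketch adds symmetric Gaussian noise to that mean in log-coordinates and maps the result back with the matrix exponential, so no MCMC sampling is involved. All function names are hypothetical, and `sigma` is a placeholder: it is not calibrated to any (epsilon, delta) budget, so this code carries no privacy guarantee as written.

```python
import numpy as np


def sym_logm(S):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T


def sym_expm(A):
    """Matrix exponential of a symmetric matrix; the result is SPD."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T


def log_euclidean_mean(mats):
    """Closed-form Frechet mean under the log-Euclidean metric."""
    logs = np.stack([sym_logm(S) for S in mats])
    return sym_expm(logs.mean(axis=0))


def noisy_log_euclidean_mean(mats, sigma, rng):
    """Perturb the mean in log-coordinates with symmetric Gaussian noise.

    ASSUMPTION: sigma is an uncalibrated, illustrative noise scale. A real
    mechanism must derive it from the sensitivity of the mean and the
    privacy budget, which this sketch does not attempt.
    """
    logs = np.stack([sym_logm(S) for S in mats])
    mean_log = logs.mean(axis=0)
    k = mean_log.shape[0]
    G = rng.normal(scale=sigma, size=(k, k))
    # Symmetrizing gives diagonal variance sigma^2 and off-diagonal
    # variance sigma^2 / 2, i.e. an isotropic Gaussian on symmetric
    # matrices with respect to the Frobenius inner product.
    noise = (G + G.T) / 2.0
    return sym_expm(mean_log + noise)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = [np.diag([1.0, 4.0]), np.diag([4.0, 1.0])]
    print(log_euclidean_mean(data))
    print(noisy_log_euclidean_mean(data, sigma=0.1, rng=rng))
```

Because the noise is added to a symmetric matrix before exponentiation, the output is always symmetric positive definite, whatever the value of `sigma`.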

Updated: 2022-08-09