Abstract
We consider machine learning, particularly regression, using locally differentially-private datasets. The Wasserstein distance is used to define an ambiguity set centered at the empirical distribution of the dataset corrupted by local differential privacy noise. The radius of the ambiguity set is selected based on the privacy budget, the spread of the data, and the size of the problem. Machine learning with a private dataset is then rewritten as a distributionally-robust optimization problem. For general distributions, this problem can be relaxed to a regularized machine learning problem in which the Lipschitz constant of the machine learning model acts as the regularizer. For Gaussian data, the distributionally-robust optimization problem can be solved exactly to find an optimal regularizer, and training with this regularizer can be posed as a semi-definite program.
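The relaxation for general distributions can be illustrated with a minimal sketch. For the type-1 Wasserstein distance, Kantorovich-Rubinstein duality bounds the worst-case expected loss over a ball of radius rho by the empirical loss plus rho times the Lipschitz constant of the loss. The sketch below makes several assumptions that are not stated in the abstract: the Laplace mechanism as the local privacy mechanism, data bounded in [-bound, bound], a linear model with absolute loss (whose Lipschitz constant in (x, y) is the Euclidean norm of (w, -1)), a plain subgradient solver, and a hand-picked radius `rho`. The paper's rule for choosing the radius is not reproduced here, and all function names are hypothetical.

```python
import numpy as np

def privatize(records, epsilon, bound, rng):
    """Add Laplace noise to each record (assumed mechanism, not from the abstract).

    Assumes every entry lies in [-bound, bound], so two records differ by at
    most 2 * bound * d in L1 norm; scale = 2 * bound * d / epsilon then gives
    epsilon-local differential privacy.
    """
    d = records.shape[1]
    scale = 2.0 * bound * d / epsilon
    return records + rng.laplace(0.0, scale, size=records.shape)

def lipschitz_constant(w):
    # |y - w.x| is Lipschitz in (x, y) w.r.t. the Euclidean norm
    # with constant ||(w, -1)||_2 = sqrt(1 + ||w||^2).
    return np.sqrt(1.0 + w @ w)

def regularized_objective(w, X, y, rho):
    # Empirical absolute loss plus rho times the Lipschitz constant.
    return np.mean(np.abs(y - X @ w)) + rho * lipschitz_constant(w)

def fit(X, y, rho, steps=2000, lr=0.1):
    """Subgradient descent on the regularized objective, keeping the best iterate."""
    n, d = X.shape
    w = np.zeros(d)
    best_w, best_val = w.copy(), regularized_objective(w, X, y, rho)
    for t in range(1, steps + 1):
        residual = y - X @ w
        # Subgradient of the loss term plus gradient of the regularizer.
        grad = -(X.T @ np.sign(residual)) / n + rho * w / lipschitz_constant(w)
        w = w - (lr / np.sqrt(t)) * grad
        val = regularized_objective(w, X, y, rho)
        if val < best_val:
            best_w, best_val = w.copy(), val
    return best_w

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
n, d = 500, 3
X = rng.uniform(-1.0, 1.0, size=(n, d))
w_true = np.array([0.5, -0.3, 0.2])
y = X @ w_true  # stays in [-1, 1] because the weights sum to 1 in absolute value

# Each user perturbs their own record (features and label) before sharing it.
Z = privatize(np.column_stack([X, y]), epsilon=5.0, bound=1.0, rng=rng)
Xp, yp = Z[:, :d], Z[:, d]

rho = 0.5  # hypothetical radius; the paper derives it from budget, spread, and size
w_hat = fit(Xp, yp, rho)
```

A larger `rho` (for example, under a smaller privacy budget) puts more weight on the Lipschitz term, which pulls the fitted weights toward zero. The exact Gaussian case and its semi-definite programming formulation are not covered by this sketch.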
Cite this article
Farokhi, F. Distributionally-robust machine learning using locally differentially-private data. Optim Lett 16, 1167–1179 (2022). https://doi.org/10.1007/s11590-021-01765-6
DOI: https://doi.org/10.1007/s11590-021-01765-6