Privacy-preserving data splitting: a combinatorial approach

Designs, Codes and Cryptography

Abstract

Privacy-preserving data splitting is a technique that aims to protect data privacy by storing different fragments of data in different locations. In this work we give a new combinatorial formulation of the data splitting problem. We view data splitting as a purely combinatorial problem, in which data attributes must be split into fragments in a way that satisfies certain combinatorial properties derived from processing and privacy constraints. Using this formulation, we develop new combinatorial and algebraic techniques to obtain solutions to the data splitting problem. We present an algebraic method that builds an optimal data splitting solution by using Gröbner bases. Since this method is not efficient in general, we also develop a greedy algorithm for finding solutions that are not necessarily of minimal size.
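To make the combinatorial view concrete, the following is a minimal first-fit sketch, not the algorithm of the paper. It assumes one possible reading of the constraints: each processing constraint is a set of attributes that must appear together in at least one fragment, and each privacy constraint is a set of attributes that must never appear together in the same fragment. The function names `violates` and `greedy_split`, the constraint encoding, and the sample attributes are all hypothetical. Attributes may be replicated across fragments in this sketch, and the number of fragments it returns is not guaranteed to be minimal.

```python
from typing import Iterable, List, Set


def violates(fragment: Set[str], privacy: Iterable[Set[str]]) -> bool:
    """Return True if the fragment fully contains some forbidden attribute set."""
    return any(set(p) <= fragment for p in privacy)


def greedy_split(
    attributes: Iterable[str],
    processing: Iterable[Set[str]],
    privacy: Iterable[Set[str]],
) -> List[Set[str]]:
    """First-fit heuristic under the assumed constraint model (hypothetical)."""
    privacy = [set(p) for p in privacy]
    fragments: List[Set[str]] = []
    # Blocks to place: every processing set first, then every single attribute,
    # so that attributes outside all processing sets are still stored somewhere.
    blocks = [set(q) for q in processing] + [{a} for a in attributes]
    for block in blocks:
        if any(block <= f for f in fragments):
            continue  # this block is already covered by an existing fragment
        if violates(block, privacy):
            raise ValueError(f"constraints conflict on {sorted(block)}")
        for f in fragments:
            if not violates(f | block, privacy):
                f |= block  # first fragment that stays privacy-safe
                break
        else:
            fragments.append(set(block))  # no safe fragment: open a new one
    return fragments


# Hypothetical example data.
attributes = ["name", "zip", "birthdate", "diagnosis", "salary"]
processing = [{"zip", "diagnosis"}, {"birthdate", "salary"}]
privacy = [{"name", "diagnosis"}, {"zip", "birthdate"}]
fragments = greedy_split(attributes, processing, privacy)
```

The first-fit choice keeps the sketch short. The order in which blocks are processed affects how many fragments are opened, which is why a heuristic of this kind gives no minimality guarantee.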

Acknowledgements

This article is supported by the Ministry of the Interior of the Czech Republic (grant VJ01030002), by the Government of Catalonia (grant 2017 SGR 705), by the European Commission (project H2020-871042 “SoBigData++”), by the Spanish Government (project RTI2018-095094-B-C21, “CONSENT”), and by the DRAC project, which is co-financed by the European Union Regional Development Fund within the framework of the ERDF Operational Program of Catalonia 2014-2020 with a grant of 50% of the total eligible cost.

Corresponding author

Correspondence to Sara Ricci.

Additional information

Communicated by C. J. Colbourn.

About this article

Cite this article

Farràs, O., Ribes-González, J. & Ricci, S. Privacy-preserving data splitting: a combinatorial approach. Des. Codes Cryptogr. 89, 1735–1756 (2021). https://doi.org/10.1007/s10623-021-00884-6

  • DOI: https://doi.org/10.1007/s10623-021-00884-6
