Abstract
Private processing of database queries protects the confidentiality of sensitive data when queries are answered. It is important to design collusion-resistant protocols that keep privacy protected even when a certain number of honest-but-curious participants collude and share their knowledge in order to gain unauthorised access to sensitive information. A novel setting arises when aggregated queries need to be answered for a large distributed database, but legal requirements or commercial interests forbid giving the other parties access to the records in each subdatabase. For example, a very large number of medical records may be stored in a distributed database that is a union of several separate databases from different hospitals, or even from different countries. The present article introduces and investigates two protocols for collusion-resistant private processing of aggregated queries in this novel setting: the Accelerated Multi-round Iterative Protocol (AMIP) and the Restricted Multi-round Iterative Protocol (RMIP). We define a large collection of query functions and show that AMIP and RMIP can answer all queries in this collection. Our experiments demonstrate that AMIP outperforms all other applicable algorithms, most notably in terms of communication complexity.
Acknowledgements
The authors are grateful to two anonymous reviewers for comments that have helped to improve this article.
Funding
This work has been supported by the Australian Research Council, Discovery Grant DP160100913.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Rylands, L., Seberry, J., Yi, X. et al. Collusion-resistant protocols for private processing of aggregated queries in distributed databases. Distrib Parallel Databases 39, 97–127 (2021). https://doi.org/10.1007/s10619-020-07293-z
DOI: https://doi.org/10.1007/s10619-020-07293-z