Abstract
Medical data is shared widely for various research purposes, and the data-privacy community has developed an extensive body of anonymization research. Unfortunately, traditional data anonymization techniques do not provide strong privacy guarantees, and synthetic data generation has emerged as an alternative. Deep learning has recently gained recognition for its high accuracy and its relevance to privacy concerns, and it is now extensively applied in the medical field for classification, segmentation, and privacy preservation. Using deep learning, synthetic data can be generated that improves the privacy of the original medical data and prevents attacks, since deep models capture the relationships among multiple features in medical data. In this research, the Healthcare Cramér Generative Adversarial Network (HCGAN) is proposed, in which (i) the Quasi-Identifiers (QI) in the medical data are identified and separated as QI attributes, with the remaining attributes treated as Sensitive Attributes (SA); (ii) an f-differential-privacy anonymization technique is applied only to the identified QI, and the result is recombined with the SA attributes; (iii) the anonymized medical data is used as real data for training a Cramér Generative Adversarial Network (GAN), where the Cramér distance improves the efficiency of the model; and (iv) privacy is evaluated by testing resistance to attacks. The results show that HCGAN prevents attacks more effectively during the training and testing phases than the Wasserstein GAN, and that the synthetic data it generates provides high privacy while withstanding various attacks.
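Step (iii) trains the GAN against the Cramér distance rather than the Wasserstein distance. As a minimal illustration of the underlying quantity, the sketch below estimates the sample-based energy distance between two one-dimensional samples; for one-dimensional distributions the energy distance equals twice the squared Cramér distance, so minimizing one minimizes the other. This is an illustrative sketch of the distance only, not the paper's full critic architecture, and the function name and NumPy-based estimator are assumptions of this sketch.

```python
import numpy as np

def energy_distance(x, y):
    """Sample-based energy distance between two 1-D samples.

    E(P, Q) = 2 E|X - Y| - E|X - X'| - E|Y - Y'|,
    which for 1-D distributions equals twice the squared Cramér
    distance, the quantity the HCGAN critic is trained to reduce.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # E|X - Y|: mean pairwise distance between the two samples
    d_xy = np.abs(x[:, None] - y[None, :]).mean()
    # E|X - X'| and E|Y - Y'|: mean pairwise distances within each sample
    d_xx = np.abs(x[:, None] - x[None, :]).mean()
    d_yy = np.abs(y[:, None] - y[None, :]).mean()
    return 2.0 * d_xy - d_xx - d_yy
```

The estimator is zero when both samples coincide and grows as the generator distribution drifts from the real-data distribution, which is what gives the Cramér critic its unbiased sample gradients.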
Abbreviations
- D: Randomized algorithm
- G: Generator model
- h: Transformation
- l: Cramér distance
- M: Noise prior
- P: Generator distribution
- p: Joint probability
- Q: Target distribution
- R: Data instances
- S, T, S′, T′: Random variables
- T: Set of labels
- X, Y: Distribution function
- y: Adjacent database
- ε: Privacy budget
- δ: Failure rate
- θ: Sensitivity
- σ: Common variance
- ∇: Stochastic gradient
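Several of the symbols above (ε for the privacy budget, δ for the failure rate, θ for the sensitivity, σ for the noise variance) interact in the differential-privacy step applied to the quasi-identifiers. As a hedged illustration of that interaction, the sketch below uses the classical Gaussian mechanism, whose noise scale σ = θ·sqrt(2·ln(1.25/δ))/ε satisfies (ε, δ)-DP; the paper's actual f-differential-privacy mechanism may differ, and the function name and interface here are assumptions of this sketch.

```python
import math
import numpy as np

def gaussian_mechanism(values, sensitivity, epsilon, delta, rng=None):
    """Perturb numeric quasi-identifier values with Gaussian noise.

    The noise scale follows the classical analytic bound for
    (epsilon, delta)-differential privacy:
        sigma = sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon
    return np.asarray(values, dtype=float) + rng.normal(0.0, sigma, size=len(values))
```

Only the quasi-identifier columns would pass through such a mechanism; the sensitive attributes are recombined unperturbed before GAN training, as in step (ii) of the abstract.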
Funding
This research received no external funding.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Cite this article
Indhumathi, R., Devi, S.S. Healthcare Cramér Generative Adversarial Network (HCGAN). Distrib Parallel Databases 40, 657–673 (2022). https://doi.org/10.1007/s10619-021-07346-x