Abstract
With the emergence of AI technologies, the intrinsic value of data is being unlocked, with a tremendous effect on numerous industries. In regional medical care, data sharing and cooperation are in high demand and can bring both financial and societal benefits. At present, however, medical data remain locked inside medical facilities owing to legal risks and economic considerations. Bringing AI technologies into full play under these circumstances is a major challenge. In this paper, we propose Jupiter, an easy-to-use, secure, and high-performance platform for federated machine learning. Jupiter builds a secure, high-performance aggregator cluster with SGX to efficiently aggregate encrypted model parameters. Jupiter employs a stateful design suited to medical facilities in regional medical systems, which have fixed network connections. By providing an innovative programming abstraction, Jupiter makes model development friendlier to developers. Experiments show that, with a low memory footprint, the throughput of a single node on an ordinary PC can reach 300 MB/s (with the slice size fixed at 64 KB), and the aggregation primitive we built can process 11k aggregations per second.
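To make the aggregation step concrete, the following is a minimal sketch of slice-wise weighted averaging of client model updates. It is not Jupiter's API: the function names, the 32-bit parameter width, and the weighting scheme are assumptions made for illustration, and the SGX enclave and parameter encryption described in the abstract are deliberately left out. Only the 64 KB slice size comes from the abstract.

```python
from typing import Iterator, List, Sequence

SLICE_BYTES = 64 * 1024              # slice size quoted in the abstract
FLOATS_PER_SLICE = SLICE_BYTES // 4  # assumption: 32-bit float parameters


def slices(params: Sequence[float],
           size: int = FLOATS_PER_SLICE) -> Iterator[Sequence[float]]:
    """Yield consecutive fixed-size slices of a flat parameter vector."""
    for start in range(0, len(params), size):
        yield params[start:start + size]


def aggregate(updates: List[Sequence[float]],
              weights: List[float],
              size: int = FLOATS_PER_SLICE) -> List[float]:
    """Weighted average of client updates, computed one slice at a time.

    Hypothetical helper: plain (unencrypted) federated averaging only.
    """
    if not updates or len(updates) != len(weights):
        raise ValueError("need one weight per update")
    if len({len(u) for u in updates}) != 1:
        raise ValueError("all updates must have the same length")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    result: List[float] = []
    # Each `group` holds the same slice position from every client.
    for group in zip(*(slices(u, size) for u in updates)):
        # Each `column` holds one parameter's value from every client.
        for column in zip(*group):
            result.append(sum(w * x for w, x in zip(weights, column)) / total)
    return result
```

Processing one slice position at a time keeps the per-step working set to one slice per client, which is one plausible reason a fixed slice size would pair with a low memory footprint; the sketch makes no claim about how Jupiter actually schedules or protects these slices.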
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 62041203, 92067206, 61972222) and the National Key Research and Development Program of China (Grant No. 2018YFB2100804).
About this article
Cite this article
Xing, J., Tian, J., Jiang, Z. et al. Jupiter: a modern federated learning platform for regional medical care. Sci. China Inf. Sci. 64, 202101 (2021). https://doi.org/10.1007/s11432-020-3062-8
DOI: https://doi.org/10.1007/s11432-020-3062-8