Citadel: Protecting Data Privacy and Model Confidentiality for Collaborative Learning with SGX
arXiv - CS - Cryptography and Security Pub Date : 2021-05-04 , DOI: arxiv-2105.01281
Chengliang Zhang, Junzhe Xia, Baichen Yang, Huancheng Puyang, Wei Wang, Ruichuan Chen, Istemi Ekin Akkus, Paarijaat Aditya, Feng Yan

With the advancement of machine learning (ML) and growing awareness of it, many organizations that own data but lack ML expertise (data owners) would like to pool their data and collaborate with those who have the expertise but need data from diverse sources to train truly generalizable models (model owners). In such collaborative ML, the data owner wants to protect the privacy of its training data, while the model owner wants to keep the model and the training method confidential, since they may contain intellectual property. However, existing private ML solutions, such as federated learning and split learning, cannot meet the privacy requirements of both data and model owners at the same time. This paper presents Citadel, a scalable collaborative ML system that protects the privacy of both the data owner and the model owner in untrusted infrastructures with the help of Intel SGX. Citadel performs distributed training across multiple training enclaves running on behalf of data owners and an aggregator enclave running on behalf of the model owner. Citadel further establishes a strong information barrier between these enclaves by means of zero-sum masking and hierarchical aggregation to prevent data/model leakage during collaborative training. Compared with existing SGX-protected training systems, Citadel enables better scalability and stronger privacy guarantees for collaborative ML. Cloud deployment with various ML models shows that Citadel scales to a large number of enclaves with less than 1.73X slowdown caused by SGX.
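The abstract names zero-sum masking but does not describe how the masks are built. The following is a minimal sketch of the general idea only, not Citadel's actual protocol: each party adds a random mask to its update, the masks are chosen so that they sum to zero, and the aggregator therefore recovers the sum of the updates without seeing any single unmasked update. The function names, the fixed-point integer encoding, and the modulus are all assumptions made for illustration.

```python
import secrets

# Assumption: updates are encoded as fixed-point integers modulo 2**32.
MODULUS = 2**32


def zero_sum_masks(num_parties, length, modulus=MODULUS):
    """Return num_parties mask vectors whose element-wise sum is 0 mod modulus."""
    # All masks but the last are uniformly random.
    masks = [
        [secrets.randbelow(modulus) for _ in range(length)]
        for _ in range(num_parties - 1)
    ]
    # The last mask cancels the others, so the masks sum to zero.
    last = [(-sum(m[i] for m in masks)) % modulus for i in range(length)]
    masks.append(last)
    return masks


def mask_update(update, mask, modulus=MODULUS):
    """Hide one party's update by adding its mask element-wise."""
    return [(u + m) % modulus for u, m in zip(update, mask)]


def aggregate(masked_updates, modulus=MODULUS):
    """Sum the masked updates; the masks cancel out in the total."""
    return [sum(column) % modulus for column in zip(*masked_updates)]


# Three hypothetical parties, each holding a small integer update vector.
updates = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
masks = zero_sum_masks(len(updates), length=3)
masked = [mask_update(u, m) for u, m in zip(updates, masks)]
total = aggregate(masked)
```

In this sketch `total` equals the element-wise sum of the original updates, while each entry of `masked` on its own is uniformly random. How masks are distributed between enclaves, and how hierarchical aggregation is layered on top, is left open here because the abstract does not specify it.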

Updated: 2021-05-05