Sketching algorithms for genomic data analysis and querying in a secure enclave.
Nature Methods (IF 36.1), Pub Date: 2020-03-04, DOI: 10.1038/s41592-020-0761-8
Can Kockan 1, 2 , Kaiyuan Zhu 1, 2 , Natnatee Dokmai 1 , Nikolai Karpov 1 , M Oguzhan Kulekci 3 , David P Woodruff 4 , S Cenk Sahinalp 2

Genome-wide association studies (GWAS), especially on rare diseases, may necessitate exchange of sensitive genomic data between multiple institutions. Since genomic data sharing is often infeasible due to privacy concerns, cryptographic methods, such as secure multiparty computation (SMC) protocols, have been developed with the aim of offering privacy-preserving collaborative GWAS. Unfortunately, the computational overhead of these methods remains prohibitive for human-genome-scale data. Here we introduce SkSES (https://github.com/ndokmai/sgx-genome-variants-search), a hardware-software hybrid approach for privacy-preserving collaborative GWAS, which improves the running time of the most advanced cryptographic protocols by two orders of magnitude. The SkSES approach is based on trusted execution environments (TEEs) offered by current-generation microprocessors, in particular Intel's SGX. To overcome the severe memory limitation of the TEEs, SkSES employs novel 'sketching' algorithms that maintain essential statistical information on genomic variants in input VCF files. By additionally incorporating efficient data compression and population stratification reduction methods, SkSES identifies the top k genomic variants in a cohort quickly, accurately and in a privacy-preserving manner.
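
To illustrate how a sketch can keep summary statistics on variants in a small, fixed amount of memory, here is a minimal Python sketch of a Count-Min structure combined with a top-k candidate list. It is an assumption-laden illustration, not the SkSES implementation: the abstract does not specify which sketch is used, the example ranks variants by raw occurrence count rather than by a GWAS association statistic, and all names (`CountMinSketch`, `top_k_variants`, the variant ID strings) are hypothetical.

```python
import hashlib


class CountMinSketch:
    """Fixed-size counter table; estimates never fall below the true count."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, key, row):
        # Derive one hash function per row by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.table[row][self._index(key, row)] += count

    def estimate(self, key):
        # Collisions only inflate counters, so the minimum is the tightest bound.
        return min(
            self.table[row][self._index(key, row)] for row in range(self.depth)
        )


def top_k_variants(variant_stream, k, width=2048, depth=4):
    """Return up to k (variant, estimated_count) pairs, highest estimate first."""
    sketch = CountMinSketch(width, depth)
    candidates = {}  # variant ID -> latest estimated count
    for variant in variant_stream:
        sketch.add(variant)
        est = sketch.estimate(variant)
        if variant in candidates or len(candidates) < k:
            candidates[variant] = est
        else:
            # Evict the weakest candidate if the new variant now looks more frequent.
            weakest = min(candidates, key=candidates.get)
            if est > candidates[weakest]:
                del candidates[weakest]
                candidates[variant] = est
    return sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Hypothetical variant IDs in chrom:pos:ref>alt form, one per observation.
    stream = (
        ["chr1:1000:A>G"] * 50
        + ["chr2:2000:C>T"] * 30
        + ["chr3:3000:G>A"] * 20
    )
    print(top_k_variants(stream, k=2))
```

Memory use depends only on `width` and `depth`, not on the number of distinct variants, which is the property that makes this family of structures attractive when enclave memory is tightly limited. The price is that counts are approximate: collisions can inflate an estimate but never reduce it.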

Updated: 2020-03-04