DNA-SeAl: Sensitivity Levels to Optimize the Performance of Privacy-Preserving DNA Alignment
IEEE Journal of Biomedical and Health Informatics (IF 6.7) Pub Date: 2020-03-01, DOI: 10.1109/jbhi.2019.2914952
Maria Fernandes, Jeremie Decouchant, Marcus Volp, Francisco M. Couto, Paulo Esteves-Verissimo

The advent of next-generation sequencing (NGS) machines made DNA sequencing cheaper, but also put pressure on the genomic life cycle, which includes aligning millions of short DNA sequences, called reads, to a reference genome. On the performance side, efficient alignment algorithms have been developed and parallelized on public clouds. On the privacy side, since genomic data are highly sensitive, several cryptographic mechanisms have been proposed to align reads more securely than the performance-oriented approaches, but with lower performance. This paper presents DNA-SeAl, a novel contribution to improving the privacy × performance product in current genomic workflows. First, building on recent works that argue that genomic data need to be treated according to a threat-risk analysis, we introduce a multi-level sensitivity classification of genomic variations designed to prevent the amplification of possible privacy attacks. We show that the usage of sensitivity levels reduces future re-identification risks, and that their partitioning helps prevent linkage attacks. Second, after extending this classification to reads, we show how to align and store reads using different security levels. To do so, DNA-SeAl extends a recent read filter to classify unaligned reads into sensitivity levels, and adapts existing alignment algorithms to the reads' sensitivity. We show that using DNA-SeAl allows high performance gains whilst enforcing high privacy levels in hybrid cloud environments.
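
The idea of classifying unaligned reads into sensitivity levels before aligning them can be illustrated with a minimal sketch. This is not the DNA-SeAl implementation: the k-mer lookup table, the number of levels, the value of `K`, and the mapping from levels to environments are all assumptions made for illustration, as are the function names `classify_read` and `partition_reads`.

```python
from collections import defaultdict

# ASSUMPTION: toy k-mer length; a real filter would use a much longer k.
K = 4

# ASSUMPTION: hypothetical table mapping a k-mer to the sensitivity level
# of the genomic variation it may reveal (higher means more sensitive).
SENSITIVE_KMERS = {
    "ACGT": 1,
    "GGCA": 2,
}

# ASSUMPTION: hypothetical mapping from sensitivity level to the place and
# method used to align reads of that level in a hybrid cloud.
ENVIRONMENT_BY_LEVEL = {
    0: "public cloud, plaintext alignment",
    1: "private cloud, plaintext alignment",
    2: "private cloud, cryptographic alignment",
}


def classify_read(read, kmer_levels=SENSITIVE_KMERS, k=K):
    """Return the highest sensitivity level of any k-mer in the read.

    A read that contains no listed k-mer gets level 0 (not sensitive).
    """
    level = 0
    for i in range(len(read) - k + 1):
        level = max(level, kmer_levels.get(read[i:i + k], 0))
    return level


def partition_reads(reads):
    """Group reads by sensitivity level so each group can be aligned separately."""
    groups = defaultdict(list)
    for read in reads:
        groups[classify_read(read)].append(read)
    return dict(groups)


if __name__ == "__main__":
    reads = ["TTTTTTTT", "TTACGTTT", "GGCAACGT"]
    for level, group in sorted(partition_reads(reads).items()):
        print(level, ENVIRONMENT_BY_LEVEL[level], group)
```

Taking the maximum level over all k-mers in a read is a conservative choice: a read that overlaps several variations is handled at the level of the most sensitive one.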

Updated: 2020-03-01