Identity inference of genomic data using long-range familial searches
Science (IF 44.7), Pub Date: 2018-10-11, DOI: 10.1126/science.aau4832
Yaniv Erlich 1, 2, 3, 4 , Tal Shor 1 , Itsik Pe'er 2, 3 , Shai Carmi 5

Detecting familial matches

Recent advances in DNA technology and companies that provide array-based testing have led to services that collect, share, and analyze volunteered genomic information. Privacy concerns have been raised, especially in light of the use of these services by law enforcement to identify suspects in criminal cases. Testing models of relatedness, Erlich et al. show that many individuals of European ancestry in the United States, even those that have not undergone genetic testing, can be identified on the basis of available genetic information. These results indicate a need for procedures to help maintain genetic privacy for individuals.

Science, this issue p. 690

Genetic privacy is difficult to maintain in light of forensic searches of genetic genealogical databases.

Consumer genomics databases have reached the scale of millions of individuals. Recently, law enforcement authorities have exploited some of these databases to identify suspects via distant familial relatives. Using genomic data of 1.28 million individuals tested with consumer genomics, we investigated the power of this technique. We project that about 60% of the searches for individuals of European descent will result in a third-cousin or closer match, which theoretically allows their identification using demographic identifiers. Moreover, the technique could implicate nearly any U.S. individual of European descent in the near future. We demonstrate that the technique can also identify research participants of a public sequencing project. On the basis of these results, we propose a potential mitigation strategy and policy implications for human subject research.

Updated: 2018-10-11