Aberrant splicing prediction across human tissues
Nature Genetics (IF 30.8), Pub Date: 2023-05-04, DOI: 10.1038/s41588-023-01373-3
Nils Wagner 1,2, Muhammed H Çelik 1,3, Florian R Hölzlwimmer 1, Christian Mertes 1,4, Holger Prokisch 5,6, Vicente A Yépez 1, Julien Gagneur 1,2,5,6

Aberrant splicing is a major cause of genetic disorders, but its direct detection in transcriptomes is limited to clinically accessible tissues such as skin or body fluids. While DNA-based machine learning models can prioritize rare variants likely to affect splicing, their performance in predicting tissue-specific aberrant splicing remains unassessed. Here we generated an aberrant splicing benchmark dataset spanning over 8.8 million rare variants in 49 human tissues from the Genotype-Tissue Expression (GTEx) dataset. At 20% recall, state-of-the-art DNA-based models achieve a maximum precision of 12%. By mapping and quantifying tissue-specific splice site usage transcriptome-wide and modeling isoform competition, we increased precision threefold at the same recall. Integrating RNA-sequencing data from clinically accessible tissues into our model, AbSplice, brought precision to 60%. These results, replicated in two independent cohorts, substantially contribute to the identification of noncoding loss-of-function variants and to the design and analytics of genetic diagnostics.
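
The abstract reports precision at a fixed recall of 20%. As a minimal sketch of how such a figure can be read off a ranked list of predictions, the Python below computes precision at the first score cutoff that reaches a target recall. This is not the authors' evaluation code: the function name `precision_at_recall`, the toy scores, and the toy labels are all hypothetical, and ties between scores are not treated specially.

```python
from typing import Sequence


def precision_at_recall(
    scores: Sequence[float], labels: Sequence[int], target_recall: float
) -> float:
    """Precision at the first rank where recall reaches target_recall.

    Hypothetical helper for illustration: labels are 1 for a true
    aberrant splicing event and 0 otherwise; higher scores mean the
    model is more confident the variant causes aberrant splicing.
    """
    if not 0.0 < target_recall <= 1.0:
        raise ValueError("target_recall must be in (0, 1]")
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank predictions from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_pos = 0
    for rank, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        # Recall = fraction of all positives recovered so far.
        if true_pos / total_pos >= target_recall:
            # Precision = fraction of the top `rank` calls that are positives.
            return true_pos / rank
    # Unreachable for target_recall <= 1, kept for completeness.
    return total_pos / len(ranked)


# Toy data (made up, not from the benchmark): 10 variants, 5 true events.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 0, 1, 0, 1, 0, 0, 1, 1]

p20 = precision_at_recall(toy_scores, toy_labels, 0.2)
p40 = precision_at_recall(toy_scores, toy_labels, 0.4)
```

Comparing models at the same recall, as the abstract does, fixes how many true events must be recovered and asks how many false calls each model makes to get there. On the reported numbers, a threefold gain over 12% precision corresponds to roughly 36% precision at 20% recall.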




Updated: 2023-05-05