Transcriptomic signatures across human tissues identify functional rare genetic variation
Science (IF 44.7), Pub Date: 2020-09-10, DOI: 10.1126/science.aaz5900
Nicole M Ferraro 1 , Benjamin J Strober 2 , Jonah Einson 3, 4 , Nathan S Abell 5 , Francois Aguet 6 , Alvaro N Barbeira 7 , Margot Brandt 4, 8 , Maja Bucan 9 , Stephane E Castel 4, 8 , Joe R Davis 10 , Emily Greenwald 5 , Gaelen T Hess 5 , Austin T Hilliard 11 , Rachel L Kember 9 , Bence Kotis 12 , YoSon Park 13 , Gina Peloso 14 , Shweta Ramdas 9 , Alexandra J Scott 15 , Craig Smail 1 , Emily K Tsang 10 , Seyedeh M Zekavat 16 , Marcello Ziosi 4 , Aradhana 5 , Kristin G Ardlie 6 , Themistocles L Assimes 11, 17 , Michael C Bassik 5 , Christopher D Brown 9 , Adolfo Correa 18 , Ira Hall 15 , Hae Kyung Im 7 , Xin Li 10, 19 , Pradeep Natarajan 20, 21, 22 , Tuuli Lappalainen 4, 8 , Pejman Mohammadi 4, 12, 23 , Stephen B Montgomery 5, 10 , Alexis Battle 2, 24

Outliers in the human transcriptome reveal the functional effects of rare genetic variants.
Every human genome contains tens of thousands of rare genetic variants, including single nucleotide changes, insertions or deletions, and larger structural variants, and some may have a functional effect. Ferraro et al. examined data from individuals in the Genotype-Tissue Expression (GTEx) project for outliers in gene expression, splicing, and allele-specific expression across tissues. Single rare variants were observed that affected the expression and allele-specific expression of multiple genes and, in the case of a gene fusion event, splicing. Experimental and computational validation suggest that many individuals carry more than 50 rare variants that affect transcription in some way. Although most variants were predicted not to affect an individual's phenotype, a small percentage showed likely disease-related associations, emphasizing the importance of studying the impact of rare genetic variation on the transcriptome.
Science, this issue p. eaaz5900
INTRODUCTION The human genome contains tens of thousands of rare (minor allele frequency <1%) variants, some of which contribute to disease risk. Using 838 samples with whole-genome and multitissue transcriptome sequencing data in the Genotype-Tissue Expression (GTEx) project version 8, we assessed how rare genetic variants contribute to extreme patterns in gene expression (eOutliers), allelic expression (aseOutliers), and alternative splicing (sOutliers). We integrated these three signals across 49 tissues with genomic annotations to prioritize high-impact rare variants (RVs) that associate with human traits.
RATIONALE Outlier gene expression aids in identifying functional RVs. Transcriptome sequencing provides diverse measurements beyond gene expression, including allele-specific expression and alternative splicing, which can provide additional insight into RV functional effects.
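As an illustration of what calling a multitissue expression outlier can look like, the following is a minimal sketch, not the authors' pipeline. It assumes that expression for one gene is z-scored across individuals within each tissue and that an individual is flagged when the absolute median z-score across tissues exceeds a threshold. The function names, the threshold of 3, and the minimum tissue count are hypothetical choices made for this example.

```python
from statistics import mean, median, stdev


def tissue_z_scores(expr_by_individual):
    """Z-score one gene's expression across individuals within a single tissue."""
    values = list(expr_by_individual.values())
    mu, sd = mean(values), stdev(values)
    if sd == 0:
        return {}  # no variation in this tissue, so nothing can be an outlier
    return {ind: (x - mu) / sd for ind, x in expr_by_individual.items()}


def call_multitissue_outliers(expr, z_threshold=3.0, min_tissues=2):
    """Flag individuals whose median z-score across tissues is extreme.

    expr: {tissue: {individual: expression}} for a single gene.
    z_threshold and min_tissues are illustrative assumptions.
    Returns {individual: median_z} for flagged individuals only.
    """
    z_by_individual = {}
    for by_individual in expr.values():
        for ind, z in tissue_z_scores(by_individual).items():
            z_by_individual.setdefault(ind, []).append(z)

    outliers = {}
    for ind, zs in z_by_individual.items():
        if len(zs) < min_tissues:
            continue  # too few tissues measured to call a multitissue outlier
        med = median(zs)
        if abs(med) > z_threshold:
            outliers[ind] = med
    return outliers


# Synthetic data: 19 individuals near 10 units, one individual at 30 units,
# repeated across three hypothetical tissues.
tissue = {f"ind{i}": 10 + (i % 3 - 1) * 0.1 for i in range(19)}
tissue["ind19"] = 30.0
synthetic_expr = {"tissue_a": dict(tissue), "tissue_b": dict(tissue), "tissue_c": dict(tissue)}
flagged = call_multitissue_outliers(synthetic_expr)
```

Taking the median across tissues means a single noisy tissue cannot produce a call on its own, which is the intuition behind combining signals from many tissues.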
RESULTS After identifying multitissue eOutliers, aseOutliers, and sOutliers, we found that outlier individuals of each type were significantly more likely to carry an RV near the corresponding gene. Among eOutliers, we observed strong enrichment of rare structural variants. sOutliers were particularly enriched for RVs that disrupted or created a splicing consensus sequence. aseOutliers provided the strongest enrichment signal when evaluated from just a single tissue. We developed Watershed, a probabilistic model for personal genome interpretation that improves over standard genomic annotation-based methods for scoring RVs by integrating these three transcriptomic signals from the same individual, and that replicates in an independent cohort. To assess whether outlier RVs identified in GTEx associate with traits, we evaluated these variants for association with diverse traits in the UK Biobank, the Million Veterans Program, and the Jackson Heart Study. We found that transcriptome-assisted prioritization identified RVs with larger trait effect sizes and was a better predictor of effect size than genomic annotation alone.
CONCLUSION With >800 genomes matched with transcriptomes across 49 tissues, we were able to study RVs that underlie extreme changes in the transcriptome. To capture the diversity of these extreme changes, we developed and integrated approaches to identify expression, allele-specific expression, and alternative splicing outliers, and characterized the RV landscape underlying each outlier signal. We demonstrate that personal genome interpretation and RV discovery are enhanced by using these signals. This approach provides a new means to integrate a richer set of functional RVs into models of genetic burden, improve disease gene identification, and enable the delivery of precision genomics.
Transcriptomic signatures identify functional rare genetic variation.
We identified genes in individuals that show outlier expression, allele-specific expression, or alternative splicing and assessed enrichment of nearby rare variation. We integrated these three outlier signals with genomic annotation data to prioritize functional RVs and to intersect those variants with disease loci to identify potential RV trait associations.
Rare genetic variants are abundant across the human genome, and identifying their function and phenotypic impact is a major challenge. Measuring aberrant gene expression has aided in identifying functional, large-effect rare variants (RVs). Here, we expanded detection of genetically driven transcriptome abnormalities by analyzing gene expression, allele-specific expression, and alternative splicing from multitissue RNA-sequencing data, and demonstrate that each signal informs unique classes of RVs. We developed Watershed, a probabilistic model that integrates multiple genomic and transcriptomic signals to predict variant function, validated these predictions in additional cohorts and through experimental assays, and used them to assess RVs in the UK Biobank, the Million Veterans Program, and the Jackson Heart Study. Our results link thousands of RVs to diverse molecular effects and provide evidence to associate RVs affecting the transcriptome with human traits.
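The enrichment of nearby rare variants among outlier individuals can be summarized as a relative risk: the proportion of outliers carrying a nearby RV divided by the same proportion among non-outliers. The sketch below assumes that statistic with a Wald confidence interval on the log scale. The function name is hypothetical and the counts are invented for illustration; they are not results from the study.

```python
import math


def relative_risk(outlier_with_rv, outlier_total, control_with_rv, control_total):
    """Relative risk of carrying a nearby rare variant, outliers vs. non-outliers.

    Returns (rr, (lower, upper)), where the interval is an approximate 95%
    Wald interval computed on the log scale. All counts must be positive.
    """
    p_outlier = outlier_with_rv / outlier_total
    p_control = control_with_rv / control_total
    rr = p_outlier / p_control
    # Standard error of log(RR) for two independent binomial proportions.
    se = math.sqrt(
        1 / outlier_with_rv - 1 / outlier_total
        + 1 / control_with_rv - 1 / control_total
    )
    lower = math.exp(math.log(rr) - 1.96 * se)
    upper = math.exp(math.log(rr) + 1.96 * se)
    return rr, (lower, upper)


# Invented counts: 30 of 100 outliers and 100 of 1000 non-outliers carry a nearby RV.
rr, (ci_low, ci_high) = relative_risk(30, 100, 100, 1000)
```

An interval whose lower bound stays above 1 indicates that outliers carry nearby RVs more often than non-outliers under these assumptions.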

Updated: 2020-09-10