The motif composition of variable number tandem repeats impacts gene expression
Genome Research (IF 7), Pub Date: 2023-04-01, DOI: 10.1101/gr.276768.122
Tsung-Yu Lu 1 , Paulina N Smaruj 1 , Geoffrey Fudenberg 1 , Nicholas Mancuso 1, 2 , Mark J P Chaisson 3, 4

Understanding the impact of DNA variation on human traits is a fundamental question in human genetics. Variable number tandem repeats (VNTRs) make up ∼3% of the human genome but are often excluded from association analysis owing to poor read mappability or divergent repeat content. Although methods exist to estimate VNTR length from short-read data, it is known that VNTRs vary in both length and repeat (motif) composition. Here, we use a repeat-pangenome graph (RPGG) constructed on 35 haplotype-resolved assemblies to detect variation in both VNTR length and repeat composition. We align population-scale data from the Genotype-Tissue Expression (GTEx) Consortium to examine how variations in sequence composition may be linked to expression, including cases independent of overall VNTR length. We find that 9422 out of 39,125 VNTRs are associated with nearby gene expression through motif variations, of which only 23.4% are accessible from length. Fine-mapping identifies 174 genes to be likely driven by variation in certain VNTR motifs and not overall length. We highlight two genes, CACNA1C and RNF213, that have expression associated with motif variation, showing the utility of RPGG analysis as a new approach for trait association in multiallelic and highly variable loci.
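To put the reported counts in proportion, the short sketch below works through the arithmetic. Only the three input figures come from the abstract; the derived counts are approximations obtained by rounding, not values reported by the authors, and the variable names are purely illustrative.

```python
# Figures quoted in the abstract.
total_vntrs = 39_125             # VNTRs tested
motif_associated = 9_422         # VNTRs associated with expression via motif variation
length_accessible_share = 0.234  # share of those also accessible from length

# Fraction of all tested VNTRs with a motif-based association.
share_associated = motif_associated / total_vntrs

# Approximate split (an assumption: the percentage is applied to the
# motif-associated count and rounded to the nearest whole VNTR).
approx_length_accessible = round(motif_associated * length_accessible_share)
approx_motif_only = motif_associated - approx_length_accessible

print(f"Motif-associated share of all VNTRs: {share_associated:.1%}")
print(f"Approx. also accessible from length: {approx_length_accessible}")
print(f"Approx. detectable only via motifs:  {approx_motif_only}")
```

Under these assumptions, roughly a quarter of the tested VNTRs show a motif-based association, and about three quarters of those would not be found by a length-only analysis.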

Updated: 2023-04-01