Comparative analysis on the expression of L1 loci using various RNA-Seq preparations.
Mobile DNA ( IF 4.7 ) Pub Date : 2020-01-06 , DOI: 10.1186/s13100-019-0194-z
Tiffany Kaul 1 , Maria E Morales 1 , Alton O Sartor 1, 2 , Victoria P Belancio 1, 3 , Prescott Deininger 1, 4

Retrotransposons are one of the oldest evolutionary forces shaping mammalian genomes, with the ability to mobilize from one genomic location to another. This mobilization is also a significant factor in human disease. The only autonomous human retroelement, L1, has propagated to make up 17% of the human genome, accumulating over 500,000 copies. The majority of these loci are truncated or defective, with only a few reported to remain capable of retrotransposition. We have previously published a strand-specific RNA-Seq bioinformatics approach that uses cytoplasmic RNA to stringently identify, at the locus-specific level, the few expressed full-length L1s. With growing repositories of RNA-Seq data, there is potential to mine these datasets to identify and study expressed L1s at single-locus resolution, although many datasets are not strand-specific or were not generated from cytoplasmic RNA. We developed whole-cell, cytoplasmic and nuclear RNA-Seq datasets from 22Rv1 prostate cancer cells to test the influence of different preparations on the quality of L1 expression measurements and the effort needed to obtain them. We found that there was minimal data loss in the identification of expressed full-length L1s using whole-cell, strand-specific RNA-Seq data compared to cytoplasmic, strand-specific RNA-Seq data. However, this was only possible with an increased amount of manual curation of the bioinformatics output to eliminate increased background. About half of the data was lost when the sequenced datasets were non-strand-specific. The results of these studies demonstrate that, with rigorous manual curation, the utilization of stranded RNA-Seq datasets allows identification of expressed L1 loci from either cytoplasmic or whole-cell RNA-Seq datasets.

Updated: 2020-01-06