Tools and best practices for retrotransposon analysis using high-throughput sequencing data
Mobile DNA (IF 4.7), Pub Date: 2019-12-29, DOI: 10.1186/s13100-019-0192-1
Aurélie Teissandier 1, 2, 3, 4 , Nicolas Servant 1, 2, 3 , Emmanuel Barillot 1, 2, 3 , Deborah Bourc'his 1, 4

Sequencing technologies give access to a precise picture of the molecular mechanisms that regulate the genome. One of the biggest technical challenges with sequencing data is mapping millions of reads to a reference genome. This problem is exacerbated for repetitive sequences such as transposable elements, which occupy half of the mammalian genome. Reads sequenced from these regions introduce ambiguities in the mapping step, so dedicated parameters and algorithms must be considered when transposable element regulation is investigated with sequencing datasets. Here, we used simulated reads on the mouse and human genomes to define the best parameters for aligning transposable element-derived reads to a reference genome. We compared the efficiency of the most commonly used aligners and further evaluated how transposable element representation should be estimated using available methods. We calculated the mappability of the different transposon families in the mouse and human genomes, which gives an overview of their evolution. Based on simulated data, we provide recommendations on the alignment and quantification steps to perform when transposon expression or regulation is studied, and identify the limits in detecting specific young transposon families of the mouse and human genomes. These principles may help the community to adopt standard procedures and raise awareness of the difficulties encountered in the study of transposable elements.

Updated: 2019-12-29