Tracing foreign sequences in plant transcriptomes and genomes using OCT4, a POU domain protein
Molecular Genetics and Genomics (IF 2.3), Pub Date: 2021-03-18, DOI: 10.1007/s00438-021-01768-z
Adeleh Saffar, Maryam M. Matin

Contamination in sequencing data, especially in reference genomes, leads to inevitable errors in downstream analyses. Similarly, the presence of contaminants in transcriptomes misrepresents the molecular basis of various interactions. In this study, we report a large number of plant transcriptomes contaminated with RNAs encoding POU domain proteins, a family of proteins that has not been reported in plants or fungi. In addition, our findings show that the reference genome of Rhodamnia argentea contains four POU domain protein-coding sequences. These foreign fragments turned out to be related to arthropods that are considered plant pests. We also identified two contaminated draft genomes, Humulus lupulus and Cannabis sativa, that contain complete rDNA sequences originating from Tetranychus species. We therefore recommend carefully screening sequencing data before releasing them in public databases, and checking existing genomes for possible contamination.




Updated: 2021-03-19