Genome re-sequencing and reannotation of the Escherichia coli ER2566 strain and transcriptome sequencing under overexpression conditions.
BMC Genomics (IF 3.5), Pub Date: 2020-06-16, DOI: 10.1186/s12864-020-06818-1
Lizhi Zhou 1 , Hai Yu 1 , Kaihang Wang 2 , Tingting Chen 2 , Yue Ma 2 , Yang Huang 2 , Jiajia Li 2 , Liqin Liu 2 , Yuqian Li 2 , Zhibo Kong 2 , Qingbing Zheng 1 , Yingbin Wang 1 , Ying Gu 1, 2 , Ningshao Xia 1, 2 , Shaowei Li 1, 2

The Escherichia coli ER2566 strain (NC_CP014268.2) was developed as a BL21 (DE3) derivative strain and has been widely used in recombinant protein expression. However, like many other current RefSeq annotations, the annotation of the ER2566 strain is incomplete, with missing gene names and miscellaneous RNAs, as well as uncorrected annotations of some pseudogenes. Here, we performed a systematic reannotation of the ER2566 genome by combining multiple annotation tools with manual revision to provide a comprehensive understanding of the E. coli ER2566 strain, and used high-throughput sequencing to explore how the strain adapts under external pressure.

The reannotation included noteworthy corrections to all protein-coding genes, led to the exclusion of 190 hypothetical genes or pseudogenes, and resulted in the addition of 237 coding sequences, 230 miscellaneous noncoding RNAs, and 2 tRNAs. In addition, we manually examined all 194 pseudogenes in the RefSeq annotation and directly identified 123 (63%) as coding genes. We then used whole-genome sequencing and high-throughput RNA sequencing to assess mutational adaptations under consecutive subculture or overexpression burden. Whereas no mutations were detected in response to consecutive subculture, overexpression of the human papillomavirus type 16 capsid led to the identification of a mutation (position 1,094,824 within the 3′ non-coding region) positioned 19 bp away from the lacI gene in the transcribed RNA, which was not detected at the genomic level by Sanger sequencing.

The ER2566 strain is used by both the general scientific community and the biotechnology industry. Reannotation of the E. coli ER2566 strain not only improves the RefSeq data but also uncovers a key site that might be involved in the transcription and translation of genes encoding the lactose operon repressor. We propose that our pipeline might offer a universal method for the reannotation of other bacterial genomes with high speed and accuracy. This study might facilitate a better understanding of gene function for the ER2566 strain under external burden and provide more clues for engineering bacteria for biotechnological applications.

Updated: 2020-06-16