Identification of the conjugative and mobilizable plasmid fragments in the plasmidome using sequence signatures
Microbial Genomics ( IF 3.9 ) Pub Date : 2020-11-01 , DOI: 10.1099/mgen.0.000459
Zhencheng Fang 1, 2 , Hongwei Zhou 1, 3

Plasmids are key elements of horizontal gene transfer in microbial communities. Recently, many experimental and computational methods have been developed to obtain the plasmidomes of microbial communities. Distinguishing transmissible plasmid sequences, which derive from conjugative or at least mobilizable plasmids, from non-transmissible plasmid sequences in the plasmidome is essential for understanding the diversity of plasmids and how they regulate the microbial community. Unfortunately, because DNA sequences in the plasmidome are highly fragmented, effective identification methods are lacking. In this work, we used information entropy, a measure from information theory, to assess the randomness of synonymous codon usage across 4424 plasmid genomes. The results showed that, for all amino acids, the choice of synonymous codon is more random in conjugative and mobilizable plasmids than in non-transmissible plasmids, indicating that transmissible plasmids have sequence signatures different from those of non-transmissible plasmids. Inspired by this phenomenon, we developed a novel algorithm named PlasTrans. PlasTrans takes the triplet code sequences and base sequences of plasmid DNA fragments as input and uses a convolutional neural network, a deep learning technique, to extract more complex signatures of the plasmid sequences and identify the conjugative and mobilizable DNA fragments. Tests showed that PlasTrans could achieve an AUC of 84–91%, even when the fragments contained only hundreds of base pairs. To the best of our knowledge, this is the first quantitative analysis of the difference in sequence signatures between transmissible and non-transmissible plasmids, and PlasTrans is the first tool to perform transferability annotation for DNA fragments in the plasmidome. We expect that PlasTrans will be a useful tool for researchers who analyse the properties of novel plasmids in the microbial community and horizontal gene transfer, especially the spread of resistance genes and virulence factors associated with plasmids. PlasTrans is freely available at https://github.com/zhenchengfang/PlasTrans
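
The entropy measure mentioned in the abstract can be illustrated with a short sketch. The abstract does not give the authors' exact procedure, so the following is only one plausible reading: for each amino acid, the Shannon entropy of the observed frequencies of its synonymous codons, pooled over a set of in-frame coding sequences. The function name `synonymous_codon_entropy`, the use of bits (log base 2), and the choice to pool counts across sequences are assumptions made for illustration and are not taken from the paper or from the PlasTrans code.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code, written in the usual TCAG ordering of the three bases.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def synonymous_codon_entropy(cds_sequences):
    """Hypothetical helper: map each amino acid to the Shannon entropy
    (in bits) of its synonymous codon usage.

    Assumes every input string is a coding sequence already in frame.
    Stop codons and codons with ambiguous bases are skipped.
    """
    counts = defaultdict(Counter)
    for seq in cds_sequences:
        seq = seq.upper()
        # Walk the sequence codon by codon, ignoring a trailing partial codon.
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            amino_acid = CODON_TABLE.get(codon)
            if amino_acid is None or amino_acid == "*":
                continue
            counts[amino_acid][codon] += 1

    entropy = {}
    for amino_acid, codon_counts in counts.items():
        total = sum(codon_counts.values())
        entropy[amino_acid] = sum(
            -(n / total) * math.log2(n / total) for n in codon_counts.values()
        )
    return entropy
```

Under this definition, an amino acid encoded by a single codon throughout has entropy 0, and one whose four synonymous codons are used equally often has entropy 2 bits. Higher values correspond to the "more random" codon choice that the abstract reports for conjugative and mobilizable plasmids.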

Updated: 2020-12-01