Classification of the plant-associated lifestyle of Pseudomonas strains using genome properties and machine learning
Scientific Reports (IF 4.6), Pub Date: 2022-06-27, DOI: 10.1038/s41598-022-14913-4
Wasin Poncheewin 1, Anne D van Diepeningen 2, Theo A J van der Lee 2, Maria Suarez-Diez 1, Peter J Schaap 1, 3

The rhizosphere, the region of soil surrounding the roots of plants, is colonized by a unique population of Plant Growth Promoting Rhizobacteria (PGPR). Many important PGPR, as well as plant pathogens, belong to the genus Pseudomonas. The divide between beneficial and pathogenic strains is, however, uncertain, because genomic features previously thought to be indicative have limited power to separate these strains. Here we used the Genome Properties (GP) system for annotating common biological pathways, together with Machine Learning (ML), to establish the relationship between genome-wide GP composition and the plant-associated lifestyle of 91 Pseudomonas strains isolated from the rhizosphere and the phyllosphere, representing both plant-associated phenotypes. GP enrichment analysis, Random Forest model fitting and feature selection revealed 28 discriminating features. A test set of 75 new strains confirmed the importance of the selected features for classification. The results suggest that GP annotations provide a promising computational tool to better classify the plant-associated lifestyle.
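To make the idea of a GP enrichment analysis concrete, the following is a minimal sketch of how one might test whether a genome property is over-represented in one lifestyle class. It assumes a one-sided Fisher's exact test on presence/absence counts; the abstract does not say which statistical test the authors used. The strain data, the labels `beneficial` and `pathogenic`, and the property names `GP_A` and `GP_B` are made-up placeholders, not values from the paper.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table
        [[a, b],   # property present: target class, other class
         [c, d]]   # property absent:  target class, other class
    Returns P(X >= a) under the hypergeometric null hypothesis.
    """
    n_target = a + c            # strains in the target class
    present = a + b             # strains carrying the property
    total = a + b + c + d
    upper = min(present, n_target)
    tail = sum(
        comb(present, k) * comb(total - present, n_target - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, n_target)


def enrichment_by_property(strains, target):
    """Compute an enrichment p-value per property for the target class.

    `strains` is a list of (lifestyle_label, set_of_properties) pairs.
    """
    all_props = set().union(*(props for _, props in strains))
    result = {}
    for prop in sorted(all_props):
        a = sum(1 for lab, ps in strains if lab == target and prop in ps)
        b = sum(1 for lab, ps in strains if lab != target and prop in ps)
        c = sum(1 for lab, ps in strains if lab == target and prop not in ps)
        d = sum(1 for lab, ps in strains if lab != target and prop not in ps)
        result[prop] = fisher_enrichment_p(a, b, c, d)
    return result


# Hypothetical toy data: six strains, two placeholder properties.
toy_strains = [
    ("beneficial", {"GP_A", "GP_B"}),
    ("beneficial", {"GP_A"}),
    ("beneficial", {"GP_A", "GP_B"}),
    ("pathogenic", {"GP_B"}),
    ("pathogenic", {"GP_B"}),
    ("pathogenic", set()),
]

p_values = enrichment_by_property(toy_strains, target="beneficial")
for prop, p in p_values.items():
    print(prop, round(p, 3))
```

In this toy table `GP_A` occurs only in the `beneficial` strains and gets the smaller p-value, while `GP_B` is spread across both classes. In a workflow like the one the abstract describes, properties flagged this way could then be passed to a classifier such as a Random Forest for model fitting and feature selection; with many properties tested, a multiple-testing correction would also be needed.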




Updated: 2022-06-28