ClassGraph: Improving Metagenomic Read Classification with Overlap Graphs.
Journal of Computational Biology (IF 1.7), Pub Date: 2023-04-06, DOI: 10.1089/cmb.2022.0208
Margherita Cavattoni, Matteo Comin

Current technologies allow the sequencing of microbial communities directly from the environment without prior culturing. One of the major problems when analyzing a microbial sample is to taxonomically annotate its reads to identify the species it contains. Most methods that are currently available focus on the classification of reads using a set of reference genomes and their k-mers. While in terms of precision these methods have reached percentages of correctness close to perfection, in terms of sensitivity (the actual number of classified reads), the performance is often poor. One reason is that the reads in a sample can be very different from the corresponding reference genomes; for example, viral genomes are usually highly mutated. To address this issue, in this article, we propose ClassGraph, a new taxonomic classification method that makes use of the read overlap graph and applies a label propagation algorithm to refine the results of existing tools. We evaluated its performance on simulated and real datasets with several taxonomic classification tools, and the results showed an improved sensitivity and F-measure, while maintaining high precision. ClassGraph is capable of improving the classification accuracy, especially in difficult cases such as virus and real datasets, where traditional tools can classify <40% of reads.
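
The abstract does not spell out how labels are propagated, so the following is only a minimal sketch of the general idea, not ClassGraph's actual implementation. It assumes an unweighted read overlap graph given as pairs of read IDs, keeps the labels assigned by the base classifier fixed, and gives each unclassified read the strict majority label of its already-labeled neighbors. The function name `propagate_labels`, the `max_iters` parameter, and the tie-handling rule are all illustrative assumptions.

```python
from collections import Counter, defaultdict


def propagate_labels(edges, labels, max_iters=10):
    """Spread taxon labels over a read overlap graph (illustrative sketch).

    edges:  iterable of (read_a, read_b) pairs, one per detected overlap
    labels: dict mapping read ID -> taxon for reads the base tool classified
    Returns a new dict that also contains labels inferred for other reads.
    """
    # Build an undirected adjacency list from the overlap pairs.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    result = dict(labels)
    for _ in range(max_iters):
        updates = {}
        for read in sorted(neighbors):
            if read in result:
                continue  # assumption: already-assigned labels are never changed
            votes = Counter(result[n] for n in neighbors[read] if n in result)
            if not votes:
                continue  # no labeled neighbor yet
            (top, top_count), *rest = votes.most_common(2)
            if rest and rest[0][1] == top_count:
                continue  # assumption: a tie leaves the read unclassified
            updates[read] = top
        if not updates:
            break  # converged: nothing new was labeled in this round
        result.update(updates)
    return result


# Hypothetical toy input: a chain of overlapping reads plus an isolated pair.
toy_edges = [("r1", "r2"), ("r2", "r3"), ("r3", "r4"), ("r5", "r6")]
toy_labels = {"r1": "taxA"}
toy_result = propagate_labels(toy_edges, toy_labels)
```

In this toy input the label of `r1` reaches `r2`, `r3`, and `r4` through successive overlaps, while `r5` and `r6` stay unclassified because no labeled read overlaps them. This mirrors the trade-off the abstract describes: sensitivity rises for reads connected to a confident classification, and precision is protected by declining to guess elsewhere.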

Updated: 2023-04-06