ReClustOR: a re‐clustering tool using an open‐reference method that improves operational taxonomic unit definition
Methods in Ecology and Evolution (IF 6.3), Pub Date: 2019-11-01, DOI: 10.1111/2041-210x.13316
Sébastien Terrat, Christophe Djemiel, Corentin Journay, Battle Karimi, Samuel Dequiedt, Walid Horrigue, Pierre‐Alain Maron, Nicolas Chemidlin Prévost‐Bouré, Lionel Ranjard

  1. Environmental microbial communities are now widely studied using metabarcoding approaches, thanks to the democratization of high‐throughput DNA sequencing technologies. The massive number of reads produced by these technologies requires dedicated bioinformatic processing. A key step in the analysis is to cluster reads into Operational Taxonomic Units (OTUs) and thus reduce the amount of data for downstream analyses. Because the clustering method strongly affects the quantity and quality of OTUs, balancing the reliability of the chosen strategy against its computational cost is a real challenge. The present article proposes a new post‐clustering tool called ReClustOR, aimed at improving the stability and reliability of OTUs regardless of the initial clustering method.
  2. We compared several clustering methods: an in‐house de novo method, VSEARCH, Swarm, ReClustOR applied on top of each of these three clustering methods, and the exact sequence variant (ESV) definition, using two datasets (a simulated one and an environmental one). All methods were analysed for their ability to efficiently describe microbial diversity in terms of alpha‐diversity, beta‐diversity and phylogeny.
  3. Dataset analysis showed that post‐clustering with ReClustOR improved OTU detection not only in terms of diversity, but also in terms of reliability and stability, compared with the initial clustering methods. More precisely, the post‐clustering step improved the congruence of the results (alpha‐diversity, beta‐diversity, composition) regardless of the initial clustering method. Moreover, by defining a database of centroids, ReClustOR removes the need to re‐cluster all the reads each time new reads are generated.
  4. ReClustOR is a new post‐clustering method that overcomes problems (OTU stability and reliability) associated with classical clustering methods and thereby increases the quality and the congruence of the reconstructed OTUs. Moreover, the OTU database defined with ReClustOR can be used as a reference gradually enriched by merging new studies and samples. In this way, huge datasets (e.g. the Earth Microbiome Project or the Tara Oceans project) can be used as references for other projects within their range of application, and increase the quality of comparisons among studies and datasets.
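
The open‐reference idea mentioned in points 3 and 4 (assign new reads to an existing database of centroids, and let unmatched reads seed new centroids) can be illustrated with a minimal sketch. This is not ReClustOR's implementation: the function names, the 0.97 threshold, and the use of `difflib.SequenceMatcher` as a stand‐in for a real alignment identity are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def identity(a: str, b: str) -> float:
    """Rough similarity in [0, 1]; an assumed stand-in for alignment identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def assign_open_reference(reads, centroids, threshold=0.97):
    """Assign each read to its most similar centroid, or seed a new centroid.

    `centroids` is a list of representative sequences (the reference
    database). It is extended in place whenever a read matches no existing
    centroid at `threshold`. Returns one centroid index per read.
    The threshold value is an illustrative assumption.
    """
    assignments = []
    for read in reads:
        best_idx, best_sim = -1, 0.0
        # Exhaustive search; real tools use indexed or heuristic search.
        for idx, centroid in enumerate(centroids):
            sim = identity(read, centroid)
            if sim > best_sim:
                best_idx, best_sim = idx, sim
        if best_idx >= 0 and best_sim >= threshold:
            assignments.append(best_idx)
        else:
            # Open-reference step: the unmatched read becomes a new centroid.
            centroids.append(read)
            assignments.append(len(centroids) - 1)
    return assignments
```

Because the centroid list persists between calls, a later batch of reads can be assigned against it without re‐clustering the earlier reads, which is the property the abstract highlights. The sketch compares every read with every centroid, so it only suits small examples.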


Updated: 2019-11-01