A Roadmap to Sequence Assembly Evaluation Tools
Current Bioinformatics (IF 4), Pub Date: 2021-05-31, DOI: 10.2174/1574893615999201111140419
Sara El-Metwally, Eslam Hamouda, Mayada Tarek

Assembly evaluation is the first step towards meaningful downstream data analysis: we need to know how much accurate information an assembled sequence contains before proceeding to any analysis stage. Assembly evaluation tools target four basic metrics: contiguity, accuracy, completeness, and contamination. Some tools evaluate these metrics by comparing the assembly results to a closely related reference. Others rely on heuristics to compensate for the missing reference, such as the consistency between the assembly results and the sequencing reads. In this paper, we discuss assembly evaluation as a core stage in any sequence assembly pipeline and present a roadmap that most assembly evaluation tools follow to assess the different metrics. We highlight the challenges that currently exist in assembly evaluation tools and summarize their technical and practical details to help end-users choose the best tool for their working scenarios. To illustrate the similarities and differences among assembly assessment tools, including their evaluation approaches, metrics, comprehensiveness, limitations, usability, and how the results are presented to the end-user, we provide a practical example that evaluates Velvet assembly results for the S. aureus dataset from the GAGE competition. A GitHub repository (https://github.com/SaraEl-Metwally/Assembly-Evaluation-Tools) provides the detailed evaluation results along with the command-line parameters used to generate them.
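As a rough illustration of the contiguity metric mentioned above, the sketch below computes N50, a widely used contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The abstract does not name N50 specifically, so treating it as the example metric is an assumption; the function name and the contig lengths are hypothetical and are not taken from the paper or the GAGE dataset.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    # Walk through contigs from longest to shortest.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to at least
        # half of the total assembly length defines N50.
        if running >= half_total:
            return length
    return 0


# Hypothetical contig lengths, for illustration only.
example_lengths = [100, 80, 50, 30, 20, 10, 10]
print(n50(example_lengths))
```

A statistic like this captures contiguity only; it says nothing about accuracy, completeness, or contamination, which is why evaluation tools report several metrics together.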




Updated: 2021-05-31