PACVr: plastome assembly coverage visualization in R.
BMC Bioinformatics ( IF 2.9 ) Pub Date : 2020-05-24 , DOI: 10.1186/s12859-020-3475-0
Michael Gruenstaeudl 1 , Nils Jenke 2

BACKGROUND Plastid genomes typically display a circular, quadripartite structure with two inverted repeat regions, which challenges automatic assembly procedures. The correct assembly of plastid genomes is a prerequisite for the validity of subsequent analyses on genome structure and evolution. The average coverage depth of a genome assembly is often used as an indicator of assembly quality. Visualizing coverage depth across a draft genome is a critical step, which allows users to inspect the quality of the assembly and, where applicable, identify regions of reduced assembly confidence. Despite the interplay between genome structure and assembly quality, no contemporary, user-friendly software tool can visualize the coverage depth of a plastid genome assembly while taking its quadripartite genome structure into account. A software tool is needed that fills this void.
RESULTS We introduce 'PACVr', an R package that visualizes the coverage depth of a plastid genome assembly in relation to the circular, quadripartite structure of the genome as well as the individual plastome genes. By using a variable window approach, the tool allows visualizations on different calculation scales. It also confirms sequence equality of, as well as visualizes gene synteny between, the inverted repeat regions of the input genome. As a tool for plastid genomics, PACVr provides the functionality to identify regions of coverage depth above or below user-defined threshold values and helps to identify non-identical IR regions. To allow easy integration into bioinformatic workflows, PACVr can be invoked from a Unix shell, facilitating its use in automated quality control. We illustrate the application of PACVr on four empirical datasets and compare visualizations generated by PACVr with those of alternative software tools.
CONCLUSIONS PACVr provides a user-friendly tool to visualize (a) the coverage depth of a plastid genome assembly on a circular, quadripartite plastome map and in relation to individual plastome genes, and (b) gene synteny across the inverted repeat regions. It contributes to optimizing plastid genome assemblies and increasing the reliability of publicly available plastome sequences. The software, example datasets, technical documentation, and a tutorial are available with the package at https://cran.r-project.org/package=PACVr.
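The abstract names three computations without detailing them: window-based coverage depth, flagging of regions above or below user-defined thresholds, and a sequence-equality check between the two inverted repeat (IR) regions. The sketch below illustrates these ideas in Python and is not PACVr's own R implementation. All function names are hypothetical, the input is assumed to be a list of per-base depths (for example, as produced by a depth-per-position tool), and the use of non-overlapping windows summarized by their mean is an assumption made for illustration.

```python
from statistics import mean

# Hypothetical helpers for illustration only; these are not PACVr functions.

def window_means(depths, window_size):
    """Return (start, end, mean_depth) for consecutive non-overlapping windows.

    `start` is inclusive and `end` is exclusive (0-based positions).
    The final window may be shorter than `window_size`.
    """
    if window_size <= 0:
        raise ValueError("window_size must be a positive integer")
    windows = []
    for start in range(0, len(depths), window_size):
        chunk = depths[start:start + window_size]
        windows.append((start, start + len(chunk), mean(chunk)))
    return windows


def flag_windows(windows, low, high):
    """Keep only windows whose mean depth is below `low` or above `high`."""
    return [w for w in windows if w[2] < low or w[2] > high]


# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only; other characters pass through)."""
    return seq.translate(_COMPLEMENT)[::-1]


def inverted_repeats_match(ir_a, ir_b):
    """True if one IR equals the reverse complement of the other, ignoring case."""
    return ir_a.upper() == reverse_complement(ir_b).upper()


if __name__ == "__main__":
    # Toy per-base depths; real plastome assemblies span roughly 10^5 positions.
    depths = [10, 10, 10, 10, 2, 2, 2, 2, 50, 50]
    windows = window_means(depths, window_size=4)
    print(flag_windows(windows, low=5, high=40))
    print(inverted_repeats_match("AACG", "CGTT"))
```

Changing `window_size` changes the scale at which dips and peaks in coverage become visible, which is the practical point of a variable window: small windows expose short low-coverage stretches, while large windows smooth them out.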

Updated: 2020-05-24