SequelTools: a suite of tools for working with PacBio Sequel raw sequence data
BMC Bioinformatics (IF 2.9) Pub Date: 2020-10-01, DOI: 10.1186/s12859-020-03751-8
David E. Hufnagel , Matthew B. Hufford , Arun S. Seetharam

PacBio sequencing is a valuable third-generation DNA sequencing method because of its very long read lengths, its ability to detect methylated bases, and its real-time sequencing methodology. Until now, however, no tool was available for quality analysis, subsampling, and filtering of PacBio data. Here we present SequelTools, a command-line program containing three tools: Quality Control, Read Subsampling, and Read Filtering. The Quality Control tool quickly processes PacBio Sequel raw sequence data from multiple SMRTcells, producing statistics and publication-quality plots that describe the quality of the data, including N50, read length and count statistics, PSR, and ZOR. The Read Subsampling tool allows the user to subsample reads by one or both of the following criteria: longest subreads per CLR or random CLR selection. The Read Filtering tool provides options for normalizing data by filtering out certain low-quality scraps reads and/or by minimum CLR length. SequelTools is implemented in bash, R, and Python using only standard libraries and packages and is platform independent. It provides the only free, fast, and easy-to-use quality control tool, and is the only program offering this kind of read subsampling and read filtering for PacBio Sequel raw sequence data. It is available at https://github.com/ISUgenomics/SequelTools.
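
To make two of the quantities named in the abstract concrete, the following is a minimal Python sketch of an N50 calculation and of "longest subread per CLR" selection. It is not taken from SequelTools and does not reflect its implementation or command-line interface. The function names `n50` and `longest_subread_per_zmw` are hypothetical, and the sketch assumes subread names follow the usual PacBio `movie/zmw/start_end` pattern, with one CLR (continuous long read) per ZMW.

```python
def n50(lengths):
    """Return the length L such that reads of length >= L
    contain at least half of all sequenced bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


def longest_subread_per_zmw(subreads):
    """Keep only the longest subread from each ZMW (one CLR per ZMW).

    `subreads` is an iterable of (name, length) pairs, where name is
    assumed to look like 'movie/zmw/start_end' (hypothetical input shape).
    """
    best = {}
    for name, length in subreads:
        movie, zmw, _ = name.split("/")
        key = (movie, zmw)
        # Replace the stored subread only if this one is strictly longer.
        if key not in best or length > best[key][1]:
            best[key] = (name, length)
    return [name for name, _ in best.values()]


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6]))
    print(longest_subread_per_zmw([
        ("m1/1/0_100", 100),
        ("m1/1/100_350", 250),
        ("m1/2/0_80", 80),
    ]))
```

A real pipeline would read names and lengths from the BAM files produced by the instrument; here they are supplied as plain tuples to keep the sketch self-contained.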

Updated: 2020-10-02