A graphical, interactive and GPU-enabled workflow to process long-read sequencing data
BMC Genomics (IF 4.4), Pub Date: 2021-08-23, DOI: 10.1186/s12864-021-07927-1
Shishir Reddy 1 , Ling-Hong Hung 2 , Olga Sala-Torra 3 , Jerald P Radich 3, 4, 5 , Cecilia Cs Yeung 3, 6 , Ka Yee Yeung 2

Long-read sequencing holds great promise for enabling portable, rapid molecular-assisted cancer diagnoses. A key challenge in democratizing long-read sequencing technology in the biomedical and clinical community is the lack of graphical bioinformatics software tools that can efficiently process raw nanopore reads and support graphical output and interactive visualizations for interpreting results. Another obstacle is that high-performance software tools for long-read sequencing data analysis often rely on graphics processing units (GPUs), which are challenging and time-consuming to configure, especially on the cloud. We present a graphical, cloud-enabled workflow for fast, interactive analysis of nanopore sequencing data using GPUs. Users customize parameters, monitor execution and visualize results through an accessible graphical interface. The workflow and its components are completely containerized to ensure reproducibility and to facilitate installation of the GPU-enabled software. We also provide an Amazon Machine Image (AMI) with all software and drivers pre-installed for GPU computing on the cloud. Most importantly, we demonstrate the potential of our software tools to reduce the turnaround time of cancer diagnostics by generating Nanopore data from blood cancer cell lines (NB4, K562, ME1, 238 MV4;11) using the Flongle adapter. We observe a 29x speedup and a 93x reduction in cost for the rate-limiting basecalling step in the analysis of the blood cancer cell line data. Our interactive and efficient software tools will make analysis of Nanopore data using GPUs and cloud computing accessible to biomedical and clinical scientists, thus facilitating the adoption of cost-effective, fast, portable and real-time long-read sequencing.

Updated: 2021-08-23