Arteria: An automation system for a sequencing core facility.
GigaScience (IF 11.8), Pub Date: 2019-12-01, DOI: 10.1093/gigascience/giz135
Johan Dahlberg 1, Johan Hermansson 1, Steinar Sturlaugsson 1, Mariya Lysenkova 1, Patrik Smeds 2, Claes Ladenvall 2, Roman Valls Guimera 3, Florian Reisinger 3, Oliver Hofmann 3, Pontus Larsson 1

BACKGROUND: In recent years, nucleotide sequencing has become increasingly instrumental in both research and clinical settings. This has led to an explosive growth in sequencing data produced worldwide. As the amount of data increases, so does the need for automated solutions for data processing and analysis. The concept of workflows has gained favour in the bioinformatics community, but there is little in the scientific literature describing end-to-end automation systems. Arteria is an automation system that aims at providing a solution to the data-related operational challenges that face sequencing core facilities.

FINDINGS: Arteria is built on existing open source technologies, with a modular design allowing for a community-driven effort to create plug-and-play micro-services. In this article we describe the system, elaborate on the underlying conceptual framework, and present an example implementation. Arteria can be reduced to 3 conceptual levels: orchestration (using an event-based model of automation), process (the steps involved in processing sequencing data, modelled as workflows), and execution (using a series of RESTful micro-services). This creates a system that is both flexible and scalable. Arteria-based systems have been successfully deployed at 3 sequencing core facilities. The Arteria Project code, written largely in Python, is available as open source software, and more information can be found at https://arteria-project.github.io/.

CONCLUSIONS: We describe the Arteria system and the underlying conceptual framework, demonstrating how this model can be used to automate data handling and analysis in the context of a sequencing core facility.
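The three conceptual levels named in the abstract (orchestration, process, execution) can be illustrated with a small Python sketch. This is not Arteria code and does not use any Arteria API. The names `Orchestrator`, `Workflow`, `demultiplex`, `quality_check`, and the `runfolder_ready` event are hypothetical, and the "services" are plain in-process functions standing in for what would be HTTP calls to RESTful micro-services.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Execution level: hypothetical stand-ins for RESTful micro-services.
# In a real deployment each would be an HTTP request to a separate service.
def demultiplex(runfolder: str) -> str:
    return f"demultiplexed {runfolder}"


def quality_check(runfolder: str) -> str:
    return f"checked {runfolder}"


# Process level: a workflow is modelled as an ordered list of steps.
@dataclass
class Workflow:
    name: str
    steps: List[Callable[[str], str]]

    def run(self, runfolder: str) -> List[str]:
        # Run every step in order and collect the results.
        return [step(runfolder) for step in self.steps]


# Orchestration level: incoming events are mapped to workflows.
@dataclass
class Orchestrator:
    rules: Dict[str, Workflow] = field(default_factory=dict)

    def on(self, event_type: str, workflow: Workflow) -> None:
        # Register the workflow to run when this event type is emitted.
        self.rules[event_type] = workflow

    def emit(self, event_type: str, runfolder: str) -> List[str]:
        # Events without a registered workflow are ignored.
        workflow = self.rules.get(event_type)
        if workflow is None:
            return []
        return workflow.run(runfolder)


orchestrator = Orchestrator()
orchestrator.on(
    "runfolder_ready",
    Workflow("process_run", [demultiplex, quality_check]),
)
results = orchestrator.emit("runfolder_ready", "run_001")
print(results)
```

Keeping the levels separate in this way means a step can be swapped or a new event rule added without touching the other two levels, which is one plausible reading of the flexibility the abstract describes.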

Updated: 2019-12-11