Serverless computing in omics data analysis and integration, Briefings in Bioinformatics

Briefings in Bioinformatics (IF 9.5), Pub Date: 2021-08-09, DOI: 10.1093/bib/bbab349
Piotr Grzesik, Dariusz R Augustyn, Łukasz Wyciślik, Dariusz Mrozek

A comprehensive analysis of omics data can require vast computational resources and access to varied data sources that must be integrated into complex, multi-step analysis pipelines. Execution of many such analyses can be accelerated by applying the cloud computing paradigm, which provides scalable resources for storing data of different types and parallelizing data analysis computations. Moreover, these resources can be reused for different multi-omics analysis scenarios. Traditionally, developers are required to manage a cloud platform’s underlying infrastructure, configuration, maintenance and capacity planning. The serverless computing paradigm simplifies these operations by automatically allocating and maintaining both servers and virtual machines, as required for analysis tasks. This paradigm offers highly parallel execution and high scalability without manual management of the underlying infrastructure, freeing developers to focus on operational logic. This paper reviews serverless solutions in bioinformatics and evaluates their usage in omics data analysis and integration. We start by reviewing the application of the cloud computing model to multi-omics data analysis and exposing some shortcomings of the early approaches. We then introduce the serverless computing paradigm and show its applicability for performing an integrative analysis of multiple omics data sources in the context of the COVID-19 pandemic.

Updated: 2021-08-09
