Perspectives of using Cloud computing in integrative analysis of multi-omics data
Briefings in Functional Genomics (IF 2.5), Pub Date: 2021-01-28, DOI: 10.1093/bfgp/elab007
Dariusz R Augustyn, Łukasz Wyciślik, Dariusz Mrozek

Integrative analysis of multi-omics data is usually computationally demanding. It frequently requires building complex, multi-step analysis pipelines, applying dedicated data-processing techniques and combining several data sources. These efforts lead to a better understanding of life processes, current health state or the effects of therapeutic activities. However, many omics data analysis solutions focus only on a selected problem, disease, data type or organism. Moreover, they are implemented for general-purpose scientific computing platforms, most of which do not natively scale computations with ease. These limitations hinder progress in understanding genotype–phenotype relationships. Fortunately, new technological paradigms, including Cloud computing, virtualization and containerization, allow these functionalities to be orchestrated for easy scaling and for building independent analysis pipelines for omics data. As a result, solutions can be re-used for purposes for which they were not primarily designed. This paper presents perspectives on using Cloud computing advances and the containerization approach for this purpose. We first review how the Cloud computing model is used in multi-omics data analysis and point out the weak points of the adopted solutions. Then, we introduce containerization concepts, which allow both scaling and linking of functional services designed for various purposes. Finally, using the Bioconductor software package as an example, we present a verified concept model of a universal solution that shows potential for integrative analysis of multiple omics data sources.

Updated: 2021-01-28