Integration of the Rosetta suite with the python software stack via reproducible packaging and core programming interfaces for distributed simulation.
Protein Science (IF 8), Pub Date: 2019-12-02, DOI: 10.1002/pro.3721
Alexander S Ford 1,2,3, Brian D Weitzner 2,3,4, Christopher D Bahl 1,5,6

The Rosetta software suite for macromolecular modeling is a powerful computational toolbox for protein design, structure prediction, and protein structure analysis. Developing novel Rosetta-based scientific tools requires two orthogonal skill sets: deep domain-specific expertise in protein biochemistry, and technical expertise in the development, deployment, and analysis of molecular simulations. Furthermore, the computational demands of molecular simulation necessitate large-scale cluster-based or distributed solutions for nearly all scientifically relevant tasks. To reduce the technical barriers to entry for new development, we integrated Rosetta with modern, widely adopted computational infrastructure. This simplifies deployment in large-scale cluster and cloud computing environments and allows effective reuse of common libraries for simulation execution and data analysis. To achieve this, we first integrated Rosetta with the Conda package manager, which simplifies both installation into existing computational environments and packaging as Docker images for cloud deployment. We then developed programming interfaces that integrate Rosetta with the PyData stack for analysis and distributed computing, including the popular tools Jupyter, Pandas, and Dask. We demonstrate the utility of these components by generating a library of a thousand de novo disulfide-rich miniproteins in a hybrid simulation that combined cluster-based design with interactive notebook-based analyses. Our new tools enable users who would otherwise lack access to the necessary computational infrastructure to perform state-of-the-art molecular simulation and design with Rosetta.
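
The workflow described in the abstract (design tasks fanned out to workers, results gathered into a table for interactive analysis) can be illustrated with a minimal sketch. This is not the authors' code and makes no Rosetta or PyRosetta calls: `design_one`, `run_designs`, and `best_designs` are hypothetical names, the scores are made-up numbers, and the standard library's `concurrent.futures` pool stands in for a Dask cluster, whose distributed client offers a similar map-and-gather interface.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def design_one(seed: int) -> dict:
    """Hypothetical stand-in for one design trajectory (not a Rosetta call)."""
    rng = random.Random(seed)  # seeded so each task is reproducible
    return {
        "seed": seed,
        "score": rng.uniform(-300.0, -100.0),  # made-up energy-like value
        "n_disulfides": rng.randint(1, 4),     # made-up structural feature
    }


def run_designs(seeds, max_workers: int = 4) -> list:
    """Fan tasks out to a worker pool and gather one record per task."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input seeds.
        return list(pool.map(design_one, seeds))


def best_designs(records, n: int = 3) -> list:
    """Analysis step: keep the n records with the lowest score."""
    return sorted(records, key=lambda r: r["score"])[:n]


if __name__ == "__main__":
    records = run_designs(range(20))
    for rec in best_designs(records):
        print(rec)
```

Because each task returns a flat dictionary, the gathered list can be passed directly to `pandas.DataFrame` for notebook-based filtering and plotting, which is the kind of analysis step the abstract attributes to the PyData stack.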

Updated: 2019-12-21