Collective knowledge: organizing research projects as a database of reusable components and portable workflows with common interfaces
Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences (IF 4.3), Pub Date: 2021-03-29, DOI: 10.1098/rsta.2020.0211
Grigori Fursin

This article provides the motivation and overview of the Collective Knowledge Framework (CK or cKnowledge). The CK concept is to decompose research projects into reusable components that encapsulate research artifacts and provide unified application programming interfaces (APIs), command-line interfaces (CLIs), meta descriptions and common automation actions for related artifacts. The CK framework is used to organize and manage research projects as a database of such components. Inspired by the USB ‘plug and play’ approach for hardware, CK also helps to assemble portable workflows that can automatically plug in compatible components from different users and vendors (models, datasets, frameworks, compilers, tools). Such workflows can build and run algorithms on different platforms and environments in a unified way using the customizable CK program pipeline with software detection plugins and the automatic installation of missing packages. This article presents a number of industrial projects in which the modular CK approach was successfully validated in order to automate benchmarking, auto-tuning and co-design of efficient software and hardware for machine learning and artificial intelligence in terms of speed, accuracy, energy, size and various costs. The CK framework also helped to automate the artifact evaluation process at several computer science conferences as well as to make it easier to reproduce, compare and reuse research techniques from published papers, deploy them in production, and automatically adapt them to continuously changing datasets, models and systems. The long-term goal is to accelerate innovation by connecting researchers and practitioners to share and reuse all their knowledge, best practices, artifacts, workflows and experimental results in a common, portable and reproducible format at cKnowledge.io.
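The component idea described above can be illustrated with a minimal sketch. It is not the CK API: the names `Component`, `Repository`, `assemble_workflow` and the example entries (`resnet50-onnx`, `imagenet-val-min`) are hypothetical, chosen only to show how artifacts with meta descriptions and a unified action interface might be stored in a repository and plugged into a workflow by matching on their metadata.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical sketch: none of these names come from the CK framework itself.


@dataclass
class Component:
    """A reusable artifact wrapper with a meta description and named actions."""
    name: str
    kind: str  # e.g. "model", "dataset", "compiler"
    meta: Dict[str, str] = field(default_factory=dict)
    actions: Dict[str, Callable[..., dict]] = field(default_factory=dict)

    def run(self, action: str, **kwargs) -> dict:
        """Unified entry point: keyword arguments in, a dict with a return code out."""
        if action not in self.actions:
            return {"return": 1, "error": f"action '{action}' not found in {self.name}"}
        return {"return": 0, **self.actions[action](**kwargs)}


class Repository:
    """A tiny in-memory 'database' of components, searchable by kind and meta."""

    def __init__(self) -> None:
        self._components: List[Component] = []

    def add(self, component: Component) -> None:
        self._components.append(component)

    def find(self, kind: str, **required_meta: str) -> List[Component]:
        return [
            c for c in self._components
            if c.kind == kind
            and all(c.meta.get(k) == v for k, v in required_meta.items())
        ]


def assemble_workflow(repo: Repository,
                      requirements: Dict[str, Dict[str, str]]) -> Dict[str, Component]:
    """Select the first compatible component for each required kind."""
    selected = {}
    for kind, meta in requirements.items():
        matches = repo.find(kind, **meta)
        if not matches:
            raise LookupError(f"no compatible {kind} component for {meta}")
        selected[kind] = matches[0]
    return selected


# Illustrative entries; the component names and meta keys are made up.
repo = Repository()
repo.add(Component("resnet50-onnx", "model", {"format": "onnx"},
                   {"describe": lambda: {"info": "image classification model"}}))
repo.add(Component("imagenet-val-min", "dataset", {"task": "image-classification"}))

workflow = assemble_workflow(repo, {
    "model": {"format": "onnx"},
    "dataset": {"task": "image-classification"},
})
print(workflow["model"].run("describe"))
```

In this sketch the workflow never refers to a concrete artifact by name; it only states what it requires, so a compatible component from another user could be swapped in by adding it to the repository.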

This article is part of the theme issue ‘Reliability and reproducibility in computational science: implementing verification, validation and uncertainty quantification in silico’.




Updated: 2021-03-29