Scipp: Scientific data handling with labeled multi-dimensional arrays for C++ and Python
arXiv - CS - Mathematical Software Pub Date : 2020-10-01 , DOI: arxiv-2010.00257
Simon Heybrock, Owen Arnold, Igor Gudich, Daniel Nixon, and Neil Vaytet

Scipp is heavily inspired by the Python library xarray. It enriches raw NumPy-like multi-dimensional arrays of data by adding named dimensions and associated coordinates. Multiple arrays are combined into datasets. On top of this, scipp introduces (i) implicit handling of physical units, (ii) implicit propagation of uncertainties, (iii) support for histograms, i.e., bin-edge coordinate axes, which exceed the data's dimension extent by one, and (iv) support for event data. In conjunction, these features enable a more natural and more concise user experience. The combination of named dimensions, coordinates, and units helps to drastically reduce the risk of programming errors. The core of scipp is written in C++ to open opportunities for performance improvements that a Python-based solution would not allow for. On top of the C++ core, scipp's Python components provide functionality for plotting and content representations, e.g., for use in Jupyter Notebooks. While none of scipp's concepts in isolation is novel per se, we are not aware of any project combining all of these aspects in a single coherent software package.
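
To make the ideas in the abstract concrete, the following is a minimal sketch in plain Python of a one-dimensional labeled array that carries a dimension name, a unit, variances, and a coordinate that may be given as bin edges. It is not scipp's API: the class `LabeledArray`, its fields, and the string-based unit check are hypothetical names and simplifications chosen for illustration, and the variance rule assumes uncorrelated inputs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LabeledArray:
    """Toy 1-D labeled array (hypothetical; NOT scipp's actual API)."""
    dim: str                # name of the single dimension, e.g. "x"
    values: List[float]     # the data itself
    variances: List[float]  # squared standard deviations of the values
    unit: str               # unit kept as a plain string (a simplification)
    coord: List[float]      # coordinate along `dim`

    def __post_init__(self) -> None:
        n = len(self.values)
        if len(self.variances) != n:
            raise ValueError("variances must have the same length as values")
        # A coordinate has one entry per value (points) or one extra (bin edges).
        if len(self.coord) not in (n, n + 1):
            raise ValueError("coord needs n (points) or n + 1 (bin edges) entries")

    @property
    def is_histogram(self) -> bool:
        # Bin-edge coordinates exceed the data extent by exactly one.
        return len(self.coord) == len(self.values) + 1

    def __add__(self, other: "LabeledArray") -> "LabeledArray":
        # Refuse to combine data that does not line up by name, unit, or coordinate.
        if self.dim != other.dim:
            raise ValueError(f"dimension mismatch: {self.dim} vs {other.dim}")
        if self.unit != other.unit:
            raise ValueError(f"unit mismatch: {self.unit} vs {other.unit}")
        if self.coord != other.coord:
            raise ValueError("coordinate mismatch")
        return LabeledArray(
            dim=self.dim,
            values=[a + b for a, b in zip(self.values, other.values)],
            # For a sum of uncorrelated quantities the variances add.
            variances=[a + b for a, b in zip(self.variances, other.variances)],
            unit=self.unit,
            coord=list(self.coord),
        )


# Two histograms over the same bin edges [0, 1, 2] along "x".
run_a = LabeledArray("x", [10.0, 20.0], [10.0, 20.0], "counts", [0.0, 1.0, 2.0])
run_b = LabeledArray("x", [1.0, 2.0], [1.0, 2.0], "counts", [0.0, 1.0, 2.0])
total = run_a + run_b
```

In this sketch, adding `run_a` to an array whose unit is `"m"` raises a `ValueError` instead of silently producing a meaningless number, which is the kind of programming error the abstract says the combination of names, coordinates, and units helps to avoid.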

Updated: 2020-10-02