CosmoHub: Interactive exploration and distribution of astronomical data on Hadoop
Astronomy and Computing (IF 1.9), Pub Date: 2020-05-16, DOI: 10.1016/j.ascom.2020.100391
P. Tallada, J. Carretero, J. Casals, C. Acosta-Silva, S. Serrano, M. Caubet, F.J. Castander, E. César, M. Crocce, M. Delfino, M. Eriksen, P. Fosalba, E. Gaztañaga, G. Merino, C. Neissner, N. Tonello

We present CosmoHub (https://cosmohub.pic.es), a web application based on Hadoop for the interactive exploration and distribution of massive cosmological datasets. Modern cosmology seeks to unveil the nature of both dark matter and dark energy by mapping the large-scale structure of the Universe through the analysis of massive amounts of astronomical data. The volume of these data has grown steadily over recent decades, and is expected to keep growing, with the digitization and automation of experimental techniques.

CosmoHub, hosted and developed at the Port d’Informació Científica (PIC), supports a worldwide community of scientists without requiring the end user to know any Structured Query Language (SQL). It serves data from several large international collaborations, such as the Euclid space mission, the Dark Energy Survey (DES), the Physics of the Accelerating Universe Survey (PAUS) and the Marenostrum Institut de Ciències de l’Espai (MICE) numerical simulations. While CosmoHub was originally developed as a web frontend to a PostgreSQL relational database, this work describes its current version, built on top of Apache Hive, which facilitates reading, writing and managing huge datasets at scale. Because CosmoHub’s datasets are seldom modified, Hive is a better fit.

Over 60 TiB of cataloged information and 50 × 10⁹ astronomical objects can be interactively explored using an integrated visualization tool that includes 1D histogram and 2D heatmap plots. In the current implementation, online exploration of datasets of 10⁹ objects takes tens of seconds. Users can also download customized subsets of data in standard formats, generated in a few minutes.
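
As an illustration only, a 1D histogram over a very large table can be expressed as a single aggregation query, which is the kind of statement a web frontend could generate so that users never write SQL themselves. The sketch below is not CosmoHub's actual implementation: the function name `histogram_query`, the table `catalog` and the column `mag_i` are hypothetical, and only standard SQL constructs (FLOOR, COUNT, GROUP BY) are assumed.

```python
def histogram_query(table: str, column: str, lo: float, hi: float, bins: int) -> str:
    """Build a SQL aggregation that counts rows per equal-width bin of `column`.

    Illustrative sketch: identifiers are interpolated directly into the text,
    so this must not be used with untrusted input.
    """
    if bins <= 0 or hi <= lo:
        raise ValueError("need bins > 0 and hi > lo")
    width = (hi - lo) / bins
    # Each row is mapped to an integer bin index; rows outside [lo, hi) are dropped.
    bin_expr = f"FLOOR(({column} - {lo!r}) / {width!r})"
    return (
        f"SELECT {bin_expr} AS bin, COUNT(*) AS n\n"
        f"FROM {table}\n"
        f"WHERE {column} >= {lo!r} AND {column} < {hi!r}\n"
        f"GROUP BY {bin_expr}\n"
        f"ORDER BY bin"
    )


# Hypothetical table and column names, chosen only for the example.
query = histogram_query("catalog", "mag_i", 18.0, 24.0, 12)
print(query)
```

Because the query returns at most one row per bin, the result stays small regardless of how many objects the table holds. A 2D heatmap would follow the same pattern with two bin expressions in the GROUP BY clause.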




Updated: 2020-05-16