An Alternative View on Data Processing Pipelines from the DOLAP 2019 Perspective
Information Systems (IF 3.0), Pub Date: 2019-12-27, DOI: 10.1016/j.is.2019.101489
Oscar Romero, Robert Wrembel, Il-Yeol Song

Data science requires constructing data processing pipelines (DPPs), which span diverse phases such as data integration, cleaning, pre-processing, and analysis. However, current solutions lack a strong data engineering perspective. As a consequence, DPPs are error-prone and inefficient with respect to both human effort and execution time. We claim that DPP design, development, testing, deployment, and execution should benefit from a standardized DPP architecture and from well-known data engineering solutions. This claim is supported by our experience in real projects and by trends in the field, and it opens new paths for research and technology. In this spirit, we outline five research opportunities that represent novel trends towards building DPPs. Finally, we note that the best DOLAP 2019 papers selected for the DOLAP 2019 Information Systems Special Issue fall in this category and highlight the relevance of advanced data engineering for data science.




Updated: 2019-12-27