DOD-ETL: distributed on-demand ETL for near real-time business intelligence
Journal of Internet Services and Applications (IF 2.4), Pub Date: 2019-11-20, DOI: 10.1186/s13174-019-0121-z
Gustavo V. Machado, Ítalo Cunha, Adriano C. M. Pereira, Leonardo B. Oliveira

Competition in a globalized market demands information about both the internal and external reality of corporations. Information is a precious asset that gives companies the key advantages they need to maintain their leadership. However, reliable, rich information is no longer the only goal: the time it takes to extract information from data determines its usefulness. This work proposes DOD-ETL, a tool that addresses the main bottleneck in Business Intelligence solutions, the Extract, Transform, Load (ETL) process, by performing it in near real-time. DOD-ETL achieves this by combining an on-demand data stream pipeline with a distributed, parallel, and technology-independent architecture that uses in-memory caching and efficient data partitioning. We compared DOD-ETL with other stream processing frameworks used to perform near real-time ETL and found that DOD-ETL executes workloads up to 10 times faster. We have deployed it in a large steelworks as a replacement for its previous ETL solution, enabling near real-time reports that were previously unavailable.

Updated: 2019-11-20