An adaptive and real-time based architecture for financial data integration
Journal of Big Data ( IF 8.1 ) Pub Date : 2019-11-11 , DOI: 10.1186/s40537-019-0260-x
Noussair Fikri , Mohamed Rida , Noureddine Abghour , Khalid Moussaid , Amina El Omri

In this paper we propose an adaptive, real-time approach to two problems in real-time financial data integration: latency and semantic heterogeneity. Prompted by constraints we faced in projects that require real-time integration and analysis of massive financial data, we follow a new approach that combines a hybrid financial ontology, resilient distributed datasets, and real-time discretized streams. We build a real-time data integration pipeline that avoids the problems of classic Extract-Transform-Load (ETL) tools, namely data processing latency, functional miscomprehension, and metadata heterogeneity. The approach is a contribution toward improving reporting quality and availability within short time frames, which is why we use Apache Spark. We studied ETL concepts, data warehousing fundamentals, big data processing techniques, and container-oriented clustering architecture in order to replace the classic data integration and analysis process with our new concept, resilient distributed DataStream for online analytical process (RDD4OLAP) cubes, which are consumed using Spark SQL or Spark Core.
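
As a rough illustration of the pipeline idea described in the abstract, the following plain-Python sketch processes records in small batches, maps source-specific field names onto a shared vocabulary, and accumulates the results into a small cube keyed by dimensions. It is not the authors' implementation and does not use Apache Spark: the `FIELD_ALIASES` table is a hypothetical stand-in for the hybrid financial ontology, fixed-size batches stand in for discretized stream intervals, and every function and field name is invented for the example.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Iterator, List, Tuple

Record = Dict[str, Any]
Cube = Dict[Tuple[str, str], float]

# Hypothetical alias table standing in for the financial ontology:
# each source-specific field name maps to one shared term.
FIELD_ALIASES: Dict[str, str] = {
    "acct": "account",
    "account_id": "account",
    "ccy": "currency",
    "currency": "currency",
    "amt": "amount",
    "amount": "amount",
}


def normalize(record: Record) -> Record:
    """Rename known source fields to the shared vocabulary; drop unknown ones."""
    return {FIELD_ALIASES[k]: v for k, v in record.items() if k in FIELD_ALIASES}


def micro_batches(records: Iterable[Record], size: int) -> Iterator[List[Record]]:
    """Split a stream into fixed-size batches (a stand-in for time intervals)."""
    batch: List[Record] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch


def update_cube(cube: Cube, batch: List[Record]) -> None:
    """Add each record's amount to its (account, currency) cell."""
    for record in map(normalize, batch):
        key = (str(record["account"]), str(record["currency"]))
        cube[key] += float(record["amount"])


def integrate(records: Iterable[Record], batch_size: int = 2) -> Cube:
    """Consume the stream batch by batch and return the aggregated cube."""
    cube: Cube = defaultdict(float)
    for batch in micro_batches(records, batch_size):
        update_cube(cube, batch)
    return dict(cube)
```

Calling `integrate` on records from two sources that name their fields differently (for example `acct`/`ccy`/`amt` versus `account_id`/`currency`/`amount`) yields a single cube whose cells are totals per account and currency. In a Spark-based version the batching and aggregation would be handled by the streaming engine rather than by hand-written loops.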

Updated: 2019-11-11