Elsevier

Knowledge-Based Systems

Volume 232, 28 November 2021, 107489

TITAN: A knowledge-based platform for Big Data workflow management

https://doi.org/10.1016/j.knosys.2021.107489
Under a Creative Commons license
open access

Abstract

Modern Big Data applications are moving beyond scalable data processing and analysis to provide advanced functionality that exploits and interprets the underlying knowledge. This shift is promoting the development of tools at the intersection of data processing, data analysis, knowledge extraction and knowledge management. In this paper, we propose TITAN, a software platform for managing the entire life cycle of scientific workflows, from deployment to execution, in the context of Big Data applications. The platform is characterised by a design and mode of operation driven by semantics at three levels: data sources, problem domain and workflow components. It is built upon an ontological framework of meta-data that consistently manages processes and models and takes advantage of domain knowledge. TITAN comprises a well-grounded stack of Big Data technologies, including Apache Kafka for inter-component communication, Apache Avro for data serialisation and Apache Spark for data analytics. A series of use cases is conducted for validation, covering workflow composition and semantic meta-data management in academic and real-world settings: human activity recognition and land use monitoring from satellite images.

Keywords

Big Data analytics
Semantics
Knowledge extraction


This work has been partially funded by the Spanish Ministry of Science and Innovation via Grant PID2020-112540RB-C41 (AEI/FEDER, UE) and Andalusian PAIDI program with grant P18-RT-2799. Funding for open access charge: Universidad de Málaga / CBUA.

1 Supported by Grant PRE2018-084280 (Spanish Ministry of Science, Innovation and Universities).