IoT-enabled directed acyclic graph in spark cluster
Journal of Cloud Computing ( IF 3.7 ) Pub Date : 2020-09-14 , DOI: 10.1186/s13677-020-00195-6
Jahwan Koo , Nawab Muhammad Faseeh Qureshi , Isma Farah Siddiqui , Asad Abbas , Ali Kashif Bashir

Real-time data streaming fetches live sensor segments of a dataset in a heterogeneous distributed computing environment. The process assembles data chunks at a rapid encapsulation rate: a streaming technique bundles sensor segments into multiple micro-batches and extracts them into a repository. Recently, the acquisition process has been enhanced with the ability to exchange IoT device datasets, which comprise two components: (i) sensory data and (ii) metadata. The sensory data body holds record information, while the metadata consists of logs, heterogeneous events, and routing path tables used to transmit micro-batch streams into the repository. The real-time acquisition procedure uses a Directed Acyclic Graph (DAG) to extract live query outcomes from in-place micro-batches through MapReduce stages and return a result set. However, a few bottlenecks affect performance during execution: (i) only homogeneous micro-batches are formed, (ii) dataset diversification is complex, (iii) heterogeneous data tuples must be processed, and (iv) the DAG workflow is linear only. As a result, extracting event-enabled IoT datasets incurs high processing latency and additional cost. Thus, a Spark cluster, which processes Resilient Distributed Datasets (RDDs) at a fast pace in random access memory (RAM), falls short of the expected robustness when processing IoT streams in a distributed computing environment. This paper presents an IoT-enabled Directed Acyclic Graph (I-DAG) technique that labels micro-batches at the stage of building a stream event and arranges stream elements by event label. In the next step, heterogeneous stream events are processed through the I-DAG workflow, which uses non-linear DAG operations to extract query results in a Spark cluster. The performance evaluation shows that I-DAG resolves the homogeneous-only limitation on IoT-enabled stream events and provides an effective solution for heterogeneous stream events in IoT-enabled datasets on Spark clusters.
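
The two steps named in the abstract, labeling micro-batches by event and then processing heterogeneous events through a non-linear workflow, can be pictured with a minimal plain-Python sketch. This is not the authors' implementation and does not use the Spark API; the record layout (`StreamElement`), the event names, and the per-event branch functions are all assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class StreamElement:
    """One IoT record (hypothetical layout): sensory payload plus an event label."""
    device_id: str
    payload: float
    event: str  # event label, assumed to come from the metadata part


def label_micro_batches(elements: List[StreamElement]) -> Dict[str, List[StreamElement]]:
    """Arrange stream elements into micro-batches keyed by their event label."""
    batches: Dict[str, List[StreamElement]] = defaultdict(list)
    for element in elements:
        batches[element.event].append(element)
    return dict(batches)


# Hypothetical per-event branches. A linear workflow would push every batch
# through the same chain; here each event label gets its own branch.
BRANCHES: Dict[str, Callable[[List[StreamElement]], float]] = {
    "temperature": lambda batch: sum(e.payload for e in batch) / len(batch),  # mean reading
    "motion": lambda batch: float(len(batch)),  # number of motion events
}


def run_branches(batches: Dict[str, List[StreamElement]]) -> Dict[str, float]:
    """Route each labeled micro-batch to its branch and merge the results."""
    return {
        label: BRANCHES[label](batch)
        for label, batch in batches.items()
        if label in BRANCHES
    }


stream = [
    StreamElement("s1", 20.0, "temperature"),
    StreamElement("s2", 1.0, "motion"),
    StreamElement("s1", 22.0, "temperature"),
    StreamElement("s3", 1.0, "motion"),
]
results = run_branches(label_micro_batches(stream))
print(results)
```

Grouping by label first means each branch only ever sees records of one event type, which is the property the abstract attributes to event-labeled micro-batches; how I-DAG schedules such branches inside Spark is not described in the abstract and is left out here.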

Updated: 2020-09-14