RDFAdaptor: Efficient ETL Plugins for RDF Data Process
Journal of Data and Information Science (IF 1.5), Pub Date: 2021-04-14, DOI: 10.2478/jdis-2021-0020
Jiao Li 1 , Guojian Xian 1, 2 , Ruixue Zhao 1, 2 , Yongwen Huang 1 , Yuantao Kou 1, 2 , Tingting Luo 1 , Tan Sun 2, 3

Abstract

Purpose: The interdisciplinary nature and rapid development of the Semantic Web have led to the mass publication of RDF data in a large number of widely accepted serialization formats, creating a need for purpose-specific RDF data processing. The paper reports on an assessment of the chief challenges of RDF data endpoints and introduces RDFAdaptor, a set of plugins for RDF data processing that covers the whole life cycle with high efficiency.

Design/methodology/approach: RDFAdaptor is built on the prominent ETL tool Pentaho Data Integration, which provides a user-friendly, intuitive interface and can connect to various data sources and formats. It reuses the Java framework RDF4J as middleware for access to data repositories, SPARQL endpoints, and all leading RDF database solutions with SPARQL 1.1 support. It supports multi-scenario applications through various configuration templates, and it can extend data processing tasks in other services or tools to complement missing functions.

Findings: The proposed comprehensive RDF ETL solution, RDFAdaptor, provides an easy-to-use and intuitive interface, supports data integration and federation over multi-source heterogeneous repositories or endpoints, and manages linked data in hybrid storage mode.

Research limitations: The plugin set supports several application scenarios of RDF data processing, but error detection/checking and interaction with other graph repositories remain to be improved.

Practical implications: The plugin set provides a user interface and configuration templates that make it usable in various applications: RDF data generation, multi-format data conversion, remote RDF data migration, and RDF graph update during semantic query processing.

Originality/value: This is the first attempt to develop components, rather than systems, that can extract, consolidate, and store RDF data on the basis of an ecologically mature data warehousing environment.
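
To make the "RDF data generation" scenario concrete, the following is a minimal sketch of an extract-transform step that turns tabular records into N-Triples lines. It is not RDFAdaptor's implementation, which per the abstract is built on Pentaho Data Integration and RDF4J in Java. The `BASE` namespace, the `record/` and `prop/` IRI patterns, and the function names are all hypothetical, chosen only for illustration.

```python
from typing import Any, Dict, Iterable, Iterator
from urllib.parse import quote

# Hypothetical namespace; a real pipeline would take this from configuration.
BASE = "http://example.org/"


def escape_literal(value: str) -> str:
    """Escape the characters that N-Triples requires escaping in a literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_ntriples(
    rows: Iterable[Dict[str, Any]], id_field: str = "id"
) -> Iterator[str]:
    """Yield one N-Triples line per non-empty field of each input row.

    The subject IRI is derived from the row's id field, and each remaining
    field becomes a predicate with a plain string literal as its object.
    """
    for row in rows:
        subject = f"<{BASE}record/{quote(str(row[id_field]), safe='')}>"
        for field, value in row.items():
            if field == id_field or value is None or value == "":
                continue  # skip the key column and empty cells
            predicate = f"<{BASE}prop/{quote(field, safe='')}>"
            yield f'{subject} {predicate} "{escape_literal(str(value))}" .'


if __name__ == "__main__":
    sample = [{"id": "1", "title": 'RDF "ETL" plugins', "year": 2021}]
    for line in rows_to_ntriples(sample):
        print(line)
```

In a full ETL flow, the generated lines would then be loaded into a repository or posted to a SPARQL endpoint. That load step is omitted here because it depends on the target store.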

Updated: 2021-04-14