Characterizing RDF graphs through graph-based measures – framework and assessment
Semantic Web (IF 3), Pub Date: 2020-10-20, DOI: 10.3233/sw-200409
Matthäus Zloch 1, 2 , Maribel Acosta 3, 4 , Daniel Hienert 1 , Stefan Conrad 2 , Stefan Dietze 1, 2

The topological structure of RDF graphs inherently differs from other types of graphs, like social graphs, due to the pervasive existence of hierarchical relations (TBox), which complement transversal relations (ABox). Graph measures capture such particularities through descriptive statistics. Besides the classical set of measures established in the field of network analysis, such as size and volume of the graph or the type of degree distribution of its vertices, there has been some effort to define measures that capture some of the aforementioned particularities RDF graphs adhere to. However, some of them are redundant, computationally expensive, and not meaningful enough to describe RDF graphs. In particular, it is not clear which of them are efficient metrics to capture specific distinguishing characteristics of datasets in different knowledge domains (e.g., Cross Domain vs. Linguistics). In this work, we address the problem of identifying a minimal set of measures that is efficient, essential (non-redundant), and meaningful. Based on 54 measures and a sample of 280 graphs of nine knowledge domains from the Linked Open Data Cloud, we identify an essential set of 13 measures, having the capacity to describe graphs concisely. These measures have the capacity to present the topological structures and differences of datasets in established knowledge domains.
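To make the idea of describing a graph through descriptive statistics concrete, the sketch below computes a few classical measures (vertex count, edge count, and simple degree statistics) on a toy set of triples. It is only an illustration and not the authors' framework: the triples, the `basic_measures` function, and the choice to treat every triple as one directed edge from subject to object are all assumptions made for this example.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical toy triples (subject, predicate, object); not data from the paper.
TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:Person", "rdfs:subClassOf", "ex:Agent"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:alice"),
]


def basic_measures(triples):
    """Return a few classical descriptive measures of a directed graph.

    Assumption: each triple is one directed edge subject -> object, so
    triples sharing subject and object count as parallel edges.
    """
    vertices = set()
    in_deg, out_deg = Counter(), Counter()
    for subj, _pred, obj in triples:
        vertices.update((subj, obj))
        out_deg[subj] += 1
        in_deg[obj] += 1

    # Total degree per vertex (Counter returns 0 for missing keys).
    degrees = [in_deg[v] + out_deg[v] for v in vertices]
    return {
        "vertices": len(vertices),
        "edges": len(triples),
        "mean_degree": mean(degrees),
        "degree_std": pstdev(degrees),
        "max_in_degree": max(in_deg.values()),
        "max_out_degree": max(out_deg.values()),
    }


if __name__ == "__main__":
    for name, value in basic_measures(TRIPLES).items():
        print(f"{name}: {value}")
```

In this toy graph the `rdf:type` and `rdfs:subClassOf` edges stand in for hierarchical relations and the `ex:knows` edges for transversal ones. Splitting the statistics by predicate type would be one way to explore the distinction the abstract draws, though that step is not shown here.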

Updated: 2020-10-20