Morph-KGC: Scalable knowledge graph materialization with mapping partitions
Semantic Web (IF 3), Pub Date: 2022-08-25, DOI: 10.3233/sw-223135
Julián Arenas-Guerrero, David Chaves-Fraga, Jhon Toledo, María S. Pérez, Oscar Corcho

Abstract

Knowledge graphs are often constructed from heterogeneous data sources using declarative rules that map the sources to a target ontology and materialize them into RDF. When these data sources are large, materializing the entire knowledge graph can be computationally expensive and unsuitable for cases that require rapid materialization. In this work, we propose an approach to overcome this limitation, based on the novel concept of mapping partitions. Mapping partitions are defined as groups of mapping rules that generate disjoint subsets of the knowledge graph. Each of these groups can be processed separately, reducing the total amount of memory and execution time required by the materialization process. We have included this optimization in our materialization engine Morph-KGC, and we have evaluated it over three different benchmarks. Our experimental results show that, compared with state-of-the-art techniques, the use of mapping partitions in Morph-KGC presents the following advantages: (i) it significantly decreases the time required for materialization, (ii) it reduces peak memory usage, and (iii) it scales to data sizes that other engines currently cannot process.
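
The abstract does not say how mapping partitions are computed, so the following is only a minimal illustrative sketch of the general idea, not Morph-KGC's actual algorithm or API. It assumes a hypothetical `Rule` type in which every rule has a constant predicate; under that assumption, rules with different predicates can never emit the same triple, so grouping by predicate yields disjoint groups. The names `Rule`, `partition_rules`, and `materialize` are invented for this example.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, NamedTuple, Set, Tuple

Triple = Tuple[str, str, str]


class Rule(NamedTuple):
    """Hypothetical mapping rule (an assumption, not Morph-KGC's data model)."""
    predicate: str         # constant predicate IRI
    subject_template: str  # e.g. "ex:person/{id}", filled from a source row
    object_column: str     # source column whose value becomes the object


def partition_rules(rules: List[Rule]) -> Dict[str, List[Rule]]:
    """Group rules by predicate; different groups cannot emit the same triple."""
    groups: Dict[str, List[Rule]] = defaultdict(list)
    for rule in rules:
        groups[rule.predicate].append(rule)
    return dict(groups)


def materialize(rules: List[Rule], rows: List[dict]) -> Iterator[Triple]:
    """Yield de-duplicated triples, processing one partition at a time."""
    for predicate, group in partition_rules(rules).items():
        # Duplicates can only occur inside a partition, so this set holds
        # one partition's triples and is discarded before the next one.
        seen: Set[Triple] = set()
        for rule in group:
            for row in rows:
                triple = (
                    rule.subject_template.format(**row),
                    predicate,
                    str(row[rule.object_column]),
                )
                if triple not in seen:
                    seen.add(triple)
                    yield triple


if __name__ == "__main__":
    rules = [
        Rule("ex:name", "ex:person/{id}", "name"),
        Rule("ex:age", "ex:person/{id}", "age"),
    ]
    rows = [{"id": 1, "name": "Ana", "age": 30}]
    for s, p, o in materialize(rules, rows):
        print(s, p, o)
```

In this sketch the de-duplication set never has to hold the whole graph, which is one way that processing disjoint groups separately can lower peak memory. Because the groups share no triples, they could also be handed to separate workers without coordination.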




Updated: 2022-08-27