
Separability and Its Approximations in Ontology-based Data Management
Semantic Web (IF 3), Pub Date: 2023-06-08, DOI: 10.3233/sw-233391
Gianluca Cima, Federico Croce, Maurizio Lenzerini

Abstract

Given two datasets, i.e., two sets of tuples of constants representing positive and negative examples, logical separability is the reasoning task of finding a formula in a given target query language that separates them. As pointed out in previous work, this task is relevant in several application scenarios, such as concept learning and generating referring expressions. Moreover, if the positive and negative examples are tuples of constants classified positively and negatively, respectively, by a black-box model, then the separating formula can serve as a global post-hoc explanation of that model. In this paper, we study the separability task in the context of Ontology-based Data Management (OBDM), in which a domain ontology provides a high-level, logic-based specification of a domain of interest and is semantically linked, through suitable mapping assertions, to the data source layer of an information system. Since a formula that properly separates two input datasets (a proper separation) does not always exist, our first contribution is to propose (best) approximations of the proper separation, called (minimally) complete and (maximally) sound separations. We do this by presenting a general framework for separability in OBDM. Then, in a scenario that uses by far the most popular languages for the OBDM paradigm, our second contribution is a comprehensive study of three natural computational problems associated with the framework: Verification (check whether a given formula is a proper, complete, or sound separation of two given datasets), Existence (check whether a proper or best approximated separation of two given datasets exists at all), and Computation (compute any proper or any best approximated separation of two given datasets).
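The Verification problem can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not spell out: a complete separation is read here as a formula whose answers include every positive example, a sound separation as one whose answers include no negative example, and a proper separation as one that is both. The ontology and mappings are left out entirely, so `answers` is a hypothetical stand-in for the set of tuples a candidate formula returns over the OBDM system. The function name `verify` and the sample data are illustrative only, and the sketch does not check the minimality or maximality that the best approximations require.

```python
from typing import Dict, Set, Tuple

# A tuple of constants, as in the abstract's "sets of tuples of constants".
Row = Tuple[str, ...]


def verify(answers: Set[Row], positives: Set[Row], negatives: Set[Row]) -> Dict[str, bool]:
    """Classify a candidate formula by the tuples it returns (assumed given).

    Assumed reading (not stated in the abstract):
      complete -> every positive example is among the answers
      sound    -> no negative example is among the answers
      proper   -> both complete and sound
    """
    complete = positives <= answers          # subset test
    sound = answers.isdisjoint(negatives)    # no overlap with negatives
    return {"proper": complete and sound, "complete": complete, "sound": sound}


# Hypothetical example data.
POSITIVES: Set[Row] = {("a",), ("b",)}
NEGATIVES: Set[Row] = {("c",)}

# Three hypothetical answer sets, one per candidate formula.
print(verify({("a",), ("b",), ("d",)}, POSITIVES, NEGATIVES))
print(verify({("a",), ("b",), ("c",)}, POSITIVES, NEGATIVES))
print(verify({("a",)}, POSITIVES, NEGATIVES))
```

Under these assumptions, the first answer set separates the examples properly, the second covers all positives but also admits a negative, and the third excludes all negatives but misses a positive.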

Updated: 2023-06-08