Simple knowledge graph completion model based on PU learning and prompt learning
Knowledge and Information Systems (IF 2.7) Pub Date: 2024-01-12, DOI: 10.1007/s10115-023-02040-z
Li Duan, Jing Wang, Bing Luo, Qiao Sun

Abstract

Knowledge graphs (KGs) are important resources for many artificial intelligence tasks but are usually incomplete, which has motivated the task of knowledge graph completion (KGC). Embedding-based methods, which use the structural information of the KG for inference, are the mainstream approach to this task. However, these methods cannot perform inference for entities that do not appear in the KG, and they are constrained by the structural information. To address these issues, text-based methods have been proposed. These methods improve the model's reasoning ability by using pre-trained language models (PLMs) to learn textual information from the knowledge graph data. However, the performance of text-based methods lags behind that of embedding-based methods. We identify expensive negative sampling as the key reason. Positive-unlabeled (PU) learning is introduced to collect high-confidence negative samples from a small number of samples, and prompt learning is introduced to produce good training results. The proposed PLM-based KGC model outperforms earlier text-based methods and rivals earlier embedding-based approaches on several benchmark datasets. By exploiting the structural information of KGs, the proposed model also achieves satisfactory inference speed.
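
The abstract does not spell out how PU learning selects negatives, so the following is only a minimal sketch of one common two-step PU heuristic (picking "reliable negatives" from unlabeled candidates), not the authors' procedure. The names `corrupt_tails`, `reliable_negatives`, `score`, `quantile`, and `margin` are all hypothetical; `score` stands in for whatever plausibility model (for example, a PLM-based scorer) is available.

```python
import random
from typing import Callable

Triple = tuple[str, str, str]  # (head, relation, tail)


def corrupt_tails(triple: Triple, entities: list[str], known: set[Triple],
                  k: int, rng: random.Random) -> list[Triple]:
    """Build up to k unlabeled candidates by swapping the tail entity.

    Candidates already present in the KG are skipped, but the rest are
    only *unlabeled*: some may be true facts missing from the KG.
    """
    h, r, t = triple
    pool = [e for e in entities if e != t and (h, r, e) not in known]
    return [(h, r, e) for e in rng.sample(pool, min(k, len(pool)))]


def reliable_negatives(positives: list[Triple], unlabeled: list[Triple],
                       score: Callable[[Triple], float],
                       quantile: float = 0.1,
                       margin: float = 0.0) -> list[Triple]:
    """Keep unlabeled triples that score clearly below the known positives.

    The threshold is a low quantile of the positive scores minus a margin
    (both are assumed hyperparameters for this illustration).
    """
    if not positives:
        raise ValueError("need at least one positive triple")
    pos_scores = sorted(score(t) for t in positives)
    threshold = pos_scores[int(quantile * (len(pos_scores) - 1))] - margin
    return [t for t in unlabeled if score(t) < threshold]
```

In this sketch, a larger `margin` trades the number of negatives for confidence that each retained triple is truly false, which matches the abstract's stated goal of gathering high-confidence negatives from few samples.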




Updated: 2024-01-12