Document-Level Relation Extraction with Progressive Self-Distillation
ACM Transactions on Information Systems (IF 5.6). Pub Date: 2024-04-08. DOI: 10.1145/3656168
Quan Wang, Zhendong Mao, Jie Gao, Yongdong Zhang

Document-level relation extraction (RE) aims to simultaneously predict relations (including the no-relation case, denoted NA) between all entity pairs in a document. It is typically formulated as a relation classification task over pre-detected entities and solved under a hard-label training regime, which, however, neglects the divergence within the NA class and the correlations among the other classes. This article introduces progressive self-distillation (PSD), a new training regime that employs online self-knowledge distillation (KD) to produce and incorporate soft labels for document-level RE. The key idea of PSD is to gradually soften the hard labels using past predictions from the RE model itself, adjusted adaptively as training proceeds. As such, PSD needs to learn only one RE model within a single training pass, requiring no extra computation or annotation to pretrain a separate high-capacity teacher. PSD is conceptually simple, easy to implement, and generally applicable to various RE models, further improving their performance without introducing additional parameters or significantly increasing training overhead. It is also a general framework that can be flexibly extended to distill various types of knowledge rather than being restricted to soft labels. Extensive experiments on four benchmark datasets verify the effectiveness and generality of the proposed approach. The code is available at https://github.com/GaoJieCN/psd.
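To make the label-softening mechanism concrete, below is a minimal, self-contained sketch of this kind of progressive self-distillation on a toy classifier. It is an illustration of the idea stated in the abstract, not the authors' implementation: the linear `alpha_schedule`, the choice of KL-divergence loss, and all model and hyperparameter settings are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def alpha_schedule(epoch: int, total_epochs: int, alpha_max: float = 0.5) -> float:
    """Hypothetical schedule: the weight on the model's own past predictions
    grows linearly, so the labels are softened progressively during training."""
    return alpha_max * epoch / max(total_epochs - 1, 1)

def soften_targets(hard_labels: torch.Tensor,   # (batch, num_classes), one-hot
                   past_probs: torch.Tensor,    # (batch, num_classes), detached
                   alpha: float) -> torch.Tensor:
    """Blend the hard labels with the model's earlier predictions of itself."""
    return (1.0 - alpha) * hard_labels + alpha * past_probs

# Toy setup: a linear relation classifier over entity-pair features;
# class 0 plays the role of NA.
num_classes, batch, total_epochs = 5, 4, 3
model = torch.nn.Linear(16, num_classes)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

features = torch.randn(batch, 16)
hard = F.one_hot(torch.randint(num_classes, (batch,)), num_classes).float()

past_probs = hard.clone()  # before any training, fall back to the hard labels
for epoch in range(total_epochs):
    logits = model(features)
    targets = soften_targets(hard, past_probs, alpha_schedule(epoch, total_epochs))
    # Train against the softened targets with a KL-divergence loss.
    loss = F.kl_div(logits.log_softmax(dim=-1), targets, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Remember this round's predictions as the "past" predictions for the next.
    past_probs = logits.detach().softmax(dim=-1)
```

Note that `past_probs` is detached from the computation graph, so only the single student model is ever trained, mirroring the no-extra-teacher property the abstract describes.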




Updated: 2024-04-08