Boundary-Aware Abstractive Summarization with Entity-Augmented Attention for Enhancing Faithfulness, ACM Transactions on Asian and Low-Resource Language Information Processing

Boundary-Aware Abstractive Summarization with Entity-Augmented Attention for Enhancing Faithfulness
ACM Transactions on Asian and Low-Resource Language Information Processing (IF 2), Pub Date: 2024-04-15, DOI: 10.1145/3641278
Jiuyi Li, Junpeng Liu, Jianjun Ma, Wei Yang, Degen Huang

With the successful application of deep learning, document summarization systems can produce more readable results. However, abstractive summarization still suffers from unfaithful outputs and factual errors, especially in named entities. Current approaches tend to employ external knowledge to improve model performance while neglecting the boundary information and the semantics of the entities. In this article, we propose an entity-augmented method (EAM) that encourages the model to make full use of entity boundary information and to pay more attention to critical entities. Experimental results on three Chinese and English summarization datasets show that our method outperforms several strong baselines and achieves state-of-the-art performance on the CLTS dataset. Our method also improves the faithfulness of the summary and generalizes well to different pre-trained language models. In addition, we propose a method to evaluate the integrity of generated entities. Finally, we adapt the data augmentation method of the FactCC model to account for grammatical differences between Chinese and English, and train a new model for evaluating factual consistency in Chinese summarization.

Updated: 2024-04-15
