Infusing factual knowledge into pre-trained model for finding the contributions from the research articles
Journal of Information Science (IF 2.4), Pub Date: 2024-01-17, DOI: 10.1177/01655515231211436
Komal Gupta 1, Tirthankar Ghosal 1,2, Asif Ekbal 1

The growing volume of scientific literature makes it difficult for researchers to identify the key contributions of a research paper. Automating this process would enable faster comprehension, literature surveys and comparisons, helping researchers identify relevant and impactful information with less time and effort. In this article, we address the challenge of identifying the contributions made in research articles. We propose a method that infuses factual knowledge from a scientific knowledge graph into a pre-trained model. We divide the knowledge graph into mutually exclusive subgroups and infuse the knowledge into the pre-trained model using adapters. We also construct a scientific knowledge graph consisting of 3,600 Natural Language Processing (NLP) papers to acquire factual knowledge. In addition, we annotate a new test set to evaluate the model’s ability to identify sentences that state significant contributions of the papers. Our model achieves the best performance in comparison to previous methods, with relative improvements of 40.06% and 25.28% in F1 score for identifying contributing sentences on the NLPContributionGraph (NCG) test set and the newly annotated test set, respectively.
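The abstract describes partitioning a knowledge graph into mutually exclusive subgroups and injecting each subgroup's knowledge into a pre-trained model through adapters. The following is a minimal, hypothetical sketch of that adapter pattern in plain NumPy: all dimensions, subgroup names and the averaging fusion step are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 16      # transformer hidden size (assumed for illustration)
BOTTLENECK = 4   # adapter bottleneck size (assumed)

def make_adapter(hidden=HIDDEN, bottleneck=BOTTLENECK):
    """One bottleneck adapter: down-projection and up-projection weights."""
    return {
        "W_down": rng.normal(0.0, 0.02, (hidden, bottleneck)),
        "W_up": rng.normal(0.0, 0.02, (bottleneck, hidden)),
    }

def adapter_forward(adapter, h):
    """Residual adapter applied to hidden states h of shape (seq_len, hidden)."""
    z = np.maximum(h @ adapter["W_down"], 0.0)   # down-project + ReLU
    return h + z @ adapter["W_up"]               # up-project + residual add

# One adapter per mutually exclusive KG subgroup (subgroup names are made up).
subgroups = ["methods", "tasks", "datasets"]
adapters = {name: make_adapter() for name in subgroups}

# Hidden states for a 5-token sentence from a frozen pre-trained encoder.
h = rng.normal(size=(5, HIDDEN))
outputs = {name: adapter_forward(a, h) for name, a in adapters.items()}

# Fuse the knowledge-specific views; simple averaging is one possible choice.
fused = np.mean([outputs[n] for n in subgroups], axis=0)
print(fused.shape)  # (5, 16)
```

Because only the small adapter matrices would be trained, each subgroup's factual knowledge can be learned in isolation while the pre-trained weights stay frozen, which is the usual motivation for adapter-based infusion.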

Updated: 2024-01-17