Abstract
Document-level relation extraction (RE) aims to predict the relations, including the no-relation case denoted NA, between all entity pairs in a document simultaneously. It is typically formulated as a relation classification task over pre-detected entities and solved with a hard-label training regime, which neglects both the divergence within the NA class and the correlations among the other classes. This article introduces progressive self-distillation (PSD), a new training regime that employs online self-knowledge distillation (KD) to produce and incorporate soft labels for document-level RE. The key idea of PSD is to gradually soften the hard labels using the RE model's own past predictions, which are adjusted adaptively as training proceeds. PSD therefore learns only one RE model in a single training pass and requires no extra computation or annotation to pretrain a separate high-capacity teacher. PSD is conceptually simple, easy to implement, and generally applicable: it can further improve the performance of various RE models without introducing additional parameters or significantly increasing training overhead. It is also a general framework that can be flexibly extended to distill other types of knowledge beyond soft labels. Extensive experiments on four benchmark datasets verify the effectiveness and generality of the proposed approach. The code is available at https://github.com/GaoJieCN/psd.
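To make the mechanism concrete, below is a minimal PyTorch sketch of the kind of progressive label softening the abstract describes: one-hot hard labels are blended with the model's own cached predictions, and the blend leans more on the self-predictions as training proceeds. The linear alpha schedule, the temperature value, and the helper names (`psd_targets`, `psd_loss`) are illustrative assumptions, not the paper's exact formulation; see the released code for the authors' actual method.

```python
# A minimal sketch of progressive self-distillation, under the assumptions
# stated above. Hard labels are gradually softened with the model's own
# past predictions; the mixing weight grows over training.
import torch
import torch.nn.functional as F


def psd_targets(hard_labels: torch.Tensor,
                past_logits: torch.Tensor,
                step: int,
                total_steps: int,
                temperature: float = 2.0) -> torch.Tensor:
    """Blend one-hot hard labels with the model's earlier soft predictions.

    hard_labels: (batch,) class indices (NA is just one of the classes).
    past_logits: (batch, num_classes) logits cached from an earlier pass
                 of the same model -- no separate teacher is needed.
    """
    num_classes = past_logits.size(-1)
    one_hot = F.one_hot(hard_labels, num_classes).float()
    # Temperature-smoothed self-predictions; detached so no gradient
    # flows through the cached logits.
    soft = F.softmax(past_logits.detach() / temperature, dim=-1)
    # Progressive schedule (assumed linear here): rely more on
    # self-predictions as training proceeds.
    alpha = step / max(total_steps, 1)
    return (1.0 - alpha) * one_hot + alpha * soft


def psd_loss(current_logits: torch.Tensor,
             hard_labels: torch.Tensor,
             past_logits: torch.Tensor,
             step: int,
             total_steps: int) -> torch.Tensor:
    """Soft cross-entropy against the progressively softened targets."""
    targets = psd_targets(hard_labels, past_logits, step, total_steps)
    log_probs = F.log_softmax(current_logits, dim=-1)
    return -(targets * log_probs).sum(dim=-1).mean()
```

In a full training loop the cached logits would be refreshed periodically from the model's own outputs, which is what keeps the approach to a single model trained in a single pass, consistent with the no-extra-teacher property claimed in the abstract.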