Abstract
Document-level relation extraction (RE) aims to predict the relations, including the no-relation case denoted NA, between all entity pairs in a document simultaneously. It is typically formulated as a relation classification task over pre-detected entities and solved with a hard-label training regime, which neglects both the divergence within the NA class and the correlations among the other classes. This article introduces progressive self-distillation (PSD), a new training regime that employs online self-knowledge distillation (KD) to produce and incorporate soft labels for document-level RE. The key idea of PSD is to gradually soften the hard labels using the RE model's own past predictions, which are adjusted adaptively as training proceeds. PSD therefore learns only one RE model in a single training pass and requires no extra computation or annotation to pretrain a separate high-capacity teacher. PSD is conceptually simple, easy to implement, and generally applicable: it can further improve the performance of various RE models without introducing additional parameters or significantly increasing training overhead. It is also a general framework that can be flexibly extended to distill other types of knowledge beyond soft labels. Extensive experiments on four benchmark datasets verify the effectiveness and generality of the proposed approach. The code is available at https://github.com/GaoJieCN/psd.
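To make the mechanism concrete, below is a minimal PyTorch sketch of the kind of progressive label softening the abstract describes: one-hot hard labels are blended with the model's own cached predictions, and the blend leans more on the self-predictions as training proceeds. The linear alpha schedule, the temperature value, and the helper names (`psd_targets`, `psd_loss`) are illustrative assumptions, not the paper's exact formulation; see the released code for the authors' actual method.

```python
# A minimal sketch of progressive self-distillation, under the assumptions
# stated above. Hard labels are gradually softened with the model's own
# past predictions; the mixing weight grows over training.
import torch
import torch.nn.functional as F


def psd_targets(hard_labels: torch.Tensor,
                past_logits: torch.Tensor,
                step: int,
                total_steps: int,
                temperature: float = 2.0) -> torch.Tensor:
    """Blend one-hot hard labels with the model's earlier soft predictions.

    hard_labels: (batch,) class indices (NA is just one of the classes).
    past_logits: (batch, num_classes) logits cached from an earlier pass
                 of the same model -- no separate teacher is needed.
    """
    num_classes = past_logits.size(-1)
    one_hot = F.one_hot(hard_labels, num_classes).float()
    # Temperature-smoothed self-predictions; detached so no gradient
    # flows through the cached logits.
    soft = F.softmax(past_logits.detach() / temperature, dim=-1)
    # Progressive schedule (assumed linear here): rely more on
    # self-predictions as training proceeds.
    alpha = step / max(total_steps, 1)
    return (1.0 - alpha) * one_hot + alpha * soft


def psd_loss(current_logits: torch.Tensor,
             hard_labels: torch.Tensor,
             past_logits: torch.Tensor,
             step: int,
             total_steps: int) -> torch.Tensor:
    """Soft cross-entropy against the progressively softened targets."""
    targets = psd_targets(hard_labels, past_logits, step, total_steps)
    log_probs = F.log_softmax(current_logits, dim=-1)
    return -(targets * log_probs).sum(dim=-1).mean()
```

In a full training loop the cached logits would be refreshed periodically from the model's own outputs, which is what keeps the approach to a single model trained in a single pass, consistent with the no-extra-teacher property claimed in the abstract.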