research-article
Free Access
Just Accepted

Document-Level Relation Extraction with Progressive Self-Distillation

Online AM: 8 April 2024

Abstract

Document-level relation extraction (RE) aims to simultaneously predict the relations (including the no-relation case, denoted NA) between all entity pairs in a document. It is typically formulated as a relation classification task over pre-detected entities and solved under a hard-label training regime, which neglects both the divergence of the NA class and the correlations among the other classes. This article introduces progressive self-distillation (PSD), a new training regime that employs online self-knowledge distillation (KD) to produce and incorporate soft labels for document-level RE. The key idea of PSD is to gradually soften hard labels using the RE model's own past predictions, which are adjusted adaptively as training proceeds. Consequently, PSD learns a single RE model in one training pass and requires no extra computation or annotation to pretrain a separate high-capacity teacher. PSD is conceptually simple, easy to implement, and generally applicable to various RE models, further improving their performance without introducing additional parameters or significantly increasing training overhead. It is also a general framework that can be flexibly extended to distill various types of knowledge beyond soft labels. Extensive experiments on four benchmark datasets verify the effectiveness and generality of the proposed approach. The code is available at https://github.com/GaoJieCN/psd.
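The core mechanism the abstract describes, gradually softening hard labels with the model's own past predictions, can be sketched as a simple interpolation whose weight grows with training progress. This is a minimal illustration only: the linear schedule, the `alpha_max` cap, and the function name are hypothetical stand-ins, not the paper's actual adaptive adjustment.

```python
import numpy as np

def soften_labels(hard_labels, past_probs, epoch, total_epochs, alpha_max=0.5):
    """Blend one-hot hard labels with the model's own past predictions.

    The mixing weight alpha grows with training progress, so the targets
    start as pure hard labels and are progressively softened toward the
    model's self-predicted distribution (hypothetical linear schedule).
    """
    alpha = alpha_max * epoch / total_epochs  # progress-dependent weight
    return (1.0 - alpha) * hard_labels + alpha * past_probs

# Toy example with 3 relation classes (class 0 playing the role of NA).
hard = np.array([0.0, 1.0, 0.0])   # ground-truth one-hot label
past = np.array([0.1, 0.7, 0.2])   # model's prediction from a past step
soft_early = soften_labels(hard, past, epoch=0, total_epochs=10)   # == hard
soft_late = soften_labels(hard, past, epoch=10, total_epochs=10)   # half-blended
```

Because the soft targets come from the same model being trained, no separate teacher network is needed, which matches the "single training pass, no extra teacher" property claimed above.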



Published in: ACM Transactions on Information Systems (Just Accepted)
ISSN: 1046-8188
EISSN: 1558-2868

      Copyright © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Online AM: 8 April 2024
      • Accepted: 27 March 2024
      • Revised: 25 February 2024
      • Received: 5 October 2023
