Medical Question Summarization with Entity-driven Contrastive Learning

Published: 15 April 2024

Abstract

By summarizing long consumer health questions into shorter, essential ones, medical question-answering systems can more accurately understand consumer intent and retrieve suitable answers. However, medical question summarization (MQS) is very challenging because patients and doctors describe health problems in markedly different ways. Although deep learning has been applied successfully to the MQS task, two challenges remain: how to correctly capture the question focus so as to model its semantic intent, and how to obtain reliable datasets for fair performance evaluation. To address these challenges, this article proposes a novel medical question summarization framework based on entity-driven contrastive learning (ECL). ECL employs the medical entities present in frequently asked questions (FAQs) as focuses and devises an effective mechanism to generate hard negative samples. This compels models to attend to essential information and consequently generate more accurate question summaries. Furthermore, we have discovered that some MQS datasets suffer from significant data leakage; for example, the iCliniq dataset has a 33% duplicate rate. To ensure an impartial evaluation of the related methods, this article carefully examines the leaked samples and reorganizes the datasets on a more reasonable basis. Extensive experiments demonstrate that our ECL method outperforms existing methods and achieves new state-of-the-art ROUGE-1 scores of 52.85, 43.16, 41.31, and 43.52 on the MeQSum, CHQ-Summ, iCliniq, and HealthCareMagic datasets, respectively. The code and datasets are available at https://github.com/yrbobo/MQS-ECL
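The abstract describes the hard-negative mechanism only at a high level. The sketch below illustrates the general idea of entity-driven hard negatives paired with an InfoNCE-style contrastive loss: a negative is built by swapping a medical entity in the question for a different one, yielding a sentence that is lexically close but semantically wrong, and the loss pulls the anchor embedding toward the positive while pushing it away from such negatives. All function names, the entity-swap heuristic, and the use of cosine similarity here are illustrative assumptions, not details taken from the ECL paper.

```python
import math
import random

def swap_entity(question, entities, vocabulary):
    # Hypothetical hard-negative generator: replace one medical entity
    # mentioned in the question with a different entity from the vocabulary,
    # producing a lexically similar but semantically incorrect question.
    present = [e for e in entities if e in question]
    if not present:
        return question
    old = random.choice(present)
    new = random.choice([e for e in vocabulary if e != old])
    return question.replace(old, new)

def info_nce(anchor, positive, negatives, temperature=0.1):
    # Standard InfoNCE contrastive loss over cosine similarities:
    # maximize agreement between anchor and positive embeddings while
    # minimizing agreement with the hard-negative embeddings.
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    logits = [cos(anchor, positive) / temperature]
    logits += [cos(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return -math.log(exps[0] / sum(exps))
```

For example, swapping the entity in "what causes migraine headaches" might yield "what causes asthma headaches" as a hard negative; the loss is then small when the anchor embedding is close to the positive and far from the negatives, and large otherwise. In the actual framework, the embeddings would come from a pretrained encoder rather than the toy vectors used here.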



Published in

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 23, Issue 4
April 2024, 221 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3613577
Editor: Imed Zitouni


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 21 August 2023
• Revised: 27 December 2023
• Accepted: 24 February 2024
• Online AM: 11 March 2024
• Published: 15 April 2024
