Abstract
Semi-supervised learning is a promising approach to the problem of insufficient labeled data. Recent methods, grouped into the paradigms of consistency regularization and pseudo-labeling, perform outstandingly on image data but yield only limited improvements on text, because they neglect the discrete nature of textual information and lack high-quality text augmentation transformations. In this paper, we propose the novel SeqMatch method. SeqMatch automatically detects abnormal model states caused by anomalous augmented text, reduces their interference, and instead leverages normal states to strengthen consistency regularization. It also generates hard artificial pseudo-labels, allowing the model to be updated efficiently and optimized toward low entropy. In addition, we design several stronger, well-organized text augmentation pipelines that increase the divergence between the two views of an unlabeled discrete text sequence, enabling the model to learn more from aligning them. Extensive comparative experiments show that SeqMatch significantly outperforms previous methods on three widely used benchmarks. In particular, with a minimal number of labeled examples, SeqMatch achieves a maximum performance improvement of 16.4% over purely supervised training.
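The confidence-thresholded hard pseudo-labeling that the abstract describes can be sketched in a few lines. This is a minimal FixMatch-style illustration, not the paper's actual implementation; the function names and the 0.95 threshold are assumptions for the example:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pseudo_label_loss(weak_logits, strong_logits, threshold=0.95):
    """Unlabeled-example loss in the FixMatch style.

    If the model's prediction on the weakly augmented view is confident
    enough, its argmax becomes a hard pseudo-label, and we compute
    cross-entropy against the prediction on the strongly augmented view.
    Low-confidence examples are masked out and contribute no loss.
    """
    probs_weak = softmax(weak_logits)
    confidence = max(probs_weak)
    if confidence < threshold:
        return 0.0  # uncertain prediction: skip this example
    target = probs_weak.index(confidence)  # hard pseudo-label (argmax)
    probs_strong = softmax(strong_logits)
    return -math.log(probs_strong[target])  # cross-entropy on strong view
```

A confident weak-view prediction yields a positive cross-entropy term pulling the strong view toward the same class, while an unconfident one is silently dropped; SeqMatch's contribution, per the abstract, is deciding when such augmented views are anomalous rather than trusting the threshold alone.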
Cite this article
Zhang, X., Tan, Z., Lu, F. et al. Adaptive semi-supervised learning from stronger augmentation transformations of discrete text information. Knowl Inf Syst (2024). https://doi.org/10.1007/s10115-024-02100-y