A semi-supervised method to generate a Persian dataset for suggestion classification

Safari, Leila; Mohammady, Zanyar

doi:10.1007/s10579-023-09688-7

Original Paper
Published: 29 September 2023

Language Resources and Evaluation (2023)

Abstract

Suggestion mining has become a popular topic in natural language processing (NLP), with applications such as service and product improvement. This study provides an automated, machine learning (ML) based approach to extracting suggestions from Persian text. First, a novel two-step semi-supervised method is proposed to generate a Persian dataset called ParsSugg, which is then used for the automatic classification of users’ suggestions. The first step is manual labeling of data according to a proposed guideline, followed by a data augmentation phase. In the second step, more data are labeled using the pre-trained Persian Bidirectional Encoder Representations from Transformers model (ParsBERT) as a classifier, together with the data from the first step. The performance of several ML models, including Support Vector Machine (SVM), Random Forest (RF), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and the ParsBERT language model, is examined on the generated dataset. F-scores of 97.27 for ParsBERT and about 94.5 for the SVM and CNN classifiers were obtained for the suggestion class, a promising result for the first study of suggestion classification on Persian text. In addition, the proposed guideline can be used for other NLP tasks, and the generated dataset can be used in other suggestion classification tasks.
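The second labeling step and the per-class F-score reported in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the confidence threshold, the function names, and the toy keyword classifier (a stand-in for a fine-tuned ParsBERT model) are all assumptions made for illustration, and the example sentences are English placeholders rather than Persian data.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a fine-tuned classifier such as ParsBERT:
# maps a text to (label, confidence). Label 1 = suggestion, 0 = non-suggestion.
Classifier = Callable[[str], Tuple[int, float]]


def pseudo_label(
    classify: Classifier,
    unlabeled: List[str],
    threshold: float = 0.9,  # assumed cut-off, not taken from the paper
) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Split unlabeled texts into confidently labeled pairs and a remainder."""
    accepted, remainder = [], []
    for text in unlabeled:
        label, confidence = classify(text)
        if confidence >= threshold:
            accepted.append((text, label))
        else:
            remainder.append(text)
    return accepted, remainder


def f1_for_class(gold: List[int], pred: List[int], positive: int = 1) -> float:
    """F-score of one class: harmonic mean of its precision and recall."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def toy_classifier(text: str) -> Tuple[int, float]:
    """Toy keyword rule used only so the sketch runs without a model."""
    if "should" in text:
        return 1, 0.95
    if "maybe" in text:
        return 1, 0.55
    return 0, 0.92


pool = ["you should add a dark mode", "maybe it could be faster", "delivery was late"]
accepted, remainder = pseudo_label(toy_classifier, pool)
score = f1_for_class(gold=[1, 1, 0, 0], pred=[1, 0, 0, 1])
```

In this sketch, only predictions at or above the assumed threshold are added to the labeled set; the low-confidence sentence stays in the remainder for manual review or a later round. The F-score helper computes the metric for the suggestion class alone, which is the quantity the abstract reports.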

Author information

Authors and Affiliations

Department of Computer Engineering, University of Zanjan, Zanjan, 4537138791, Iran
Leila Safari & Zanyar Mohammady

Corresponding author

Correspondence to Leila Safari.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Safari, L., Mohammady, Z. A semi-supervised method to generate a Persian dataset for suggestion classification. Lang Resources & Evaluation (2023). https://doi.org/10.1007/s10579-023-09688-7

Accepted: 07 August 2023
Published: 29 September 2023
DOI: https://doi.org/10.1007/s10579-023-09688-7
