
A benchmark for evaluating Arabic word embedding models

Published online by Cambridge University Press: 17 October 2022

Sane Yagi
Affiliation:
Department of Foreign Languages, University of Sharjah, Sharjah, UAE
Ashraf Elnagar*
Affiliation:
Department of Computer Science, University of Sharjah, Sharjah, UAE
Shehdeh Fareh
Affiliation:
Department of Foreign Languages, University of Sharjah, Sharjah, UAE
*Corresponding author. E-mail: ashraf@sharjah.ac.ae

Abstract

Modelling the distributional semantics of a morphologically rich language such as Arabic needs to take into account its introflexive, fusional, and inflectional nature, attributes that make up its combinatorial sequences and substitutional paradigms. To evaluate such word distributional models, the benchmarks used thus far in Arabic have mimicked those in English. This paper reports on a benchmark that we designed to reflect linguistic patterns in both Contemporary Arabic and Classical Arabic, the first being a cover term for written and spoken Modern Standard Arabic and the second for pre-modern Arabic. The analogy items included in this benchmark were chosen in a transparent manner so that they capture the major features of nouns and verbs; derivational and inflectional morphology; high-, middle-, and low-frequency patterns and lexical items; and the morphosemantic, morphosyntactic, and semantic dimensions of the language. All categories included in this benchmark were carefully selected to ensure proper representation of the language. The benchmark consists of 45 roots of the triliteral, all-consonantal, and semivowel-inclusive types; six morphosemantic patterns (’af‘ala; ifta‘ala; infa‘ala; istaf‘ala; tafa‘‘ala; and tafā‘ala); five derivations (the verbal noun, active participle, and the contrasts in Masculine-Feminine, Feminine-Singular-Plural, and Masculine-Singular-Plural); morphosyntactic transformations (perfect and imperfect verbs conjugated for all pronouns); and lexical semantics (synonyms, antonyms, and hyponyms of nouns, verbs, and adjectives), as well as capital cities and currencies. All categories include an equal proportion of high-, medium-, and low-frequency items. To validate the proposed benchmark, we developed a set of embedding models from different textual sources. We then tested them intrinsically using the proposed benchmark and extrinsically using two natural language processing tasks: Arabic Named Entity Recognition and Text Classification. The evaluation leads to the conclusion that the proposed benchmark truly reflects this morphologically rich language and discriminates between word embeddings.
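As an illustration of what intrinsic testing with analogy items can look like, the following is a minimal sketch that scores analogy quadruples with the widely used vector-offset method (often called 3CosAdd). The abstract does not state the scoring procedure the authors used, so the method choice, the function names `solve_analogy` and `analogy_accuracy`, and the hand-made three-dimensional toy vectors (with unaccented transliterations) are all assumptions made for illustration only.

```python
import numpy as np


def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset (3CosAdd) rule.

    `vectors` maps each word to a NumPy array. The three query words are
    excluded from the candidate set, as is customary for this test.
    """
    def unit(word):
        v = vectors[word]
        return v / np.linalg.norm(v)

    candidates = [w for w in vectors if w not in (a, b, c)]
    matrix = np.stack([unit(w) for w in candidates])
    target = unit(b) - unit(a) + unit(c)
    scores = matrix @ target  # proportional to the cosine with the target
    return candidates[int(np.argmax(scores))]


def analogy_accuracy(vectors, items):
    """Fraction of (a, b, c, d) items whose predicted answer equals d."""
    correct = sum(solve_analogy(vectors, a, b, c) == d for a, b, c, d in items)
    return correct / len(items)


# Hypothetical toy embeddings: axes 0 and 1 stand for two roots (k-t-b, d-r-s),
# axis 2 for the derivational pattern. Real models have hundreds of dimensions.
toy_vectors = {
    "kataba": np.array([1.0, 0.0, 0.0]),    # perfect verb, 'he wrote'
    "katib": np.array([1.0, 0.0, 1.0]),     # active participle, 'writer'
    "maktab": np.array([1.0, 0.0, -1.0]),   # noun of place, 'office'
    "darasa": np.array([0.0, 1.0, 0.0]),    # perfect verb, 'he studied'
    "daris": np.array([0.0, 1.0, 1.0]),     # active participle, 'one who studies'
    "madrasa": np.array([0.0, 1.0, -1.0]),  # noun of place, 'school'
}

# Two active-participle analogy items built from the toy vocabulary.
toy_items = [
    ("kataba", "katib", "darasa", "daris"),
    ("darasa", "daris", "kataba", "katib"),
]

accuracy = analogy_accuracy(toy_vectors, toy_items)
```

In a sketch like this, per-category accuracy (for example, one score for each derivation or frequency band) is what would let a benchmark distinguish between embedding models; the toy data above only demonstrates the mechanics.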

Type
Article
Copyright
© The Author(s), 2022. Published by Cambridge University Press

