A benchmark for evaluating Arabic word embedding models

Sane Yagi; Ashraf Elnagar; Shehdeh Fareh

doi:10.1017/S1351324922000444

Published online by Cambridge University Press: 17 October 2022

Sane Yagi: Department of Foreign Languages, University of Sharjah, Sharjah, UAE
Ashraf Elnagar*: Department of Computer Science, University of Sharjah, Sharjah, UAE
Shehdeh Fareh: Department of Foreign Languages, University of Sharjah, Sharjah, UAE
*Corresponding author. E-mail: ashraf@sharjah.ac.ae

Abstract

Modelling the distributional semantics of a morphologically rich language such as Arabic needs to take into account its introflexive, fusional, and inflectional nature, attributes that shape its combinatorial sequences and substitutional paradigms. To evaluate such distributional word models, the benchmarks used thus far for Arabic have mimicked those designed for English. This paper reports on a benchmark that we designed to reflect linguistic patterns in both Contemporary Arabic and Classical Arabic, the first being a cover term for written and spoken Modern Standard Arabic and the second for pre-modern Arabic. The analogy items included in this benchmark were chosen in a transparent manner so that they capture the major features of nouns and verbs; derivational and inflectional morphology; high-, medium-, and low-frequency patterns and lexical items; and the morphosemantic, morphosyntactic, and semantic dimensions of the language. All categories were carefully selected to ensure proper representation of the language. The benchmark consists of 45 roots of the triliteral, all-consonantal, and semivowel-inclusive types; six morphosemantic patterns (’af‘ala, ifta‘ala, infa‘ala, istaf‘ala, tafa‘‘ala, and tafā‘ala); five derivations (the verbal noun, the active participle, and the contrasts Masculine-Feminine, Feminine-Singular-Plural, and Masculine-Singular-Plural); morphosyntactic transformations (perfect and imperfect verbs conjugated for all pronouns); and lexical semantics (synonyms, antonyms, and hyponyms of nouns, verbs, and adjectives), as well as capital cities and currencies. All categories include equal proportions of high-, medium-, and low-frequency items. To validate the proposed benchmark, we developed a set of embedding models from different textual sources. We then tested them intrinsically using the proposed benchmark and extrinsically using two natural language processing tasks: Arabic Named Entity Recognition and Text Classification. The evaluation leads to the conclusion that the proposed benchmark is truly reflective of this morphologically rich language and discriminatory of word embeddings.
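
To make the intrinsic evaluation concrete, the following is a minimal sketch of how analogy items of the kind described above might be scored against an embedding model. It assumes the common vector-offset (3CosAdd) scoring rule, which the abstract itself does not specify. The function names, the toy three-dimensional vectors, and the transliterated word forms (perfect verb and active participle pairs) are all hypothetical and chosen only for illustration; they are not taken from the benchmark.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def solve_analogy(emb, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset rule.

    Returns the vocabulary word whose vector is closest (by cosine) to
    b - a + c, excluding the three query words themselves.
    """
    target = [vb - va + vc for va, vb, vc in zip(emb[a], emb[b], emb[c])]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))

def analogy_accuracy(emb, items):
    """Fraction of analogy items (a, b, c, d) answered correctly.

    Items containing out-of-vocabulary words are skipped (an assumption;
    a benchmark could equally count them as errors).
    """
    usable = [it for it in items if all(w in emb for w in it)]
    if not usable:
        return 0.0
    correct = sum(solve_analogy(emb, a, b, c) == d for a, b, c, d in usable)
    return correct / len(usable)

# Hypothetical toy embeddings: the second dimension loosely encodes the
# verb-to-active-participle shift. Real models have hundreds of dimensions.
TOY_EMB = {
    "kataba": [1.0, 0.0, 0.1],   # 'he wrote'
    "katib":  [1.0, 1.0, 0.1],   # 'writer' (active participle)
    "darasa": [0.0, 0.1, 1.0],   # 'he studied'
    "daris":  [0.0, 1.1, 1.0],   # 'one who studies' (active participle)
    "qalam":  [0.5, -0.5, 0.5],  # 'pen' (distractor)
}

TOY_ITEMS = [
    ("kataba", "katib", "darasa", "daris"),
    ("darasa", "daris", "kataba", "katib"),
]

if __name__ == "__main__":
    print(solve_analogy(TOY_EMB, "kataba", "katib", "darasa"))
    print(analogy_accuracy(TOY_EMB, TOY_ITEMS))
```

In a setting like the one the abstract describes, accuracy would presumably be reported per category (for example, per morphosemantic pattern or frequency band) so that models can be compared on each linguistic dimension separately.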

Keywords

Language Resources; Semantics; Similarity; Syntax; Arabic embedding models

Type: Article
Information: Natural Language Engineering, Volume 29, Issue 4, July 2023, pp. 978–1003

DOI: https://doi.org/10.1017/S1351324922000444
Copyright: © The Author(s), 2022. Published by Cambridge University Press
