-
ArEntail: manually-curated Arabic natural language inference dataset from news headlines Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-04-22 Rasha Obeidat, Yara Al-Harahsheh, Mahmoud Al-Ayyoub, Maram Gharaibeh
-
Faux Hate: unravelling the web of fake narratives in spreading hateful stories: a multi-label and multi-class dataset in cross-lingual Hindi-English code-mixed text Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-04-16 Shankar Biradar, Sunil Saumya, Arun Chauhan
-
Depression symptoms modelling from social media text: an LLM driven semi-supervised learning approach Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-04-04 Nawshad Farruque, Randy Goebel, Sudhakar Sivapalan, Osmar R. Zaïane
A fundamental component of user-level social media language based clinical depression modelling is depression symptoms detection (DSD). Unfortunately, there does not exist any DSD dataset that reflects both the clinical insights and the distribution of depression symptoms from the samples of self-disclosed depressed population. In our work, we describe a semi-supervised learning (SSL) framework which
-
A morphologically annotated longitudinal corpus of spoken Czech child–adult interactions Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-03-30
The paper presents a longitudinal corpus of transcribed spontaneous child–adult interactions in Czech. It consists of 99,388 tokens in 42,103 utterances produced by seven children between ca 1.5 and 3.5 years of age, and 238,211 tokens in 61,252 utterances produced by their close caregivers in everyday situations at home. The corpus covers language production of the children from the mean
-
TCMeta: a multilingual dataset of COVID tweets for relation-level metaphor analysis Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-03-30
The COVID pandemic spurred the use of various metaphors, some very common and universal, others depending on the language, country and culture. The use of metaphors by the general public, especially in languages other than English, has not yet been sufficiently investigated, one of the reasons being the lack of resources and automatic tools for metaphor analysis. To fill this gap, we introduce
-
A longitudinal multi-modal dataset for dementia monitoring and diagnosis Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-03-30
Dementia affects cognitive functions of adults, including memory, language, and behaviour. Standard diagnostic biomarkers such as MRI are costly, whilst neuropsychological tests suffer from sensitivity issues in detecting dementia onset. The analysis of speech and language has emerged as a promising and non-intrusive technology to diagnose and monitor dementia. Currently, most work in this
-
"Approaches to sentiment analysis of Hungarian political news at the sentence level" Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-03-23
Automated sentiment analysis of textual data is one of the central and most challenging tasks in political communication studies. However, the toolkits available are primarily for English texts and require contextual adaptation to produce valid results, especially concerning morphologically rich languages such as Hungarian. This study introduces (1) a new sentiment and emotion annotation framework
-
DILLo: an Italian lexical database for speech-language pathologists Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-03-23 Federica Beccaria, Angela Cristiano, Flavio Pisciotta, Noemi Usardi, Elisa Borgogni, Filippo Prayer Galletti, Giulia Corsi, Lorenzo Gregori, Gloria Gagliardi
-
Introducing the 3MT_French dataset to investigate the timing of public speaking judgements Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-03-23 Beatrice Biancardi, Mathieu Chollet, Chloé Clavel
-
VeLeRo: an inflected verbal lexicon of standard Romanian and a quantitative analysis of morphological predictability Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-03-23 Borja Herce, Bogdan Pricop
-
An aligned corpus of Spanish bibles Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-03-15 Gerardo Sierra, Gemma Bel-Enguix, Ameyali Díaz-Velasco, Natalia Guerrero-Cerón, Núria Bel
-
SOLD: Sinhala offensive language dataset Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-03-06 Tharindu Ranasinghe, Isuri Anuradha, Damith Premasiri, Kanishka Silva, Hansi Hettiarachchi, Lasitha Uyangodage, Marcos Zampieri
-
Infectious risk events and their novelty in event-based surveillance: new definitions and annotated corpus Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-03-05 François Delon, Gabriel Bédubourg, Léo Bouscarrat, Jean-Baptiste Meynard, Aude Valois, Benjamin Queyriaux, Carlos Ramisch, Marc Tanti
-
Semantic search as extractive paraphrase span detection Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-02-01
In this paper, we approach the problem of semantic search by introducing a task of paraphrase span detection, i.e. given a segment of text as a query phrase, the task is to identify its paraphrase in a given document, the same modelling setup as typically used in extractive question answering. While current work in paraphrasing has almost uniquely focused on sentence-level approaches, the
-
A new methodology for automatic creation of concept maps of Turkish texts Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-01-28 Merve Bayrak, Deniz Dal
-
Large scale annotated dataset for code-mix abusive short noisy text Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-01-25
With globalization and cultural exchange around the globe, most of the population has gained knowledge of at least two languages. The bilingual user base on Social Media Platforms (SMPs) has significantly contributed to the popularity of code-mixing. However, apart from multiple vital uses, SMPs also suffer from abusive text content. Identifying abusive instances for a single language is a challenging
-
A flexible tool for a qualia-enriched FrameNet: the FrameNet Brasil WebTool Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-01-22 Tiago Timponi Torrent, Ely Edison da Silva Matos, Alexandre Diniz da Costa, Maucha Andrade Gamonal, Simone Peron-Corrêa, Vanessa Maria Ramos Lopes Paiva
-
NewsCom-TOX: a corpus of comments on news articles annotated for toxicity in Spanish Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-01-17 Mariona Taulé, Montserrat Nofre, Víctor Bargiela, Xavier Bonet
In this article, we present the NewsCom-TOX corpus, a new corpus manually annotated for toxicity in Spanish. NewsCom-TOX consists of 4359 comments in Spanish posted in response to 21 news articles on social media related to immigration, in order to analyse and identify messages with racial and xenophobic content. This corpus is multi-level annotated with different binary linguistic categories -stance
-
Toxic comment classification and rationale extraction in code-mixed text leveraging co-attentive multi-task learning Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-01-13 Kiran Babu Nelatoori, Hima Bindu Kommanti
-
Multi-layered semantic annotation and the formalisation of annotation schemas for the investigation of modality in a Latin corpus Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-01-06
This paper stems from the project A World of Possibilities. Modal pathways over an extra-long period of time: the diachrony of modality in the Latin language (WoPoss) which involves a corpus-based approach to the study of modality in the history of the Latin language. Linguistic annotation and, in particular, the semantic annotation of modality is a keystone of the project. Besides the difficulties
-
AC-IQuAD: Automatically Constructed Indonesian Question Answering Dataset by Leveraging Wikidata Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-01-03 Kerenza Doxolodeo, Adila Alfa Krisnadhi
-
KurdiSent: a corpus for Kurdish sentiment analysis Lang. Resour. Eval. (IF 2.7) Pub Date : 2024-01-02 Soran Badawi, Arefeh Kazemi, Vali Rezaie
-
Syntactic annotation for Portuguese corpora: standards, parsers, and search interfaces Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-12-26 Pablo Faria, Charlotte Galves, Catarina Magro
-
Linguistic annotation of Byzantine book epigrams Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-12-13 Colin Swaelens, Ilse De Vos, Els Lefever
-
Democratizing neural machine translation with OPUS-MT Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-12-13 Jörg Tiedemann, Mikko Aulamo, Daria Bakshandaeva, Michele Boggia, Stig-Arne Grönroos, Tommi Nieminen, Alessandro Raganato, Yves Scherrer, Raúl Vázquez, Sami Virpioja
-
EmoTwiCS: a corpus for modelling emotion trajectories in Dutch customer service dialogues on Twitter Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-12-08 Sofie Labat, Thomas Demeester, Véronique Hoste
-
When MIPVU goes to no man’s land: a new language resource for hybrid, morpheme-based metaphor identification in Hungarian Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-12-09 Gábor Simon, Tímea Bajzát, Júlia Ballagó, Zsuzsanna Havasi, Emese K. Molnár, Eszter Szlávich
-
Resources building for sentiment analysis of content disseminated by Tunisian medias in social networks Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-12-02 Emna Fsih, Rahma Boujelbane, Lamia Hadrich Belguith
-
A corpus of Persian literary text Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-11-23 Shahab Raji, Malihe Alikhani, Gerard de Melo, Matthew Stone
-
The Reading Everyday Emotion Database (REED): a set of audio-visual recordings of emotions in music and language Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-11-20 Jia Hoong Ong, Florence Yik Nam Leung, Fang Liu
-
A corpus of English learners with Arabic and Hebrew backgrounds Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-11-20 Omaima Abboud, Batia Laufer, Noam Ordan, Uliana Sentsova, Shuly Wintner
Learner corpora—datasets that reflect the language of non-native speakers—are instrumental for research of language learning and development, as well as for practical applications, mainly for teaching and education. Such corpora now exist for a plethora of native–foreign language pairs; but until recently, none of them reflected native Hebrew speakers, and very few reflected native Arabic speakers
-
Brazilian Portuguese corpora for teaching and translation: the CoMET project Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-11-16 Stella E. O. Tagnin
-
Automatic genre identification: a survey Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-11-16 Taja Kuzman, Nikola Ljubešić
-
A multilingual, multimodal dataset of aggression and bias: the ComMA dataset Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-11-16 Ritesh Kumar, Shyam Ratan, Siddharth Singh, Enakshi Nandi, Laishram Niranjana Devi, Akash Bhagat, Yogesh Dawer, Bornini Lahiri, Akanksha Bansal
-
LoNLI: An Extensible Framework for Testing Diverse Logical Reasoning Capabilities for NLI Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-11-04 Ishan Tarunesh, Somak Aditya, Monojit Choudhury
-
Building the VisSE Corpus of Spanish SignWriting Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-10-26 Antonio F. G. Sevilla, Alberto Díaz Esteban, José María Lahoz-Bengoechea
-
A new corpus of geolocated ASR transcripts from Germany Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-10-21 Steven Coats
-
Text augmentation for semantic frame induction and parsing Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-10-21 Saba Anwar, Artem Shelmanov, Nikolay Arefyev, Alexander Panchenko, Chris Biemann
-
Beyond plain toxic: building datasets for detection of flammable topics and inappropriate statements Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-10-21 Nikolay Babakov, Varvara Logacheva, Alexander Panchenko
-
NILC-Metrix: assessing the complexity of written and spoken language in Brazilian Portuguese Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-10-17 Sidney Evaldo Leal, Magali Sanches Duran, Carolina Evaristo Scarton, Nathan Siegle Hartmann, Sandra Maria Aluísio
-
A semi-supervised method to generate a Persian dataset for suggestion classification Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-09-29 Leila Safari, Zanyar Mohammady
-
NEREL: a Russian information extraction dataset with rich annotation for nested entities, relations, and wikidata entity links Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-09-21 Natalia Loukachevitch, Ekaterina Artemova, Tatiana Batura, Pavel Braslavski, Vladimir Ivanov, Suresh Manandhar, Alexander Pugachev, Igor Rozhkov, Artem Shelmanov, Elena Tutubalina, Alexey Yandutov
-
A survey and study impact of tweet sentiment analysis via transfer learning in low resource scenarios Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-09-14 Manoel Veríssimo dos Santos Neto, Nádia Félix F. da Silva, Anderson da Silva Soares
-
An eye-tracking-with-EEG coregistration corpus of narrative sentences Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-08-29 Stefan L. Frank, Anna Aumeistere
-
Data augmentation strategies to improve text classification: a use case in smart cities Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-08-23 Luciana Bencke, Viviane Pereira Moreira
-
The development of a labelled te reo Māori–English bilingual database for language technology Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-08-20 Jesin James, Isabella Shields, Vithya Yogarajan, Peter J. Keegan, Catherine I. Watson, Peter-Lucas Jones, Keoni Mahelona
-
Comparative performance of ensemble machine learning for Arabic cyberbullying and offensive language detection Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-08-13 Marwa Khairy, Tarek M. Mahmoud, Ahmed Omar, Tarek Abd El-Hafeez
-
RUN-AS: a novel approach to annotate news reliability for disinformation detection Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-08-06 Alba Bonet-Jover, Robiert Sepúlveda-Torres, Estela Saquete, Patricio Martínez-Barco, Mario Nieto-Pérez
-
Fine-tuning language models to recognize semantic relations Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-07-23 Dmitri Roussinov, Serge Sharoff, Nadezhda Puchnina
-
Assessment of pragmatic abilities and cognitive substrates (APACS) brief remote: a novel tool for the rapid and tele-evaluation of pragmatic skills in Italian Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-07-23 Luca Bischetti, Chiara Pompei, Biagio Scalingi, Federico Frau, Marta Bosia, Giorgio Arcara, Valentina Bambini
-
The limitations of irony detection in Dutch social media Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-07-23 Aaron Maladry, Els Lefever, Cynthia Van Hee, Véronique Hoste
-
MarIA and BETO are sexist: evaluating gender bias in large language models for Spanish Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-07-23 Ismael Garrido-Muñoz, Fernando Martínez-Santiago, Arturo Montejo-Ráez
-
FullStop: punctuation and segmentation prediction for Dutch with transformers Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-07-14 Vincent Vandeghinste, Oliver Guhr
-
adaptNMT: an open-source, language-agnostic development environment for neural machine translation Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-07-14 Séamus Lankford, Haithem Afli, Andy Way
-
The Visual Language Research Corpus (VLRC): an annotated corpus of comics from Asia, Europe, and the United States Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-07-14 Neil Cohn, Bruno Cardoso, Bien Klomberg, Irmak Hacımusaoğlu
-
Evaluation of a rule-based approach to automatic factual question generation using syntactic and semantic analysis Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-07-10 Angelina Gašpar, Ani Grubišić, Ines Šarić-Grgić
-
The C-ORAL-ESQ project: a corpus for the study of spontaneous speech of individuals with schizophrenia Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-06-27 Tommaso Raso, Bruno Neves Rati de Melo Rocha, João Vinícius Salgado, Breno Fiuza Cruz, Lucas Machado Mantovani, Heliana Mello
-
Sentiment analysis in Portuguese tweets: an evaluation of diverse word representation models Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-06-28 Daniela Vianna, Fernando Carneiro, Jonnathan Carvalho, Alexandre Plastino, Aline Paes
-
CachacaNER: a dataset for named entity recognition in texts about the cachaça beverage Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-06-17 Priscilla Silva, Arthur Franco, Thiago Santos, Mozar Brito, Denilson Pereira
-
The robotic-surgery propositional bank Lang. Resour. Eval. (IF 2.7) Pub Date : 2023-06-13 Marco Bombieri, Marco Rospocher, Simone Paolo Ponzetto, Paolo Fiorini