Enhanced approach of multilabel learning for the Arabic aspect category detection of the hotel reviews
Computational Intelligence (IF 2.8), Pub Date: 2023-11-14, DOI: 10.1111/coin.12609
Asma Ameur 1, 2 , Sana Hamdi 2 , Sadok Ben Yahia 3, 4

In many tasks, such as aspect category detection (ACD) in aspect-based sentiment analysis, each instance must be assigned more than one label at the same time. This study tackles the multilabel classification problem in the ACD task for the Arabic language. We use the Arabic hotel reviews from the SemEval-2016 dataset, which comprise 13,113 annotated tuples split into training (10,509) and testing (2,604) sets. To extract valuable information, we first propose a dedicated data preprocessing step. We then address the dataset's imbalance with a dynamic weighted loss function and a data augmentation method. We develop two approaches for detecting the aspect categories expressed in a review sentence. The first is based on classifier chains built on machine learning models. The second is based on transfer learning, fine-tuning the pretrained AraBERT model for contextual representation. Our findings show that both approaches outperform related work on ACD for the Arabic SemEval-2016 dataset. Moreover, AraBERT fine-tuning performed considerably better and achieved a promising $F_1$-score of $68.02\%$.
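To make the multilabel setting concrete, the following is a minimal sketch of how a review sentence can carry several aspect categories at once and how an $F_1$-score can be computed over such predictions. It is an illustration only, not the authors' evaluation code: the three category names are a small illustrative subset, the gold and predicted label sets are invented, and micro-averaging is an assumption, since the abstract does not state which averaging it uses.

```python
from typing import List, Set

# Illustrative subset of entity#attribute categories (assumption: the
# actual label inventory used in the paper is larger).
CATEGORIES: List[str] = ["HOTEL#GENERAL", "ROOMS#CLEANLINESS", "SERVICE#GENERAL"]


def to_indicator(labels: Set[str]) -> List[int]:
    """Encode a set of category labels as a binary indicator vector."""
    return [1 if category in labels else 0 for category in CATEGORIES]


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Micro-averaged F1: pool true/false positives over all sentences."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # categories predicted and present
        fp += len(p - g)   # categories predicted but absent
        fn += len(g - p)   # categories present but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy data: two sentences, the first with two gold categories.
gold = [{"HOTEL#GENERAL", "SERVICE#GENERAL"}, {"ROOMS#CLEANLINESS"}]
pred = [{"HOTEL#GENERAL"}, {"ROOMS#CLEANLINESS", "SERVICE#GENERAL"}]

score = micro_f1(gold, pred)
print(to_indicator(gold[0]), round(score, 4))
```

Representing each sentence as a set of labels (or an equivalent indicator vector) is what distinguishes this task from ordinary single-label classification: a prediction can be partially correct, which is why precision and recall are pooled over individual category decisions rather than over whole sentences.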
Updated: 2023-11-14