A survey and study impact of tweet sentiment analysis via transfer learning in low resource scenarios
Language Resources and Evaluation (IF 2.7). Pub Date: 2023-09-14. DOI: 10.1007/s10579-023-09687-8
Manoel Veríssimo dos Santos Neto, Nádia Félix F. da Silva, Anderson da Silva Soares

Sentiment analysis (SA) is a research area focused on obtaining contextual polarity from text. Currently, deep learning obtains outstanding results on this task. However, large amounts of annotated data are needed to train these algorithms, and obtaining such data is expensive and difficult. In low-resource scenarios, this problem is even more severe because little data is available. Transfer learning (TL) can mitigate this problem because it makes it possible to train effective models with less data. Language models are one way of applying TL in natural language processing (NLP), and they have achieved competitive results. Nevertheless, some models require many hours of training on substantial computational resources, which many people and organizations cannot afford. In this paper, we explore the models BERT (Pre-training of Deep Bidirectional Transformers for Language Understanding), MultiFiT (Efficient Multilingual Language Model Fine-tuning), ALBERT (A Lite BERT for Self-supervised Learning of Language Representations), and RoBERTa (A Robustly Optimized BERT Pretraining Approach). In all of our experiments, these models obtain better results than CNN (convolutional neural network) and LSTM (long short-term memory) models. For the MultiFiT and RoBERTa models, we propose a pretrained language model (PTLM) built on Twitter data. Using this approach, we obtained results competitive with those of models trained on formal-language datasets. The main goal is to show the impact of TL and language models by comparing their results with those of other techniques and reporting the computational costs of these approaches.
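A minimal sketch of the transfer-learning setup the abstract describes: fine-tuning a pretrained transformer for tweet sentiment classification, here with the Hugging Face transformers library. The checkpoint name, three-way label scheme, and toy data are illustrative assumptions, not the paper's actual configuration.

```python
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

class TweetDataset(Dataset):
    """Pairs raw tweets with integer sentiment labels."""
    def __init__(self, texts, labels, tokenizer, max_len=128):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=max_len, return_tensors="pt")
        self.labels = torch.tensor(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item

# "roberta-base" is an assumed checkpoint; the paper also evaluates BERT,
# ALBERT, and MultiFiT, and proposes further pretraining on Twitter data.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base",
                                                           num_labels=3)

# Toy low-resource corpus: 0 = negative, 1 = neutral, 2 = positive.
texts = ["worst service ever", "the bus arrives at 5pm", "great match today!"]
labels = [0, 1, 2]

loader = DataLoader(TweetDataset(texts, labels, tokenizer),
                    batch_size=2, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):
    for batch in loader:
        optimizer.zero_grad()
        out = model(**batch)  # HF models return the loss when labels are passed
        out.loss.backward()
        optimizer.step()
```

The point the abstract makes is that this kind of fine-tuning can reach competitive accuracy with far fewer labeled examples than training a CNN or LSTM from scratch, at the cost of the compute needed to pretrain or adapt the underlying language model.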

Updated: 2023-09-15