A Comparison of ChatGPT and Fine-Tuned Open Pre-Trained Transformers (OPT) Against Widely Used Sentiment Analysis Tools: Sentiment Analysis of COVID-19 Survey Data
JMIR Mental Health (IF 5.2) Pub Date: 2024-01-25, DOI: 10.2196/50150
Juan Antonio Lossio-Ventura, Rachel Weger, Angela Y Lee, Emily P Guinee, Joyce Chung, Lauren Atlas, Eleni Linos, Francisco Pereira

Background: Health care providers and health-related researchers face significant challenges when applying sentiment analysis tools to health-related free-text survey data. Most state-of-the-art applications were developed in domains such as social media, and their performance in the health care context remains relatively unknown. Moreover, existing studies indicate that these tools often lack accuracy and produce inconsistent results.

Objective: This study aims to address the lack of comparative analysis of sentiment analysis tools applied to health-related free-text survey data in the context of COVID-19. The objective was to automatically predict sentence-level sentiment for 2 independent COVID-19 survey data sets from the National Institutes of Health and Stanford University.

Methods: Gold-standard labels were created for a subset of each data set using a panel of human raters. We compared 8 state-of-the-art sentiment analysis tools on both data sets to evaluate variability and disagreement across tools. In addition, few-shot learning was explored by fine-tuning Open Pre-Trained Transformers (OPT; a large language model [LLM] with publicly available weights) on a small annotated subset, and zero-shot learning was explored using ChatGPT (an LLM without publicly available weights).

Results: The comparison of sentiment analysis tools revealed high variability and disagreement across the evaluated tools when applied to health-related survey data. OPT and ChatGPT demonstrated superior performance, outperforming all other sentiment analysis tools. Moreover, ChatGPT outperformed OPT, with 6% higher accuracy and an F-measure higher by 4% to 7%.

Conclusions: This study demonstrates the effectiveness of LLMs, particularly few-shot and zero-shot learning approaches, in the sentiment analysis of health-related survey data. These results have implications for saving human labor and improving efficiency in sentiment analysis tasks, contributing to advancements in the field of automated sentiment analysis.
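The abstract does not include the authors' prompts or code; the sketch below illustrates, under stated assumptions, what the zero-shot arm described in the Methods could look like in practice. The openai v1 Python client, the gpt-3.5-turbo model name, the three-way positive/neutral/negative label set, and the prompt wording are all illustrative choices, not details taken from the paper.

```python
# Hypothetical zero-shot sentiment labeling of survey sentences with an LLM.
# Assumed (not from the paper): openai>=1.0 client, gpt-3.5-turbo, and a
# three-way positive/neutral/negative label scheme.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

LABELS = {"positive", "neutral", "negative"}

def classify_sentiment(sentence: str) -> str:
    """Ask the model for a single sentiment label for one survey sentence."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0,  # keep outputs as stable as possible for evaluation
        messages=[
            {"role": "system",
             "content": "You label the sentiment of free-text health survey responses."},
            {"role": "user",
             "content": "Classify the sentiment of this sentence as positive, "
                        "neutral, or negative. Reply with one word.\n\n"
                        f"Sentence: {sentence}"},
        ],
    )
    label = response.choices[0].message.content.strip().lower()
    return label if label in LABELS else "neutral"  # fall back on unexpected output

if __name__ == "__main__":
    print(classify_sentiment("I have felt far more isolated since the pandemic began."))
```

The few-shot OPT arm would instead fine-tune a model with open weights on the small human-annotated subset (for example, via Hugging Face Transformers' OPTForSequenceClassification) before labeling the remaining sentences; the specific OPT checkpoint and training setup used in the paper are not shown here.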

Updated: 2024-01-25