Quality of information and appropriateness of Open AI outputs for prostate cancer
Prostate Cancer and Prostatic Diseases (IF 4.8), Pub Date: 2024-01-16, DOI: 10.1038/s41391-024-00789-0
Riccardo Lombardo, Giacomo Gallo, Jordi Stira, Beatrice Turchi, Giuseppe Santoro, Sara Riolo, Matteo Romagnoli, Antonio Cicione, Giorgia Tema, Antonio Pastore, Yazan Al Salhi, Andrea Fuschi, Giorgio Franco, Antonio Nacchia, Andrea Tubaro, Cosimo De Nunzio

Chat-GPT, a natural language processing (NLP) tool created by OpenAI, could serve as a quick source of information on prostate cancer. This study analyzes the quality and appropriateness of Chat-GPT's responses to prostate cancer questions, judged against the European Association of Urology (EAU) 2023 prostate cancer guidelines. A total of 195 questions were prepared from the recommendations in the prostate cancer section of the EAU 2023 guideline. All questions were systematically presented to Chat-GPT's August 3 Version, and two expert urologists independently scored each response from 1 to 4 (1: completely correct, 2: correct but inadequate, 3: a mix of correct and misleading information, 4: completely incorrect). Sub-analyses were performed per chapter and per grade of recommendation. Of the 195 responses evaluated, 50/195 (26%) were completely correct, 51/195 (26%) correct but inadequate, 47/195 (24%) a mix of correct and misleading information, and 47/195 (24%) completely incorrect. By chapter, Chat-GPT was most accurate on questions about follow-up and quality of life (QoL). The worst performance was recorded for the diagnosis and treatment chapters, with 19% and 30% of the answers completely incorrect, respectively. By strength of recommendation, accuracy did not differ between weak and strong recommendations (p > 0.05). Chat-GPT has poor accuracy when answering questions on the EAU prostate cancer (PCa) guideline recommendations. Future studies should assess its performance after adequate training.
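
The overall score distribution reported above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis code: the counts are taken from the abstract, while the names `counts` and `score_distribution` are hypothetical and introduced only for illustration.

```python
# Counts per score category as reported in the abstract
# (score 1 = completely correct ... score 4 = completely incorrect).
counts = {
    "completely correct": 50,
    "correct but inadequate": 51,
    "mix of correct and misleading": 47,
    "completely incorrect": 47,
}


def score_distribution(counts):
    """Return the total number of responses and each category's rounded percentage."""
    total = sum(counts.values())
    percentages = {label: round(100 * n / total) for label, n in counts.items()}
    return total, percentages


total, percentages = score_distribution(counts)
for label, n in counts.items():
    print(f"{label}: {n}/{total} ({percentages[label]}%)")

# Derived figure (not stated in the abstract): responses scored 1 or 2.
at_least_correct = counts["completely correct"] + counts["correct but inadequate"]
print(f"scored 1 or 2: {at_least_correct}/{total}")
```

The four categories sum to 195, and the rounded percentages match the 26%, 26%, 24% and 24% given in the abstract. The per-chapter and weak-versus-strong comparisons cannot be reproduced this way, because the abstract does not report the underlying counts.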




Updated: 2024-01-17