Comparing ChatGPT’s ability to rate the degree of stereotypes and the consistency of stereotype attribution with those of medical students in New Zealand in developing a similarity rating test: a methodological study.
Journal of Educational Evaluation for Health Professions, Pub Date: 2023-06-12, DOI: 10.3352/jeehp.2023.20.17
Chao-Cheng Lin, Zaine Akuhata-Huntington, Che-Wei Hsu

Learning about one’s implicit bias is crucial for improving one’s cultural competency and thereby reducing health inequity. To evaluate bias among medical students following a previously developed cultural training program targeting New Zealand Māori, we developed a text-based self-evaluation tool called the Similarity Rating Test (SRT). Developing the SRT was resource-intensive, which limited its generalizability and applicability. Here, we explored the potential of ChatGPT, an automated chatbot, to assist in developing the SRT by comparing ChatGPT’s and students’ evaluations of the SRT. Although the results showed neither significant equivalence nor a significant difference between ChatGPT’s and students’ ratings, ChatGPT’s ratings were more consistent than students’ ratings. The consistency rate was higher for non-stereotypical than for stereotypical statements, regardless of rater type. Further studies are warranted to validate ChatGPT’s potential for assisting in SRT development for implementation in medical education and in the evaluation of ethnic stereotypes and related topics.

