Detection of GPT-4 Generated Text in Higher Education: Combining Academic Judgement and Software to Identify Generative AI Tool Misuse
Journal of Academic Ethics, Pub Date: 2023-10-31, DOI: 10.1007/s10805-023-09492-6
Mike Perkins, Jasper Roe, Darius Postma, James McGaughran, Don Hickerson

This study explores the capability of academic staff, assisted by the Turnitin Artificial Intelligence (AI) detection tool, to identify the use of AI-generated content in university assessments. Twenty-two experimental submissions were produced using OpenAI's ChatGPT tool, with prompting techniques applied to reduce the likelihood of AI detectors identifying the AI-generated content. These submissions were marked by 15 academic staff members alongside genuine student submissions. Although the AI detection tool flagged 91% of the experimental submissions as containing AI-generated content, only 54.8% of the content was identified as AI-generated, underscoring the challenge of detecting AI content when advanced prompting techniques are used. When academic staff members marked the experimental submissions, only 54.5% were reported to the academic misconduct process, emphasising the need for greater awareness of how the results of AI detectors should be interpreted. Student submissions and AI-generated content received similar grades (AI mean grade: 52.3; student mean grade: 54.4), demonstrating the capability of AI tools to produce human-like responses in real-life assessment situations. Recommendations include adjusting overall strategies for assessing university students in light of the availability of new Generative AI tools, for example by reducing reliance on assessments in which AI tools can be used to mimic human writing, or by adopting AI-inclusive assessments. Comprehensive training must be provided for both academic staff and students so that academic integrity is preserved.



Updated: 2023-11-04