Fighting reviewer fatigue or amplifying bias? Considerations and recommendations for use of ChatGPT and other large language models in scholarly peer review
Research Integrity and Peer Review. Pub Date: 2023-05-18. DOI: 10.1186/s41073-023-00133-5
Mohammad Hosseini, Serge P J M Horbach

Background

The emergence of systems based on large language models (LLMs) such as OpenAI’s ChatGPT has prompted a range of discussions in scholarly circles. Since LLMs generate grammatically correct and mostly relevant (yet sometimes outright wrong, irrelevant or biased) outputs in response to provided prompts, using them in various writing tasks, including writing peer review reports, could result in improved productivity. Given the significance of peer reviews in the existing scholarly publication landscape, exploring the challenges and opportunities of using LLMs in peer review seems urgent. Now that the first scholarly outputs have been generated with LLMs, we anticipate that peer review reports will also be generated with the help of these systems. However, there are currently no guidelines on how these systems should be used in review tasks.

Methods

To investigate the potential impact of using LLMs on the peer review process, we used five core themes within discussions about peer review suggested by Tennant and Ross-Hellauer. These are 1) reviewers’ role, 2) editors’ role, 3) functions and quality of peer reviews, 4) reproducibility, and 5) the social and epistemic functions of peer reviews. We provide a small-scale exploration of ChatGPT’s performance regarding the identified issues.

Results

LLMs have the potential to substantially alter the roles of both peer reviewers and editors. By supporting both actors in efficiently writing constructive reports or decision letters, LLMs can facilitate higher-quality review and address issues of review shortage. However, the fundamental opacity of LLMs’ training data, inner workings, data handling, and development processes raises concerns about potential biases, confidentiality and the reproducibility of review reports. Additionally, as editorial work has a prominent function in defining and shaping epistemic communities, as well as negotiating normative frameworks within such communities, partly outsourcing this work to LLMs might have unforeseen consequences for social and epistemic relations within academia. Regarding performance, we identified major enhancements in a short period and expect LLMs to continue developing.

Conclusions

We believe that LLMs are likely to have a profound impact on academia and scholarly communication. While potentially beneficial to the scholarly communication system, many uncertainties remain and their use is not without risks. In particular, concerns about the amplification of existing biases and inequalities in access to appropriate infrastructure warrant further attention. For the moment, we recommend that if LLMs are used to write scholarly reviews and decision letters, reviewers and editors should disclose their use and accept full responsibility for data security and confidentiality, and their reports’ accuracy, tone, reasoning and originality.




Updated: 2023-05-18