Fake or JPEG? Revealing Common Biases in Generated Image Detection Datasets
arXiv - CS - Artificial Intelligence, Pub Date: 2024-03-26, DOI: arxiv-2403.17608
Patrick Grommelt, Louis Weiss, Franz-Josef Pfreundt, Janis Keuper

The widespread adoption of generative image models has highlighted the urgent need to detect artificial content, which is a crucial step in combating widespread manipulation and misinformation. Consequently, numerous detectors and associated datasets have emerged. However, many of these datasets inadvertently introduce undesirable biases, thereby impacting the effectiveness and evaluation of detectors. In this paper, we emphasize that many datasets for AI-generated image detection contain biases related to JPEG compression and image size. Using the GenImage dataset, we demonstrate that detectors indeed learn from these undesired factors. Furthermore, we show that removing these biases substantially increases robustness to JPEG compression and significantly alters the cross-generator performance of the evaluated detectors. Specifically, it leads to an increase of more than 11 percentage points in cross-generator performance for ResNet50 and Swin-T detectors on the GenImage dataset, achieving state-of-the-art results. We provide the dataset and source code of this paper on the anonymous website: https://www.unbiased-genimage.org
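
To make the kind of bias described above concrete, the following is a minimal sketch of a file-format audit, not the authors' method. It counts how many samples per class are JPEG-encoded by inspecting file signatures (JPEG files begin with the bytes FF D8 FF, PNG files with an 8-byte PNG signature). The function names, the class labels, and the toy byte strings are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Well-known file signatures ("magic bytes").
JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def sniff_format(data: bytes) -> str:
    """Classify raw file bytes as 'jpeg', 'png', or 'other' by signature."""
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    if data.startswith(PNG_MAGIC):
        return "png"
    return "other"


def format_counts(samples):
    """Count file formats per class label.

    samples: iterable of (label, raw_bytes) pairs (hypothetical layout).
    Returns a dict mapping each label to a dict of format counts.
    """
    counts = defaultdict(Counter)
    for label, data in samples:
        counts[label][sniff_format(data)] += 1
    return {label: dict(c) for label, c in counts.items()}


def jpeg_fraction(counts, label) -> float:
    """Fraction of samples with the given label that are JPEG-encoded."""
    per_label = counts.get(label, {})
    total = sum(per_label.values())
    return per_label.get("jpeg", 0) / total if total else 0.0


if __name__ == "__main__":
    # Toy stand-ins for file contents; a real audit would read files from disk.
    toy = [
        ("real", JPEG_MAGIC + b"\xe0..."),
        ("real", JPEG_MAGIC + b"\xe1..."),
        ("generated", PNG_MAGIC + b"..."),
        ("generated", JPEG_MAGIC + b"\xdb..."),
    ]
    counts = format_counts(toy)
    for label in counts:
        print(label, counts[label], jpeg_fraction(counts, label))
```

A large gap between the per-class JPEG fractions would suggest that a detector could separate the classes by compression artifacts alone rather than by generated content. The same per-class comparison could be applied to image dimensions.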

Updated: 2024-03-28