GradPaint: Gradient-guided inpainting with diffusion models, Computer Vision and Image Understanding


GradPaint: Gradient-guided inpainting with diffusion models
Computer Vision and Image Understanding (IF 4.5), Pub Date: 2024-01-17, DOI: 10.1016/j.cviu.2024.103928
Asya Grechka, Guillaume Couairon, Matthieu Cord

Denoising Diffusion Probabilistic Models (DDPMs) have recently achieved remarkable results in conditional and unconditional image generation. The pre-trained models can be adapted without further training to different downstream tasks, by guiding their iterative denoising process at inference time to satisfy additional constraints. For the specific task of image inpainting, the current guiding mechanism relies on copying-and-pasting the known regions from the input image at each denoising step. However, diffusion models are strongly conditioned by the initial random noise, and therefore struggle to harmonize predictions inside the inpainting mask with the real parts of the input image, often producing results with unnatural artifacts. Our method, dubbed GradPaint, steers the generation towards a globally coherent image. At each step in the denoising process, we leverage the model’s “denoised image estimation” by calculating a custom loss measuring its coherence with the masked input image. Our guiding mechanism uses the gradient obtained from backpropagating this loss through the diffusion model itself. GradPaint generalizes well to diffusion models trained on various datasets, improving upon current state-of-the-art supervised and unsupervised methods. Our code will be made available upon publication.
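The guidance idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not specify the loss, the step size, or how the gradient is applied, so everything below is an assumption: a squared error restricted to the known pixels stands in for the "custom loss", a fixed linear map `W` stands in for the pre-trained noise-prediction network (which lets the gradient through the "model" be written analytically with NumPy instead of using autograd), and the names `predict_x0`, `masked_loss`, and `guidance_step` are hypothetical.

```python
import numpy as np


def predict_x0(x_t, W, alpha_bar):
    """Denoised image estimate from a noisy sample x_t.

    Uses the standard DDPM relation
        x0_hat = (x_t - sqrt(1 - alpha_bar) * eps_hat) / sqrt(alpha_bar),
    with eps_hat = W @ x_t as a toy linear stand-in for the network (assumption).
    """
    eps_hat = W @ x_t
    return (x_t - np.sqrt(1.0 - alpha_bar) * eps_hat) / np.sqrt(alpha_bar)


def masked_loss(x0_hat, y, known):
    """Squared error between the estimate and the input image y,
    counted only where known == 1 (assumed form of the coherence loss)."""
    residual = known * (x0_hat - y)
    return float(np.sum(residual ** 2))


def guidance_step(x_t, y, known, W, alpha_bar, step_size):
    """One gradient step on x_t that lowers the masked loss.

    The gradient is taken with respect to x_t *through* the toy model:
    for the linear stand-in, d(x0_hat)/d(x_t) is the constant matrix J.
    With a real network this would be obtained by backpropagation.
    """
    x0_hat = predict_x0(x_t, W, alpha_bar)
    residual = known * (x0_hat - y)
    J = (np.eye(x_t.size) - np.sqrt(1.0 - alpha_bar) * W) / np.sqrt(alpha_bar)
    grad = 2.0 * J.T @ residual
    return x_t - step_size * grad, grad


# Tiny demo on an 8-"pixel" image; the last three pixels are the hole to inpaint.
rng = np.random.default_rng(0)
n = 8
W = 0.1 * rng.standard_normal((n, n))   # toy model weights (assumption)
y = rng.standard_normal(n)              # input image, valid where known == 1
known = np.array([1, 1, 1, 1, 1, 0, 0, 0], dtype=float)
x_t = rng.standard_normal(n)            # current noisy sample
alpha_bar = 0.5                         # cumulative noise-schedule value at step t

before = masked_loss(predict_x0(x_t, W, alpha_bar), y, known)
x_guided, _ = guidance_step(x_t, y, known, W, alpha_bar, step_size=0.05)
after = masked_loss(predict_x0(x_guided, W, alpha_bar), y, known)
print("masked loss decreased:", after < before)
```

Because the gradient flows through the model rather than being applied to the known pixels directly, the update also moves the values inside the hole (whenever `W` couples pixels), which is the intuition behind steering the whole sample toward coherence instead of only pasting known regions back in.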


Updated: 2024-01-17
