GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models
arXiv - CS - Multimedia. Pub Date: 2024-04-10. arXiv: 2404.07206
Zewei Zhang, Huan Liu, Jun Chen, Xiangyu Xu

In this paper, we introduce GoodDrag, a novel approach that improves the stability and image quality of drag editing. Unlike existing methods, which struggle with accumulated perturbations and often produce distortions, GoodDrag introduces an Alternating Drag and Denoising (AlDD) framework that alternates between drag and denoising operations within the diffusion process, effectively improving the fidelity of the result. We also propose an information-preserving motion supervision operation that maintains the original features of the starting point, enabling precise manipulation and reducing artifacts. In addition, we contribute to the benchmarking of drag editing by introducing a new dataset, Drag100, and developing dedicated quality assessment metrics, the Dragging Accuracy Index and the Gemini Score, which leverage Large Multimodal Models. Extensive experiments demonstrate that the proposed GoodDrag compares favorably against state-of-the-art approaches both qualitatively and quantitatively. The project page is https://gooddrag.github.io.
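To make the alternating structure concrete, here is a minimal Python sketch of the kind of loop the abstract describes. It is an illustration under stated assumptions, not the authors' implementation: the function names, signatures, and default schedule below are all hypothetical.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]

def aldd_edit(
    latent,
    handle_points: List[Point],
    target_points: List[Point],
    drag_step: Callable,     # one motion-supervision update (assumed signature)
    denoise_step: Callable,  # one diffusion denoising step (assumed signature)
    drags_per_block: int = 10,
    num_blocks: int = 7,
):
    """Hypothetical sketch of an alternating drag/denoise schedule.

    Rather than applying all drag updates first and denoising at the
    end, small blocks of drag updates are interleaved with denoising
    steps, so perturbations are corrected before they can accumulate.
    All names and defaults here are assumptions, not the paper's code.
    """
    points = list(handle_points)
    for _ in range(num_blocks):
        # Drag phase: move features at the handle points toward targets
        # via gradient-based motion supervision on the latent.
        for _ in range(drags_per_block):
            latent, points = drag_step(latent, points, target_points)
        # Denoise phase: a diffusion step pulls the perturbed latent
        # back toward the model's learned image distribution.
        latent = denoise_step(latent)
    return latent
```

The design point, per the abstract, is that denoising is interleaved with dragging rather than deferred to the end, so each denoise phase corrects perturbations while they are still small.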

Updated: 2024-04-11