Arbitrary 3D stylization of radiance fields
Image and Vision Computing (IF 4.7), Pub Date: 2024-03-06, DOI: 10.1016/j.imavis.2024.104971
Sijia Zhang, Ting Liu, Zhuoyuan Li, Yi Sun

3D stylization, which creates stylized multi-view images, is quite challenging: it requires not only generating images that align with the desired style but also maintaining consistency across different viewpoints. Most previous image style transfer methods operate in the 2D image domain and stylize each view independently, and therefore suffer from multi-view inconsistency. To tackle this problem, we build on neural radiance fields (NeRF) to stylize each 3D scene, since NeRF inherently ensures consistency across viewpoints and comprises separate geometry and appearance sub-networks, so that appearance stylization cannot alter the geometry. To enable arbitrary style transfer with more explicit and precise style adjustment, we introduce the CLIP model, which allows style transfer guided by either a text prompt or an arbitrary style image. We employ an ensemble of loss functions: a CLIP loss enforces similarity between the shared latent embeddings and the generated stylized images, while a mask loss constrains the 3D geometry to avoid non-smooth NeRF surfaces. Experimental results demonstrate the effectiveness of our arbitrary 3D stylization and its generalization across diverse datasets. The proposed method outperforms most image-based and text-based 3D stylization models in style transfer quality, producing visually pleasing images.
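To make the CLIP-guided objective concrete, the sketch below shows one plausible form of a CLIP-based style loss applied to rendered NeRF views, supporting either a text prompt or a style image as the target. It is only an illustration under stated assumptions (the OpenAI CLIP package with a ViT-B/32 backbone and PyTorch); the function name, inputs, and exact formulation are hypothetical and may differ from the paper's implementation, and the mask loss on geometry is not shown.

```python
# Hypothetical sketch of a CLIP-based style loss for rendered NeRF views.
# Assumes the OpenAI CLIP package (github.com/openai/CLIP) and PyTorch;
# the loss actually used in the paper may be formulated differently.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model.eval()

def clip_style_loss(rendered_views, style_image=None, style_text=None):
    """Penalize dissimilarity between rendered views and the style target in CLIP space.

    rendered_views: (B, 3, 224, 224) tensor, already resized/normalized for CLIP.
    style_image:    optional (1, 3, 224, 224) tensor for image-guided stylization.
    style_text:     optional string prompt for text-guided stylization.
    """
    view_emb = clip_model.encode_image(rendered_views)
    view_emb = view_emb / view_emb.norm(dim=-1, keepdim=True)

    if style_text is not None:
        tokens = clip.tokenize([style_text]).to(rendered_views.device)
        target_emb = clip_model.encode_text(tokens)
    else:
        target_emb = clip_model.encode_image(style_image)
    target_emb = target_emb / target_emb.norm(dim=-1, keepdim=True)

    # 1 - cosine similarity, averaged over the batch of rendered views.
    return (1.0 - (view_emb * target_emb).sum(dim=-1)).mean()
```

In practice such a term would be combined with the geometry-preserving (mask) loss and minimized with respect to the appearance sub-network only, so that stylization does not perturb the learned scene geometry.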
