Foreground and background separated image style transfer with a single text condition
Image and Vision Computing (IF 4.7) | Pub Date: 2024-02-21 | DOI: 10.1016/j.imavis.2024.104956
Yue Yu, Jianming Wang, Nengli Li

Traditional image-based style transfer requires an additional reference style image, which makes it less user-friendly. Text-based methods are more convenient but suffer from slow generation, unclear content, and poor output quality. In this work, we propose a new style transfer method, SA2-CS (Semantic-Aware and Salient Attention CLIPStyler), built on the Contrastive Language-Image Pretraining (CLIP) model and a salient object detection network. Masks obtained from the salient object detection network guide the style transfer process, and different optimization strategies are applied according to the masks. Extensive experiments with diverse content images and style text descriptions demonstrate our method's advantages: the network is easy to train and converges rapidly, and it produces stable generation results that surpass other methods. Our approach mitigates over-stylization in the foreground, enhances foreground-background contrast, and enables precise control over style transfer in different semantic regions.
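The abstract describes the core mechanism: a saliency mask splits the image into foreground and background, and the two regions are optimized with different strategies against a single CLIP text condition. The sketch below illustrates that idea in PyTorch. It is not the authors' implementation: the StyleNet architecture, the loss weights (w_fg, w_bg, w_content), the residual design, and the cosine-distance CLIP loss are illustrative assumptions, and the saliency mask is taken as a given input (in the paper it comes from a salient object detection network).

```python
# Hypothetical sketch of mask-guided, text-driven stylization in the spirit of
# SA2-CS. Requires: torch and OpenAI's CLIP package
# (pip install git+https://github.com/openai/CLIP.git).
import torch
import torch.nn as nn
import torch.nn.functional as F
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model = clip_model.float()
for p in clip_model.parameters():      # CLIP stays frozen; only StyleNet trains
    p.requires_grad_(False)

CLIP_MEAN = torch.tensor([0.48145466, 0.4578275, 0.40821073], device=device).view(1, 3, 1, 1)
CLIP_STD = torch.tensor([0.26862954, 0.26130258, 0.27577711], device=device).view(1, 3, 1, 1)

def text_feat(prompt: str) -> torch.Tensor:
    tokens = clip.tokenize([prompt]).to(device)
    with torch.no_grad():
        f = clip_model.encode_text(tokens).float()
    return F.normalize(f, dim=-1)

def image_feat(img: torch.Tensor) -> torch.Tensor:
    # Resize to CLIP's 224x224 input and apply CLIP's normalization.
    x = F.interpolate(img, size=(224, 224), mode="bilinear", align_corners=False)
    x = (x - CLIP_MEAN) / CLIP_STD
    return F.normalize(clip_model.encode_image(x).float(), dim=-1)

class StyleNet(nn.Module):
    """Toy residual stylization network (a stand-in for the trainable transfer net)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )
    def forward(self, x):
        return torch.sigmoid(x + self.body(x))  # residual path preserves content

def stylize(content, mask, style_prompt, steps=200, w_fg=0.5, w_bg=1.0, w_content=5.0):
    """content: (1,3,H,W) in [0,1] on `device`; mask: (1,1,H,W) saliency map in
    [0,1], e.g. produced by a salient object detection network such as U2-Net."""
    net = StyleNet().to(device)
    opt = torch.optim.Adam(net.parameters(), lr=5e-4)
    t_style = text_feat(style_prompt)  # the single text condition

    for _ in range(steps):
        out = net(content)
        # Composite so only one region is stylized at a time, then score each
        # composite against the same text condition in CLIP space.
        fg_only = out * mask + content * (1 - mask)
        bg_only = out * (1 - mask) + content * mask
        loss_fg = 1 - (image_feat(fg_only) @ t_style.T).mean()
        loss_bg = 1 - (image_feat(bg_only) @ t_style.T).mean()
        loss_content = F.mse_loss(out, content)  # keep structure (weight assumed)
        # Down-weighting the foreground term is meant to damp over-stylization
        # of the salient object; the weights here are illustrative guesses.
        loss = w_fg * loss_fg + w_bg * loss_bg + w_content * loss_content
        opt.zero_grad(); loss.backward(); opt.step()

    return net(content).detach()
```

A lower foreground weight reflects the paper's stated goal of curbing foreground over-stylization, but the actual per-region strategies in SA2-CS are presumably more elaborate than this uniform weighting.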
