FST-OAM: a fast style transfer model using optimized self-attention mechanism
Signal, Image and Video Processing (IF 2.3), Pub Date: 2024-03-04, DOI: 10.1007/s11760-024-03064-w
Xiaozhi Du , Ning Jia , Hongyuan Du

Image style transfer is a prominent research topic in computer image processing. However, state-of-the-art models suffer from drawbacks such as slow transfer, distorted image structure, and loss of detail. To address these issues, this paper proposes FST-OAM, a fast style transfer model with an optimized self-attention mechanism, which consists of four modules: Transformer, image edge detection, fusion, and postprocessing. The Transformer module encodes the content and style images to extract their features and decodes them into the resulting image sequence. Within the Transformer, we present an improved self-attention mechanism that reduces computational overhead. The image edge detection module extracts the edge features of the content and style images. The outputs of the Transformer encoder and the edge information are fed into the fusion module to generate multidimensional image features. Finally, the transferred image is generated by a three-layer convolutional neural network in the postprocessing module. Content and style images from a variety of scenes were used to evaluate FST-OAM. The experimental results show that FST-OAM outperforms state-of-the-art models. Compared with StyTr\(^{2}\), ArtFlow, and SCAIST, the training time of FST-OAM is reduced by 78%, 75%, and 81%, respectively. Compared with StyTr\(^{2}\), ArtFlow, DFP, and SCAIST, the average transfer time of FST-OAM is reduced by 37%, 10%, 56%, and 88%, respectively. Against the same baselines, FST-OAM achieves the highest average PSNR, the lowest average \(L_{c}\), and a lower average Gram Loss, indicating that it best preserves the content of the content image while transferring style more faithfully. In addition, in a user-preference study, FST-OAM received more votes than the other four methods.
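The abstract gives only a module-level description, so the following PyTorch sketch is purely illustrative: the kernelized linear attention standing in for the paper's optimized self-attention, the Sobel filter used for the edge detection module, the concatenation-based fusion, and all channel widths are assumptions, not the authors' implementation.

```python
# Illustrative PyTorch sketch of the four-module FST-OAM pipeline described
# in the abstract. The linear attention, Sobel edge detector, concatenation
# fusion, and channel widths are all assumptions, not the paper's design.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearSelfAttention(nn.Module):
    """Stand-in for the optimized self-attention: kernelized linear attention,
    which cuts the cost in token count n from O(n^2 * d) to O(n * d^2)."""

    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.heads = heads
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, N, D)
        B, N, D = x.shape
        d = D // self.heads
        q, k, v = (t.view(B, N, self.heads, d).transpose(1, 2)
                   for t in self.qkv(x).chunk(3, dim=-1))  # each (B, h, N, d)
        q, k = F.elu(q) + 1, F.elu(k) + 1                  # positive feature map
        kv = torch.einsum("bhnd,bhne->bhde", k, v)         # (B, h, d, d)
        denom = torch.einsum("bhnd,bhd->bhn", q, k.sum(2)) + 1e-6
        out = torch.einsum("bhnd,bhde->bhne", q, kv) / denom.unsqueeze(-1)
        return self.proj(out.transpose(1, 2).reshape(B, N, D))


def sobel_edges(img: torch.Tensor) -> torch.Tensor:
    """Edge detection module: Sobel gradient magnitude on a grayscale copy."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=img.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)                                # Sobel in y
    gray = img.mean(dim=1, keepdim=True)                   # (B, 1, H, W)
    gx, gy = F.conv2d(gray, kx, padding=1), F.conv2d(gray, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)


class FusionPostprocess(nn.Module):
    """Fusion module (feature/edge concatenation, assumed) followed by the
    three-layer convolutional postprocessing network named in the abstract."""

    def __init__(self, feat_ch: int = 256):
        super().__init__()
        self.post = nn.Sequential(
            nn.Conv2d(feat_ch + 1, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, 3, padding=1),                # RGB output
        )

    def forward(self, feats: torch.Tensor, edges: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) decoder features; edges: (B, 1, H, W)
        return torch.sigmoid(self.post(torch.cat([feats, edges], dim=1)))
```

With 16x16 patch tokens from a 256x256 image, the linear form never materializes the N x N attention matrix, which is consistent with the abstract's stated goal of reducing the self-attention's computational overhead.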

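For reference, the \(L_{c}\) (content loss) and Gram Loss figures quoted in the comparison above are conventionally computed from pretrained-network features; the standard formulation (the paper may weight or normalize the layers differently) is

\[
L_{c} = \bigl\lVert \phi_{\ell}(I_{o}) - \phi_{\ell}(I_{c}) \bigr\rVert_{2}^{2}, \qquad
L_{\mathrm{Gram}} = \sum_{\ell} \bigl\lVert G_{\ell}(I_{o}) - G_{\ell}(I_{s}) \bigr\rVert_{F}^{2}, \qquad
G_{\ell}(I) = F_{\ell}(I)\,F_{\ell}(I)^{\top},
\]

where \(I_{c}\), \(I_{s}\), and \(I_{o}\) denote the content, style, and output images, \(\phi_{\ell}\) is the feature map of layer \(\ell\) of a pretrained network (typically VGG), and \(F_{\ell}\) is that feature map flattened to channels \(\times\) positions. A lower \(L_{c}\) indicates better content preservation; a lower Gram Loss indicates a closer match to the target style statistics.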



Updated: 2024-03-05