Hierarchical Patch Aggregation Transformer for Motion Deblurring, Neural Processing Letters


Neural Processing Letters (IF 3.1), Pub Date: 2024-04-04, DOI: 10.1007/s11063-024-11594-0
Yujie Wu, Lei Liang, Siyao Ling, Zhisheng Gao

Encoder-decoder frameworks built from Transformer components have become an established paradigm in architecture design for image deblurring. In this paper, we critically revisit this approach and find that many current architectures attend too narrowly to limited local regions during the feature extraction stage. Such designs reduce the richness and diversity of the features available to the encoder-decoder framework and create a bottleneck for further performance gains. To address these deficiencies, we propose a novel Hierarchical Patch Aggregation Transformer (HPAT) architecture. In the initial feature extraction stage, HPAT uses Axis-Selective Transformer Blocks, which have linear complexity, supplemented by an adaptive hierarchical attention fusion mechanism. Together, these mechanisms enable the model to capture spatial relationships between features effectively and to integrate features from different hierarchical levels. We then redesign the feedforward network of the Transformer block in the encoder-decoder structure and propose the Fused Feedforward Network. This aggregation enhances the model's ability to capture and retain fine local detail. We evaluate HPAT through extensive experiments and compare it with baseline methods on public datasets. The experimental results show that HPAT achieves state-of-the-art performance on image deblurring tasks.
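
The abstract does not say how the Axis-Selective Transformer Blocks reach their stated complexity, so the following is only a rough illustration of why restricting attention to image axes is cheaper than global attention. It counts query-key pairs for generic row-and-column (axial) attention, which is an assumption made here for illustration and not necessarily the authors' block. The function names are hypothetical.

```python
def full_attention_pairs(height: int, width: int) -> int:
    """Query-key pairs scored when every pixel attends to every pixel."""
    n = height * width
    return n * n


def axis_attention_pairs(height: int, width: int) -> int:
    """Query-key pairs scored when each pixel attends only along its own
    row and its own column (generic axial attention, assumed here)."""
    n = height * width
    # Row pass: each of the n pixels attends to `width` positions.
    # Column pass: each of the n pixels attends to `height` positions.
    return n * width + n * height


if __name__ == "__main__":
    for side in (64, 128, 256):
        full = full_attention_pairs(side, side)
        axis = axis_attention_pairs(side, side)
        print(f"{side}x{side}: full={full:,} axis={axis:,} "
              f"ratio={full / axis:.0f}x")
```

For a square feature map of side s, the ratio between the two counts is s/2, so the saving grows with resolution. This generic scheme costs HW(H+W) pairs, which is sub-quadratic in the number of pixels; whatever further restriction gives the paper's blocks linear complexity is not described in the abstract.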


Updated: 2024-04-05
