MM-Net: A MixFormer-Based Multi-Scale Network for Anatomical and Functional Image Fusion
IEEE Transactions on Image Processing (IF 10.6), Pub Date: 2024-03-18, DOI: 10.1109/tip.2024.3374072
Yu Liu, Chen Yu, Juan Cheng, Z. Jane Wang, Xun Chen

Anatomical and functional image fusion is an important technique in a variety of medical and biological applications. Recently, deep learning (DL)-based methods have become a mainstream direction in the field of multi-modal image fusion. However, existing DL-based fusion approaches have difficulty capturing local features and global contextual information simultaneously. In addition, the scale diversity of features, a crucial issue in image fusion, receives inadequate attention in most existing works. In this paper, to address the above problems, we propose a MixFormer-based multi-scale network, termed MM-Net, for anatomical and functional image fusion. In our method, an improved MixFormer-based backbone is introduced to extract both local features and global contextual information at multiple scales from the source images. The features from different source images are fused at multiple scales by a multi-source spatial attention-based cross-modality feature fusion (CMFF) module. The scale diversity of the fused features is further enriched by a series of multi-scale feature interaction (MSFI) modules and feature aggregation upsample (FAU) modules. Moreover, a loss function consisting of both spatial-domain and frequency-domain components is devised to train the proposed fusion model. Experimental results demonstrate that our method outperforms several state-of-the-art fusion methods in both qualitative and quantitative comparisons, and the proposed fusion model exhibits good generalization capability. The source code of our fusion method will be available at https://github.com/yuliu316316.
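The abstract states that the training loss combines a spatial-domain component with a frequency-domain component, but does not give the exact formulation. The following is a minimal NumPy sketch of one common way to realize such a composite loss, assuming L1 terms in both domains, a 2-D FFT for the frequency component, and a hypothetical weighting parameter `alpha`; the actual MM-Net loss may differ.

```python
import numpy as np

def spatial_loss(fused, ref):
    """Mean absolute difference between images in the spatial domain."""
    return np.mean(np.abs(fused - ref))

def frequency_loss(fused, ref):
    """Mean absolute difference between FFT magnitude spectra."""
    mag_fused = np.abs(np.fft.fft2(fused))
    mag_ref = np.abs(np.fft.fft2(ref))
    return np.mean(np.abs(mag_fused - mag_ref))

def fusion_loss(fused, ref, alpha=0.5):
    """Composite loss: spatial term plus weighted frequency term.

    `alpha` is a hypothetical balance weight, not taken from the paper.
    """
    return spatial_loss(fused, ref) + alpha * frequency_loss(fused, ref)
```

In practice such a loss would be written with a deep-learning framework's differentiable FFT so gradients flow back to the fusion network; the NumPy version above only illustrates the structure of the two terms.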

Updated: 2024-03-18