A review of deep learning-based information fusion techniques for multimodal medical image classification

arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2024-04-23, DOI: arxiv-2404.15022
Yihao Li, Mostafa El Habib Daho, Pierre-Henri Conze, Rachid Zeghlache, Hugo Le Boité, Ramin Tadayoni, Béatrice Cochener, Mathieu Lamard, Gwenolé Quellec

Multimodal medical imaging plays a pivotal role in clinical diagnosis and research, as it combines information from various imaging modalities to provide a more comprehensive understanding of the underlying pathology. Recently, deep learning-based multimodal fusion techniques have emerged as powerful tools for improving medical image classification. This review offers a thorough analysis of the developments in deep learning-based multimodal fusion for medical classification tasks. We explore the complementary relationships among prevalent clinical modalities and outline three main fusion schemes for multimodal classification networks: input fusion, intermediate fusion (encompassing single-level fusion, hierarchical fusion, and attention-based fusion), and output fusion. By evaluating the performance of these fusion techniques, we provide insight into the suitability of different network architectures for various multimodal fusion scenarios and application domains. Furthermore, we examine challenges related to network architecture selection, the handling of incomplete multimodal data, and the potential limitations of multimodal fusion. Finally, we highlight the promising future of Transformer-based multimodal fusion techniques and give recommendations for future research in this rapidly evolving field.
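As a rough illustration of the three fusion schemes named in the abstract, the following is a minimal sketch and not the review's own implementation. It assumes each modality is already a flat feature vector and uses fixed linear maps as stand-ins for trained networks. All function names, weights, and shapes are hypothetical, and averaging class probabilities is only one common choice for output fusion. Hierarchical and attention-based intermediate fusion are not shown.

```python
import math
from typing import List, Sequence

Vector = List[float]
Matrix = Sequence[Sequence[float]]


def softmax(logits: Sequence[float]) -> Vector:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights: Matrix, x: Sequence[float]) -> Vector:
    """Hypothetical stand-in for a trained network: y = W x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def input_fusion(mod_a: Vector, mod_b: Vector, clf_w: Matrix) -> Vector:
    """Input fusion: concatenate the raw modalities, then run one network."""
    return softmax(linear(clf_w, mod_a + mod_b))


def intermediate_fusion(mod_a: Vector, mod_b: Vector,
                        enc_a_w: Matrix, enc_b_w: Matrix,
                        head_w: Matrix) -> Vector:
    """Single-level intermediate fusion: encode each modality separately,
    concatenate the features once, then classify with a shared head."""
    features = linear(enc_a_w, mod_a) + linear(enc_b_w, mod_b)
    return softmax(linear(head_w, features))


def output_fusion(mod_a: Vector, mod_b: Vector,
                  clf_a_w: Matrix, clf_b_w: Matrix) -> Vector:
    """Output fusion: classify each modality independently, then average
    the class probabilities (an assumed combination rule)."""
    p_a = softmax(linear(clf_a_w, mod_a))
    p_b = softmax(linear(clf_b_w, mod_b))
    return [(a + b) / 2 for a, b in zip(p_a, p_b)]


# Toy inputs: modality A has 2 features, modality B has 3; two classes.
MOD_A = [0.2, 0.8]
MOD_B = [0.5, 0.1, 0.4]
CLF_JOINT_W = [[0.1, 0.4, -0.2, 0.3, 0.0], [-0.3, 0.2, 0.5, -0.1, 0.2]]
ENC_A_W = [[1.0, 0.0], [0.0, 1.0]]
ENC_B_W = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]
HEAD_W = [[0.2, -0.1, 0.3, 0.1], [-0.2, 0.4, 0.0, 0.3]]
CLF_A_W = [[0.6, -0.4], [-0.1, 0.7]]
CLF_B_W = [[0.3, 0.2, -0.5], [0.1, -0.2, 0.4]]

P_INPUT = input_fusion(MOD_A, MOD_B, CLF_JOINT_W)
P_INTERMEDIATE = intermediate_fusion(MOD_A, MOD_B, ENC_A_W, ENC_B_W, HEAD_W)
P_OUTPUT = output_fusion(MOD_A, MOD_B, CLF_A_W, CLF_B_W)
```

The three functions differ only in where the modalities meet: before any processing, after per-modality encoding, or after per-modality classification. In a real system the linear maps would be replaced by trained networks, and the choice among the schemes would depend on the modalities and the task.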


Updated: 2024-04-25
