Road Extraction From Remote Sensing Images via Channel Attention and Multilayer Axial Transformer
IEEE Geoscience and Remote Sensing Letters (IF 4.8), Pub Date: 2024-03-20, DOI: 10.1109/lgrs.2024.3379502
Qingliang Meng, Daoxiang Zhou, Xiaokai Zhang, Zhigang Yang, Zehua Chen
Remote sensing images contain many objects that resemble road structures, making it difficult to distinguish roads from the background. Moreover, road extraction is affected by many factors, such as lighting conditions, noise, and occlusions, resulting in incomplete and discontinuous extraction results. Learning discriminative road features from remote sensing images is therefore a highly challenging task. In this letter, a novel road extraction model is proposed for remote sensing images under a U-Net-like encoder-decoder architecture. An axial Transformer module (ATM) is designed to learn global road features in the deepest layer, with computational complexity linear in image size. A multilayer attention fusion module (MLAF) is also presented to fuse multiple layers of Transformer features, obtaining more comprehensive and richer semantic information. In the skip connections, a channel attention module (CAM) is designed to weigh the feature maps along the channel dimension, improving the capability of feature representation. Extensive experiments are conducted on the DeepGlobe and Massachusetts road datasets. Compared with other methods, the proposed method achieves road extraction from remote sensing images with higher accuracy and lower computational cost, e.g., an intersection over union (IoU) of 81.71% (a 1.02% improvement) and a 22.38% reduction in convergence time over the recent TransRoadNet on the Massachusetts road dataset. Ablation experiments also demonstrate the effectiveness of the designed modules.
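The linear-complexity claim for the ATM follows from factorizing 2-D self-attention along the two spatial axes: full attention over an H×W map costs O((HW)²·C), while attending along rows and then columns costs O(HW·(H+W)·C). The sketch below illustrates this axial factorization only; it is a minimal single-head NumPy example with no learned Q/K/V projections or positional terms, and is not the letter's actual ATM implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention(x):
    """Self-attention applied along one spatial axis at a time.

    x: feature map of shape (H, W, C). Attending along the height axis
    and then the width axis replaces one (H*W)x(H*W) attention matrix
    with H-by-H and W-by-W matrices, giving cost linear in the number
    of pixels. Q = K = V = x here (no projections), for illustration.
    """
    H, W, C = x.shape
    # Height axis: within each column, pixels attend along H.
    xh = x.transpose(1, 0, 2)                              # (W, H, C)
    ah = softmax(xh @ xh.transpose(0, 2, 1) / np.sqrt(C))  # (W, H, H)
    xh = ah @ xh                                           # (W, H, C)
    x = xh.transpose(1, 0, 2)                              # (H, W, C)
    # Width axis: within each row, pixels attend along W.
    aw = softmax(x @ x.transpose(0, 2, 1) / np.sqrt(C))    # (H, W, W)
    return aw @ x                                          # (H, W, C)

feat = np.random.rand(6, 5, 4)
out = axial_attention(feat)
```

In the full model, such a module would carry learned projection weights per head and sit in the deepest encoder layer, where H and W are smallest.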

Updated: 2024-03-20