Frequency and spatial based multi-layer context network (FSCNet) for remote sensing scene classification,International Journal of Applied Earth Observation and Geoinformation


International Journal of Applied Earth Observation and Geoinformation (IF 7.5), Pub Date: 2024-03-21, DOI: 10.1016/j.jag.2024.103781
Wei Wang, Yujie Sun, Ji Li, Xin Wang

Remote sensing scene classification (RSSC) is an important and challenging research topic because land cover varies widely in size and spatial arrangement, and because scenes show significant interclass similarity and intraclass variability. Convolutional neural network (CNN)-based methods are widely used for RSSC and achieve strong results. However, CNNs are limited in their ability to capture long-range correlations. Transformers address this problem through the global receptive field of multi-head self-attention (MSA). Nevertheless, the vanilla transformer still needs improvement to accommodate the diversity in type and scale of objects in RS scenes. In addition, existing RSSC methods either use only the last-layer features, which is not conducive to processing multi-scale remote sensing images, or directly fuse multi-layer features, which introduces redundant or mutually exclusive information. To address these issues, this article proposes a novel RSSC framework, the frequency and spatial based multi-layer context network (FSCNet). First, to fully extract the pyramid multi-resolution features of the CNN, a cross resolution injection model (CRIM) is proposed. Second, to better understand the multilevel features, a frequency and spatial MLP (FS-MLP) is designed. Third, to aggregate contextual relations among multi-layer features, a multi-layer context align attention (MCAA) is adopted. The final classification integrates the top-layer feature with the aggregated multi-layer feature. Experimental results on three well-known RS scene classification datasets (UCM, AID, and NWPU) demonstrate the effectiveness of FSCNet, which outperforms many state-of-the-art methods.
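The abstract credits self-attention with a global receptive field. The following is a minimal single-head sketch of generic scaled dot-product self-attention in NumPy, intended only to illustrate that property; it is not the FSCNet implementation, and the token count, feature dimension, and random projection weights are illustrative assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    tokens: (num_tokens, dim) array, e.g. flattened image patches.
    Returns the attended tokens and the (num_tokens, num_tokens) weights.
    """
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


# Hypothetical sizes: a 4x4 grid of patches, each an 8-dimensional feature.
rng = np.random.default_rng(0)
num_tokens, dim = 16, 8
tokens = rng.normal(size=(num_tokens, dim))
# Random projections stand in for learned weights (assumption).
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3))

out, weights = self_attention(tokens, w_q, w_k, w_v)
```

Each row of `weights` assigns a nonzero weight to every token, so each output position depends on the whole input in a single layer. A convolution with a small kernel, by contrast, sees only a local neighborhood per layer.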


Updated: 2024-03-21
