Dynamic clustering transformer network for point cloud segmentation
International Journal of Applied Earth Observation and Geoinformation (IF 7.5), Pub Date: 2024-03-23, DOI: 10.1016/j.jag.2024.103791
Dening Lu, Jun Zhou, Kyle (Yilin) Gao, Jing Du, Linlin Xu, Jonathan Li

Point cloud segmentation is one of the most important tasks in LiDAR remote sensing, with widespread scientific, industrial, and commercial applications. Research on it has led to many breakthroughs in 3D object and scene understanding. Existing methods typically use hierarchical architectures for feature representation. However, the sampling and grouping methods commonly used in hierarchical networks are not only time-consuming but also limited to point-wise 3D coordinates, ignoring the local semantic homogeneity of point clusters. To address these issues, we propose a novel 3D point cloud representation network called Dynamic Clustering Transformer Network (DCTNet). It has an encoder–decoder architecture that allows for both local and global feature learning. Specifically, the encoder consists of a series of dynamic clustering-based Local Feature Aggregating (LFA) blocks and Transformer-based Global Feature Learning (GFL) blocks. In the LFA block, we propose novel semantic feature-based dynamic sampling and clustering methods, which make the model aware of local semantic homogeneity during local feature aggregation. Furthermore, instead of traditional interpolation approaches, we propose a new semantic feature-guided upsampling method in the decoder for dense prediction. To our knowledge, DCTNet is the first work to introduce semantic information-based dynamic clustering into 3D Transformers. Extensive experiments on an object-based dataset (ShapeNet) and an airborne multispectral LiDAR dataset demonstrate the state-of-the-art (SOTA) segmentation performance of DCTNet in terms of both accuracy and efficiency.
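
The abstract does not say how DCTNet selects cluster centers, measures similarity, or weights features, so the following is only a toy sketch of the general idea: grouping points by semantic features rather than by 3D coordinates, pooling features per cluster, and using feature similarity instead of spatial interpolation to upsample. Every name here (`cluster_by_features`, `aggregate`, `upsample`, `temperature`) is hypothetical. The nearest-center assignment, the mean pooling, and the softmax weighting are assumptions chosen for illustration, not the authors' method.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cluster_by_features(features, center_ids):
    """Assign each point to its nearest center in feature space (assumed rule)."""
    centers = [features[i] for i in center_ids]
    return [min(range(len(centers)), key=lambda c: sq_dist(f, centers[c]))
            for f in features]

def aggregate(features, assignment, num_clusters):
    """Mean-pool point features within each cluster (assumed aggregation)."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_clusters)]
    counts = [0] * num_clusters
    for f, c in zip(features, assignment):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += f[d]
    return [[s / max(n, 1) for s in row] for row, n in zip(sums, counts)]

def upsample(features, cluster_feats, temperature=1.0):
    """Give each dense point a softmax-weighted mix of cluster features,
    with weights from feature similarity rather than spatial distance."""
    out = []
    for f in features:
        logits = [-sq_dist(f, c) / temperature for c in cluster_feats]
        top = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(l - top) for l in logits]
        total = sum(weights)
        out.append([sum(w * c[d] for w, c in zip(weights, cluster_feats)) / total
                    for d in range(len(f))])
    return out

# Four points with 2-D "semantic" features forming two obvious groups.
features = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
assignment = cluster_by_features(features, center_ids=[0, 2])
pooled = aggregate(features, assignment, num_clusters=2)
dense = upsample(features, pooled)
```

In this toy input the first two points fall into one cluster and the last two into the other, and each upsampled point recovers essentially the pooled feature of its own cluster because the two groups are far apart in feature space.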

Updated: 2024-03-23