Towards general-purpose representation learning of polygonal geometries
GeoInformatica ( IF 2 ) Pub Date : 2022-10-22 , DOI: 10.1007/s10707-022-00481-2
Gengchen Mai , Chiyu Jiang , Weiwei Sun , Rui Zhu , Yao Xuan , Ling Cai , Krzysztof Janowicz , Stefano Ermon , Ni Lao

Neural network representation learning for spatial data (e.g., points, polylines, polygons, and networks) is a common need for geographic artificial intelligence (GeoAI) problems. In recent years, many advancements have been made in representation learning for points, polylines, and networks, whereas little progress has been made for polygons, especially complex polygonal geometries. In this work, we focus on developing a general-purpose polygon encoding model, which can encode a polygonal geometry (with or without holes, single or multipolygons) into an embedding space. The resulting embeddings can be leveraged directly (or finetuned) for downstream tasks such as shape classification, spatial relation prediction, building pattern classification, cartographic building generalization, and so on. To achieve model generalizability guarantees, we identify a few desirable properties that the encoder should satisfy: loop origin invariance, trivial vertex invariance, part permutation invariance, and topology awareness. We explore two different designs for the encoder: one derives all representations in the spatial domain and can naturally capture local structures of polygons; the other leverages spectral domain representations and can easily capture global structures of polygons. For the spatial domain approach, we propose ResNet1D, a 1D CNN-based polygon encoder, which uses circular padding to achieve loop origin invariance on simple polygons. For the spectral domain approach, we develop NUFTspec based on Non-Uniform Fourier Transformation (NUFT), which naturally satisfies all the desired properties. We conduct experiments on two different tasks: 1) polygon shape classification based on the commonly used MNIST dataset; 2) polygon-based spatial relation prediction based on two new datasets (DBSR-46K and DBSR-cplx46K) constructed from OpenStreetMap and DBpedia. Our results show that NUFTspec and ResNet1D outperform multiple existing baselines by significant margins.
While ResNet1D suffers from performance degradation after shape-invariant geometry modifications, NUFTspec is very robust to these modifications due to the nature of the NUFT representation. NUFTspec is able to jointly consider all parts of a multipolygon and their spatial relations during prediction, while ResNet1D can recognize the shape details that are sometimes important for classification. This result points to a promising research direction of combining spatial and spectral representations.
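
As a rough illustration of the loop origin invariance property mentioned in the abstract, the sketch below applies a single 1D convolution with circular padding to the vertex sequence of a simple polygon and then takes a global max over vertices. It is not the ResNet1D architecture from the paper: the function names (`circular_conv1d`, `encode_loop`, `rotate_origin`), the kernel weights, and the pooling choice are all assumptions made for this example. It only shows why wrapping the indices around the closed loop makes the output independent of which vertex the loop starts from.

```python
import math


def circular_conv1d(features, kernel):
    """Convolve a closed loop of per-vertex features using circular padding.

    features: list of n vertex feature vectors (here, (x, y) coordinates).
    kernel:   list of k weight vectors, one per kernel position (k odd).
    Returns a list of n scalar responses, one per vertex.
    """
    n, k = len(features), len(kernel)
    half = k // 2
    out = []
    for i in range(n):
        acc = 0.0
        for j in range(k):
            # The modulo wraps the index, so the first and last vertices
            # are treated as neighbors on the closed loop.
            vertex = features[(i + j - half) % n]
            acc += sum(w * x for w, x in zip(kernel[j], vertex))
        out.append(acc)
    return out


def encode_loop(vertices, kernel):
    """Toy encoder (hypothetical): circular conv, ReLU, global max pooling."""
    responses = circular_conv1d(vertices, kernel)
    return max(max(r, 0.0) for r in responses)


def rotate_origin(vertices, shift):
    """Return the same loop, starting from a different vertex."""
    return vertices[shift:] + vertices[:shift]


if __name__ == "__main__":
    # An L-shaped simple polygon, listed without repeating the first vertex.
    polygon = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
    # Arbitrary illustrative weights; a trained model would learn these.
    kernel = [[0.5, -1.0], [1.0, 0.25], [-0.75, 0.5]]

    base = encode_loop(polygon, kernel)
    for shift in range(len(polygon)):
        shifted = encode_loop(rotate_origin(polygon, shift), kernel)
        print(shift, math.isclose(shifted, base))
```

Changing the start vertex only shifts the per-vertex responses cyclically, and the global max pooling discards that shift. With zero padding instead of the modulo index, the responses at the two ends of the sequence would depend on where the loop is cut, so the same argument would not hold.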




Updated: 2022-10-22