Towards general-purpose representation learning of polygonal geometries

Mai, Gengchen; Jiang, Chiyu; Sun, Weiwei; Zhu, Rui; Xuan, Yao; Cai, Ling; Janowicz, Krzysztof; Ermon, Stefano; Lao, Ni

doi:10.1007/s10707-022-00481-2

Published: 22 October 2022

GeoInformatica, Volume 27, pages 289–340 (2023)

Gengchen Mai^1,2,3,4 (ORCID: orcid.org/0000-0002-7818-7309),
Chiyu Jiang^5,
Weiwei Sun^6,
Rui Zhu^3,4,7,
Yao Xuan^8,
Ling Cai^3,4,
Krzysztof Janowicz^3,4,9,
Stefano Ermon^2,10 &
Ni Lao^11

Abstract

Neural network representation learning for spatial data (e.g., points, polylines, polygons, and networks) is a common need for geographic artificial intelligence (GeoAI) problems. In recent years, many advancements have been made in representation learning for points, polylines, and networks, whereas little progress has been made for polygons, especially complex polygonal geometries. In this work, we focus on developing a general-purpose polygon encoding model, which can encode a polygonal geometry (with or without holes, single or multipolygons) into an embedding space. The resulting embeddings can be leveraged directly (or finetuned) for downstream tasks such as shape classification, spatial relation prediction, building pattern classification, cartographic building generalization, and so on. To achieve model generalizability guarantees, we identify a few desirable properties that the encoder should satisfy: loop origin invariance, trivial vertex invariance, part permutation invariance, and topology awareness. We explore two different designs for the encoder: one derives all representations in the spatial domain and can naturally capture local structures of polygons; the other leverages spectral domain representations and can easily capture global structures of polygons. For the spatial domain approach we propose ResNet1D, a 1D CNN-based polygon encoder, which uses circular padding to achieve loop origin invariance on simple polygons. For the spectral domain approach we develop NUFTspec based on Non-Uniform Fourier Transformation (NUFT), which naturally satisfies all the desired properties. We conduct experiments on two different tasks: 1) polygon shape classification based on the commonly used MNIST dataset; 2) polygon-based spatial relation prediction based on two new datasets (DBSR-46K and DBSR-cplx46K) constructed from OpenStreetMap and DBpedia. Our results show that NUFTspec and ResNet1D outperform multiple existing baselines by significant margins. While ResNet1D suffers from performance degradation after shape-invariant geometry modifications, NUFTspec is very robust to these modifications due to the nature of the NUFT representation. NUFTspec is able to jointly consider all parts of a multipolygon and their spatial relations during prediction, while ResNet1D can recognize shape details that are sometimes important for classification. This result points to a promising research direction of combining spatial and spectral representations.
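To make the loop origin invariance property concrete, the following is a minimal sketch, not the authors' ResNet1D. It assumes a single coordinate channel, one hand-picked kernel, and global max pooling as the readout; the function names (`circular_conv1d`, `zero_padded_conv1d`, `encode_ring`, `roll`) are hypothetical. It illustrates why wrap-around (circular) padding lets a 1D convolution produce the same pooled feature no matter which vertex of a closed ring is listed first, whereas zero padding generally does not.

```python
from typing import Callable, List, Sequence


def circular_conv1d(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """1D cross-correlation with circular (wrap-around) padding.

    The output has the same length as the input, and index arithmetic is
    taken modulo len(signal), so the ring has no distinguished start.
    """
    n, k = len(signal), len(kernel)
    half = k // 2
    return [
        sum(kernel[j] * signal[(i + j - half) % n] for j in range(k))
        for i in range(n)
    ]


def zero_padded_conv1d(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Same operation, but positions outside the sequence count as zero."""
    n, k = len(signal), len(kernel)
    half = k // 2
    out = []
    for i in range(n):
        acc = 0.0
        for j in range(k):
            idx = i + j - half
            if 0 <= idx < n:
                acc += kernel[j] * signal[idx]
        out.append(acc)
    return out


def encode_ring(
    signal: Sequence[float],
    kernel: Sequence[float],
    conv: Callable[[Sequence[float], Sequence[float]], List[float]] = circular_conv1d,
) -> float:
    """Toy encoder (assumption): one convolution followed by global max pooling."""
    return max(conv(signal, kernel))


def roll(signal: Sequence[float], shift: int) -> List[float]:
    """Start the ring at a different vertex (cyclic shift)."""
    shift %= len(signal)
    return list(signal[shift:]) + list(signal[:shift])


if __name__ == "__main__":
    xs = [0.0, 2.0, 2.0, 1.0, 0.0]   # x-coordinates of a made-up ring
    kernel = [1.0, -2.0, 1.0]        # hand-picked second-difference filter
    for s in range(len(xs)):
        print(s, encode_ring(roll(xs, s), kernel),
              encode_ring(roll(xs, s), kernel, zero_padded_conv1d))
```

With circular padding, a cyclic shift of the input only cyclically shifts the convolution output, so any permutation-insensitive pooling (max, mean, sum) yields the same value. This argument covers a single closed ring only; it says nothing about the other properties listed in the abstract.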

Data availability

DBpedia uses the GNU General Public License and OpenStreetMap uses the Open Database License. Both are open datasets for academic use, and neither contains personally identifiable information. Experimental data and the methods developed will be openly shared for reproducibility and replicability at https://github.com/gengchenmai/polygon_encoder.

Notes

The answer to this brain teaser question should be 0 because Canada and the US are adjacent to each other. However, since Google uses geometric central points as the spatial representations of geographic entities, Google QA returns 2260 km as the distance between them.
A simple polygon is a polygon that does not intersect itself and has no holes.
In GIScience, a sliver polygon is a technical term for a small unwanted polygon resulting from polygon intersection or difference.
We use \(Enc(g_{i})\) to represent \(Enc_{\mathcal {G},\theta }(g_{i})\) in the following.
https://mapster.me/right-hand-rule-geojson-fixer/
https://en.m.wikipedia.org/wiki/Simplex#Volume
We only compute the data variance for the real value part for each NUFT complex feature.
https://github.com/SPINlab/geometry-learning
https://wiki.openstreetmap.org/wiki/Overpass_API
https://github.com/maxjiang93/DDSL
https://shapely.readthedocs.io/en/stable/manual.html#binary-predicates
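The first note above contrasts a centroid-based distance with the distance between the polygons themselves. The sketch below reproduces that contrast on two made-up unit squares that share an edge. It is an illustration only: it assumes planar coordinates (not geodesic distances), measures the distance between polygon boundaries without handling the case of one polygon lying strictly inside the other, and all function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def area_centroid(ring: List[Point]) -> Point:
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    a = cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        (x0, y0), (x1, y1) = ring[i], ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        a += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    a *= 0.5
    return cx / (6.0 * a), cy / (6.0 * a)


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from point p to the closed segment ab."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def _orient(a: Point, b: Point, c: Point) -> float:
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segment_distance(a: Point, b: Point, c: Point, d: Point) -> float:
    """Distance between segments ab and cd (0 if they cross or touch)."""
    properly_cross = (
        _orient(a, b, c) * _orient(a, b, d) < 0
        and _orient(c, d, a) * _orient(c, d, b) < 0
    )
    if properly_cross:
        return 0.0
    # Otherwise the minimum is attained at an endpoint of one segment.
    return min(
        point_segment_distance(a, c, d),
        point_segment_distance(b, c, d),
        point_segment_distance(c, a, b),
        point_segment_distance(d, a, b),
    )


def boundary_distance(ring_a: List[Point], ring_b: List[Point]) -> float:
    """Minimum distance between the boundaries of two polygons."""
    na, nb = len(ring_a), len(ring_b)
    return min(
        segment_distance(ring_a[i], ring_a[(i + 1) % na],
                         ring_b[j], ring_b[(j + 1) % nb])
        for i in range(na)
        for j in range(nb)
    )


def centroid_distance(ring_a: List[Point], ring_b: List[Point]) -> float:
    (ax, ay), (bx, by) = area_centroid(ring_a), area_centroid(ring_b)
    return math.hypot(bx - ax, by - ay)


# Two made-up unit squares sharing the edge x = 1.
LEFT = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
RIGHT = [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)]
```

For `LEFT` and `RIGHT`, `boundary_distance` is zero because the squares touch, while `centroid_distance` is the full distance between their centers. This is the same kind of gap the note describes for two adjacent countries.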

Acknowledgements

This work is mainly funded by the National Science Foundation under Grant No. 2033521 A1 – KnowWhereGraph: Enriching and Linking Cross-Domain Knowledge Graphs using Spatially-Explicit AI Technologies and the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via 2021-2011000004. Stefano Ermon acknowledges support from NSF (#1651565), AFOSR (FA95501910024), ARO (W911NF-21-1-0125), Sloan Fellowship, and CZ Biohub. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Author information

Authors and Affiliations

Spatially Explicit Artificial Intelligence Lab, Department of Geography, University of Georgia, Athens, 30602, Georgia, USA
Gengchen Mai
Department of Computer Science, Stanford University, Stanford, 94305, California, USA
Gengchen Mai & Stefano Ermon
STKO Lab, University of California Santa Barbara, Santa Barbara, 93106, California, USA
Gengchen Mai, Rui Zhu, Ling Cai & Krzysztof Janowicz
Center for Spatial Studies, University of California Santa Barbara, Santa Barbara, 93106, California, USA
Gengchen Mai, Rui Zhu, Ling Cai & Krzysztof Janowicz
Department of Mechanical Engineering, University of California Berkeley, Berkeley, 94720, California, USA
Chiyu Jiang
Department of Computer Science, University of British Columbia, Vancouver, V6T 1Z4, British Columbia, Canada
Weiwei Sun
School of Geographical Sciences, University of Bristol, Bristol, BS8 1TH, UK
Rui Zhu
Department of Mathematics, University of California Santa Barbara, Santa Barbara, 93106, California, USA
Yao Xuan
Department of Geography and Regional Research, University of Vienna, Vienna, 1040, Austria
Krzysztof Janowicz
Chan Zuckerberg Biohub, San Francisco, 94158, California, USA
Stefano Ermon
Google, Mountain View, 94043, California, USA
Ni Lao

Corresponding author

Correspondence to Gengchen Mai.

Ethics declarations

Conflicts of interest

This paper has been approved by all co-authors. The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Ni Lao - Work done while working at mosaix.ai.

Appendix A

A.1 The Type Statistics of Polygonal Geometries in DBSR-cplx46K

See Table 10.

Table 10 The place type statistics of geographic entities in the DBSR-46K and DBSR-cplx46K datasets

A.2 Model Hyperparameter Tuning

We use grid search for hyperparameter tuning. For all polygon encoders on both tasks, we tune the learning rate lr over \(\{0.02, 0.01, 0.005, 0.002, 0.001\}\) and the polygon embedding dimension over \(d \in \{256, 512, 1024\}\). For all DDSL and NUFTspec-based models, we tune the frequency number over \(N_{wx} \in \{16, 20, 24, 28, 32, 36, 40, 44\}\) for the shape classification task and over \(N_{wx} \in \{16, 32, 64\}\) for the spatial relation prediction task. For NUFTspec[gmf]-based models, we tune \(w_{min} \in \{0.2, 0.4, 0.5, 0.8, 1.0\}\) and we tune \(w_{max}\) around \(N_{wx}/2\). For all PCA models, we vary \(K_{PCA}\) such that the top \(K_{PCA}\) PCA components account for different fractions of the data variance, \(\sum \nolimits _{PCA} \in \{80\%, 85\%, 90\%, 95\%\}\). For ResNet1D, we tune the KDelta point encoder’s neighbor size \(2t \in \{0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20\}\) and the \(\text {ResNet1D}_{cp}\) count \(\mathcal {K} \in \{ 1, 2, 3\}\). For DDSL+LeNet5, we tune the hidden dimension of LeNet5 over \(\{128, 256, 512, 1024\}\). For NUFTspec-based models, DDSL+MLP, and DDSL+PCA+MLP, we tune the number of hidden layers h and the hidden dimension o in \(MLP_{F}(\cdot )\) over \(h \in \{1, 2, 3\}\) and \(o \in \{512, 1024\}\). We also try different NUFT spectral feature normalization methods \(\Psi (\cdot )\): no normalization, L2 normalization, and batch normalization. We find that no normalization usually leads to the best performance on all three datasets.
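As an illustration of the grid search described above, the sketch below enumerates one slice of the search space: the values listed for a NUFTspec-style model on the shape classification task. It is not the authors' tuning script. The dictionary keys (`lr`, `embed_dim`, `n_wx`, `mlp_hidden_layers`, `mlp_hidden_dim`) and the functions `iter_grid` and `grid_search` are hypothetical names, and `score_fn` is a placeholder for "train the model with this configuration and return its validation score".

```python
from itertools import product
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

# Value lists copied from the text; key names are assumptions for illustration.
SEARCH_SPACE: Dict[str, List[Any]] = {
    "lr": [0.02, 0.01, 0.005, 0.002, 0.001],
    "embed_dim": [256, 512, 1024],                 # d
    "n_wx": [16, 20, 24, 28, 32, 36, 40, 44],      # N_wx, shape classification
    "mlp_hidden_layers": [1, 2, 3],                # h
    "mlp_hidden_dim": [512, 1024],                 # o
}


def iter_grid(space: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield every combination of values as a {name: value} dict."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def grid_search(
    space: Dict[str, List[Any]],
    score_fn: Callable[[Dict[str, Any]], float],
) -> Tuple[Optional[Dict[str, Any]], float]:
    """Return the configuration with the highest score (first one wins ties)."""
    best_cfg: Optional[Dict[str, Any]] = None
    best_score = float("-inf")
    for cfg in iter_grid(space):
        score = score_fn(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


if __name__ == "__main__":
    # Stand-in objective; a real run would train and validate a model here.
    def dummy_score(cfg: Dict[str, Any]) -> float:
        return -abs(cfg["lr"] - 0.005) + cfg["embed_dim"] / 1e6

    n_configs = sum(1 for _ in iter_grid(SEARCH_SPACE))
    print(n_configs, grid_search(SEARCH_SPACE, dummy_score)[0])
```

The number of configurations is the product of the list lengths, so the cost of an exhaustive search grows quickly as hyperparameters are added; this slice alone has 5 × 3 × 8 × 3 × 2 = 720 combinations.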

The best hyperparameter combinations for all models on MNIST-cplx70k are shown in Table 11. As for DBSR-46K and DBSR-cplx46K, each model’s best hyperparameter combinations are shown in Table 12.

Table 11 The best hyperparameter combinations for each model on MNIST-cplx70k dataset

Table 12 The best hyperparameter combinations for each model on DBSR-46K and DBSR-cplx46K dataset

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mai, G., Jiang, C., Sun, W. et al. Towards general-purpose representation learning of polygonal geometries. Geoinformatica 27, 289–340 (2023). https://doi.org/10.1007/s10707-022-00481-2

Received: 23 December 2021
Revised: 19 September 2022
Accepted: 29 September 2022
Published: 22 October 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s10707-022-00481-2
