Abstract
Geospatial artificial intelligence (GeoAI) has emerged as a subfield of GIScience that uses artificial intelligence approaches and machine learning techniques for geographic knowledge discovery. The irregularity of data structures has recently led to different variants of graph neural networks in the field of computer science. Graph convolutional neural networks are among the most prominent of these: they operate on non-Euclidean structured data in which the number of connections per node varies and the nodes are unordered. These networks use graph convolution (commonly known as filters or kernels) in place of general matrix multiplication in at least one of their layers. This paper proposes spatial regression graph convolutional neural networks (SRGCNNs) as a deep learning paradigm capable of handling a wide range of geographical tasks in which multivariate spatial data need modeling and prediction. The feasibility of SRGCNNs lies in their feature propagation mechanisms, their spatially local nature, and a semi-supervised training strategy. In the experiments, this paper demonstrates the operation of SRGCNNs with social media check-in data in Beijing and house price data in San Diego. The results indicate that a well-trained SRGCNN model is capable of learning from samples and making reasonable predictions for unobserved locations. The paper also shows the effectiveness of incorporating the idea of geographically weighted regression into the model to handle heterogeneity between locations. Compared to conventional spatial regression approaches, SRGCNN-based models tend to generate much more accurate and stable results, especially when the sampling ratio is low. This study helps bridge the methodological gap between graph deep learning and spatial regression analytics. The proposed idea serves as an example of how spatial analytics can be combined with state-of-the-art deep learning models, and aims to inspire future research at the frontier of GeoAI.
Change history
03 March 2022
A Correction to this paper has been published: https://doi.org/10.1007/s10707-022-00461-6
References
Paelinck JHP, Klaassen LH, Ancot J-P, Verster ACP (1979) Spatial econometrics, vol 1. Saxon House
Anselin L (1988) Spatial econometrics: Methods and models, vol 4. Springer Science & Business Media, Berlin
LeSage JP (1997) Regression analysis of spatial data. J Region Anal Policy 27:83–94
LeSage JP, Fischer MM (2008) Spatial growth regressions: model specification, estimation and interpretation. Spatial Economic Analysis 3:275–304
Lehmann A, Overton JMcC, Leathwick JR (2002) Grasp: generalized regression analysis and spatial prediction. Ecol Modell 157:189–207
Anselin L (2010) Thirty years of spatial econometrics. Papers Region Sci 89(1):3–25
Fischer MM, Wang J (2011) Spatial data analysis: models, methods and techniques. Springer Science & Business Media
Liu Y, Liu X, Gao S, Gong L, Kang C, Zhi Y, Chi G, Shi L (2015) Social sensing: A new approach to understanding our socioeconomic environments. Ann Assoc Amer Geograph 105:512–530
Vatsavai R, Chandola V (2016) Guest editorial: big spatial data. GeoInformatica 20:797–799
Cheng T, Adepeju M (2014) Modifiable temporal unit problem (mtup) and its effect on space-time cluster detection. PloS one 9:e100465
Haworth J, Cheng T (2012) Non-parametric regression for space–time forecasting under missing data. Comput Environ Urban Syst 36:538–550
Kelejian HH, Prucha IR (2007) The relative efficiencies of various predictors in spatial econometric models containing spatial lags. Region Sci Urban Econ 37:363–374
Fischer MM (1998) Computational neural networks: a new paradigm for spatial analysis. Environ Plann A 30:1873–1891
Gu Y, Wylie BK, Boyte SP, Picotte J, Howard DM, Smith K, Nelson KJ (2016) An optimal sample data usage strategy to minimize overfitting and underfitting effects in regression tree models based on remotely-sensed data. Remote Sens 8:943
Mennis J, Guo D (2009) Spatial data mining and geographic knowledge discovery–an introduction. Comput Environ Urban Syst 33:403–408
Reichstein M, Camps-Valls G, Stevens B, Jung M, Denzler J, Carvalhais N, et al. (2019) Deep learning and process understanding for data-driven earth system science. Nature 566:195–204
Janowicz K, Gao S, McKenzie G, Hu Y, Bhaduri B (2020) GeoAI: spatially explicit artificial intelligence techniques for geographic knowledge discovery and beyond. Int J Geograph Inf Sci 34:625–636
Li W, Hsu C-Y (2020) Automated terrain feature identification from remote sensing imagery: a deep learning approach. Int J Geograph Inf Sci 34(4):637–660
Yan X, Ai T, Yang M, Yin H (2019) A graph convolutional neural network for classification of building patterns using spatial vector data. ISPRS J Photogramm Remote Sens 150:259–273
Zhang F, Wu L, Zhu D, Liu Y (2019) Social sensing from street-level imagery: A case study in learning spatio-temporal urban mobility patterns. ISPRS J Photogramm Remote Sens 153:48–58
Liu P, De Sabbata S (2021) A graph-based semi-supervised approach to classification learning in digital geographies. Comput Environ Urban Syst 86:101583
Zhu D, Cheng X, Zhang F, Yao X, Gao Y, Liu Y (2020) Spatial interpolation using conditional generative adversarial neural networks. Int J Geograph Inf Sci 34:735–758
Xing X, Huang Z, Cheng X, Zhu D, Kang C, Zhang F, Liu Y (2020) Mapping human activity volumes through remote sensing imagery. IEEE J Sel Top Appl Earth Observ Remote Sens 13:5652–5668
Du Z, Wang Z, Wu S, Zhang F, Liu R (2020) Geographically neural network weighted regression for the accurate estimation of spatial non-stationarity. Int J Geograph Inf Sci 34:1353–1377
Zhu D, Zhang F, Wang S, Wang Y, Cheng X, Huang Z, Liu Y (2020) Understanding place characteristics in geographic contexts through graph convolutional neural networks. Ann Amer Assoc Geograph 110:408–420
Xiao L, Lo S, Zhou J, Liu J, Yang L (2020) Predicting vibrancy of metro station areas considering spatial relationships through graph convolutional neural networks: The case of shenzhen, china. Environ Plann B: Urban Anal City Sci:2399808320977866
Schmidhuber J (2015) Deep learning in neural networks: An overview. Neural networks 61:85–117
Bronstein MM, Bruna J, LeCun Y, Szlam A, Vandergheynst P (2017) Geometric deep learning: going beyond euclidean data. IEEE Signal Process Mag 34:18–42
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444
Defferrard M, Bresson X, Vandergheynst P (2016) Convolutional neural networks on graphs with fast localized spectral filtering. In: Advances in Neural Information Processing Systems, pp 3844–3852
Niepert M, Ahmed M, Kutzkov K (2016) Learning convolutional neural networks for graphs. In: International conference on machine learning, pp 2014–2023
Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs. In: Advances in neural information processing systems, pp 1024–1034
Chung FRK (1997) Spectral graph theory. American Mathematical Society
Hammond DK, Vandergheynst P, Gribonval R (2009) Wavelets on graphs via spectral graph theory. Appl Comput Harmon Anal 30:129–150
Bruna J, Zaremba W, Szlam A, Lecun Y (2014) Spectral networks and locally connected networks on graphs. In: International Conference on Learning Representations
Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations
Chen R, Wang X, Zhang W, Zhu X, Li A, Yang C (2019) A hybrid CNN-LSTM model for typhoon formation forecasting. GeoInformatica 23:375–396
Zhao L, Song Y, Zhang C, Liu Y, Wang P, Lin T, Deng M, Li H (2019) T-GCN: A temporal graph convolutional network for traffic prediction. IEEE Transactions on Intelligent Transportation Systems 21(9):3848–3858
Zhang Y, Cheng T, Ren Y, Xie K (2020) A novel residual graph convolution deep learning model for short-term network-based traffic forecasting. Int J Geograph Inf Sci 34:969–995
Bai J, Zhu J, Song Y, Zhao L, Hou Z, Du R, Li H (2021) A3t-gcn: Attention temporal graph convolutional network for traffic forecasting. ISPRS Int J Geo-Inf 10:485
Bui K-HN, Cho J, Yi H (2021) Spatial-temporal graph neural network for traffic forecasting: An overview and open research issues. Appl Intell:1–12
Hu S, Gao S, Wu L, Xu Y, Zhang Z, Cui H, Gong X (2021) Urban function classification at road segment level using taxi trajectory data: A graph convolutional neural network approach. Comput Environ Urban Syst 87:101619
Griffith DA, Paelinck JHP (2011) Non-standard spatial statistics and spatial econometrics. Springer Science & Business Media
Arbia G (2014) A primer for spatial econometrics with applications in r. Springer
Anselin L, Rey SJ (2014) Modern spatial econometrics in practice: A guide to geoda, geodaspace and pysal. GeoDa Press LLC
Kelejian H, Piras G (2017) Spatial econometrics. Academic Press
Yamagata Y, Seya H (2019) Spatial analysis using big data: Methods and urban applications. Academic Press
LeSage JP, Pace RK (2009) Introduction to spatial econometrics. CRC Press/Taylor & Francis, Boca Raton
Fotheringham AS, Yang W, Kang W (2017) Multiscale Geographically Weighted Regression (MGWR). Ann Amer Assoc Geograph 107:1247–1265
Ord K (1975) Estimation methods for models of spatial interaction. J Amer Stat Assoc 70:120–126
Kelejian HH, Prucha IR (1998) A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. J Real Estate Finance Econ 17(1):99–121
Kelejian HH, Prucha IR (1999) A generalized moments estimator for the autoregressive parameter in a spatial model. Int Econ Rev 40(2):509–533
Goodfellow I, Bengio Y, Courville A (2016) Deep learning, vol 1. MIT Press, Cambridge
Paola JD, Schowengerdt RA (1995) A detailed comparison of backpropagation neural network and maximum-likelihood classifiers for urban land use classification. IEEE Trans Geosci Remote Sens 33(4):981–996
Vatsavai RR, Bhaduri B (2011) A hybrid classification scheme for mining multisource geospatial data. GeoInformatica 15:29–47
Zhu D, Wang N, Wu L, Liu Y (2017) Street as a big geo-data assembly and analysis unit in urban studies: A case study using beijing taxi data. Appl Geograph 86:152–164
Long Y, Liu X (2013) How mixed is beijing, china? a visual exploration of mixed land use. Environ Plann A 45:2797–2798
Chen L, Gao Y, Zhu D, Yuan Y, Liu Y (2019) Quantifying the scale effect in geospatial big data using semi-variograms. PloS one 14:e0225139
Anselin L (2009) Spatial regression. In: Fotheringham AS, Rogerson PA (eds) The SAGE Handbook of Spatial Analysis. SAGE Publications, Los Angeles, pp 255–275
Veličković P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2017) Graph attention networks. arXiv:1710.10903
Wu S, Wang Z, Du Z, Huang B, Zhang F, Liu R (2020) Geographically and temporally neural network weighted regression for modeling spatiotemporal non-stationary relationships. Int J Geograph Inf Sci:1–27
Kwan M-P (2012) The uncertain geographic context problem. Ann Assoc Amer Geograph 102:958–968
Acknowledgements
The authors gratefully acknowledge editors, the anonymous reviewers, Dr. Tao Cheng, Dr. Yang Zhang, Dr. Ximeng Cheng, and Dr. Fan Zhang for their helpful comments. This work was partially supported by the New Faculty Set-up Funding of College of Liberal Arts, University of Minnesota (1000-10964-20042-5672018). Prof. Yu Liu is supported by the National Key Research and Development Program of China (2017YFB0503602) and the National Natural Science Foundation of China (41625003).
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this article was revised due to incorrect wording in the Introduction.
Appendices
Appendix
In this section, we discuss SRGCNNs in a more typical spatial regression scenario: house price modeling. Using a rental price dataset for San Diego, CA, USA, we evaluate the regression accuracy of each model and investigate model performance given different sets of explanatory variables.
A.1 Data description and feature selection
The open data behind the Inside Airbnb site was collected for this appendix experiment. The data are sourced from publicly available information on the Airbnb site and include the daily rental price of listed properties along with many additional listing attributes. To compare SRGCNNs with traditional spatial regression models, we examine a subset of the attributes of all Airbnb listings in San Diego, CA, USA, on July 7, 2016.
Fig. 11a maps the natural logarithm of the price, ln(price). The map uses a percentile color scheme to highlight both the extremely high prices (top 1%, in yellow) and the extremely low prices (bottom 1%, in black). There are 6,110 Airbnb listings in total, and the prices are clearly spatially autocorrelated, with both high-value and low-value clusters within the study area. Fig. 11b presents the house characteristics of interest in this experiment, which include continuous variables (e.g., the number of people accommodated, “accommodates”, and the number of beds, “beds”) and categorical variables (e.g., rent type, “rt_XXX”, and property group, “pg_XXX”). We also include a binary variable named “coastal” to indicate whether a house is near the ocean.
In the following regression analysis, we use ln(price) as the dependent variable y and examine two sets of independent variables, X (four variables) and X+ (eleven variables). The basic set X = {“accommodates”, “bathrooms”, “bedrooms”, “beds”} contains only the four continuous intrinsic characteristics: the number of people accommodated and the numbers of bathrooms, bedrooms, and beds. The extended set X+ = {“accommodates”, “bathrooms”, “bedrooms”, “beds”, “rt_Private_room”, “rt_Shared_room”, “pg_House”, “pg_Condominium”, “pg_Townhouse”, “pg_Other”, “coastal”} additionally contains the rent type, the property group, and the coastal indicator. The rent types are encoded as dummy variables denoting whether a listing is a private room, a shared room, or an entire home. The property groups are likewise encoded as dummy variables indicating whether the listing is an apartment, a condominium, a townhouse, a single-family house, or other.
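The two variable sets can be assembled as in the following minimal Python sketch. The column names come from the text above; the input keys `room_type` and `property_group`, the category labels, and the helper `encode_listing` are hypothetical. Treating the entire home and the apartment as the omitted reference categories is an inference from the absence of corresponding dummies in X+, not something the appendix states.

```python
# Column names follow the appendix. The category labels and the reference
# levels (entire home, apartment) are assumptions made for illustration.
BASIC = ["accommodates", "bathrooms", "bedrooms", "beds"]
RENT_DUMMIES = {"Private room": "rt_Private_room", "Shared room": "rt_Shared_room"}
PROPERTY_DUMMIES = {
    "House": "pg_House",
    "Condominium": "pg_Condominium",
    "Townhouse": "pg_Townhouse",
    "Other": "pg_Other",
}
EXTENDED_NAMES = (
    BASIC + list(RENT_DUMMIES.values()) + list(PROPERTY_DUMMIES.values()) + ["coastal"]
)


def encode_listing(listing):
    """Return (x_basic, x_extended) feature lists for one listing dict."""
    x_basic = [float(listing[name]) for name in BASIC]
    # One 0/1 indicator per non-reference category.
    rent = [1.0 if listing["room_type"] == label else 0.0 for label in RENT_DUMMIES]
    group = [
        1.0 if listing["property_group"] == label else 0.0
        for label in PROPERTY_DUMMIES
    ]
    coastal = [1.0 if listing["coastal"] else 0.0]
    return x_basic, x_basic + rent + group + coastal


# Made-up listing used only to show the encoding.
example = {
    "accommodates": 4, "bathrooms": 1.5, "bedrooms": 2, "beds": 2,
    "room_type": "Private room", "property_group": "Condominium", "coastal": True,
}
x, x_plus = encode_listing(example)
```

Under these assumptions, a listing in the reference categories (an entire-home apartment away from the coast) has all seven additional entries equal to zero.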
Note that other environmental context, such as the distance to highways and the number of parks in the neighborhood, could be included as independent variables to further improve the regression accuracy. However, the selection of informative feature variables is beyond the scope of this paper. Here, we provide two different sets of independent variables only to shed light on the influence of feature engineering in SRGCNN-based models. Future applications are invited to test SRGCNNs with different feature combinations in specialized tasks.
A.2 Model training
The regressions on prices at all locations (100% training ratio) are performed using a linear regression model (LR), a spatial autoregressive model (SAR), and the SRGCNN-GW model (19). For simplicity, models using the extended variable set X+ are referred to as LR+, SAR+, and SRGCNN-GW+, respectively. We choose the SRGCNN-GW model here rather than the basic SRGCNN model because SRGCNN-GW is better at fitting the training dataset, while the basic SRGCNN model is better for prediction (as discussed in Section 5.2).
We consider the k = 20 nearest neighbors of each location to construct the spatial weights matrix in SAR and the graph structure in SRGCNN-GW. The spatial structure could be defined differently, e.g., with a different k or with other criteria such as a distance threshold or queen adjacency. We do not explore these alternatives because the influence of geographic context on spatial regression is a separate topic [25] that is beyond the scope of this paper.
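A k-nearest-neighbor spatial structure of this kind can be sketched as follows. This is an illustrative NumPy version, not the authors' implementation: the function name `knn_weights`, the use of Euclidean distance on planar coordinates, and the row normalization are all assumptions.

```python
import numpy as np


def knn_weights(coords, k=20):
    """Row-normalized k-nearest-neighbor spatial weights matrix.

    coords: (n, 2) array of x/y coordinates (assumed planar).
    Returns an (n, n) array W with W[i, j] = 1/k if j is one of the k
    nearest neighbors of i and 0 otherwise; no location neighbors itself.
    """
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    # Dense pairwise Euclidean distances (fine for small n only).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude self-neighbors
    neighbors = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.arange(n)[:, None], neighbors] = 1.0 / k
    return W


# Synthetic coordinates used only to exercise the function.
rng = np.random.default_rng(0)
coords = rng.uniform(size=(100, 2))
W = knn_weights(coords, k=20)
```

Note that a k-nearest-neighbor relation is generally not symmetric, so W need not equal its transpose. The dense distance matrix is used here for brevity; for several thousand listings a spatial index such as a KD-tree would likely be more memory-efficient.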
We adopt training settings similar to those introduced in Section 4.2.3, except that the learning rate is changed to η = 3 × 10⁻². Training epochs are capped at 15,000 for SRGCNN-GW and 18,000 for SRGCNN-GW+, and we record the best results among all epochs. The hidden feature units are set to 4 × 8 = 32 for SRGCNN-GW and 11 × 8 = 88 for SRGCNN-GW+, reflecting the different numbers of input features in X and X+. The MSE loss and MAPE during training are plotted in Fig. 12. As can be seen, SRGCNN-GW reaches its lowest MAPE at epoch 12,161, while SRGCNN-GW+ reaches its lowest MAPE at epoch 15,508. After these points, both models exhibit overfitting as the MAPE starts to rise again. Overall, SRGCNN-GW+ converges slightly more slowly (requiring more epochs) than SRGCNN-GW because there are more feature parameters to learn in the geographically weighted graph convolutions.
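The bookkeeping described above (hidden units as eight times the number of input features, and keeping the best epoch from the whole run) can be illustrated with a short Python sketch. The helper names `hidden_units` and `best_epoch` are hypothetical, and the MAPE history is made up; the actual training loop is not reproduced here.

```python
def hidden_units(n_features, multiplier=8):
    """Hidden feature units as a fixed multiple of the input feature count."""
    return n_features * multiplier


def best_epoch(mape_per_epoch):
    """Return (epoch, mape) with the lowest MAPE; epochs are counted from 1.

    On ties, the earliest epoch is returned.
    """
    best_idx = min(range(len(mape_per_epoch)), key=mape_per_epoch.__getitem__)
    return best_idx + 1, mape_per_epoch[best_idx]


# Made-up MAPE values (%) that fall and then rise again, as in overfitting.
history = [9.1, 6.4, 5.2, 4.8, 4.9, 5.3]
epoch, lowest = best_epoch(history)
```

Selecting the minimum over the full history, rather than stopping at the first increase, matches the statement that the best result among all epochs is recorded.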
A.3 Evaluation of the results
The results across all models are summarized in Table 4. We report R² and MAPE as two metrics of the goodness of fit. The Z-scored Moran’s I values of the prediction errors and of the fitted ln(price) are also included to indicate how well the models capture the spatial effects in the data.
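For reference, the three metrics can be computed as in the following NumPy sketch. The formulas for R², MAPE, and Moran's I are the standard ones; obtaining the Z-score from random permutations is an assumption, since the text does not say how the Z-scores in Table 4 were derived. All function names are hypothetical.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    ss_res = ((y - y_hat) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    return float(1.0 - ss_res / ss_tot)


def mape(y, y_hat):
    """Mean absolute percentage error, in percent (y must be non-zero)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean(np.abs((y - y_hat) / y)) * 100.0)


def morans_i(values, W):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), z = mean-centered values."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    W = np.asarray(W, dtype=float)
    return float(len(z) / W.sum() * (z @ W @ z) / (z @ z))


def morans_i_zscore(values, W, n_perm=999, seed=0):
    """Z-score of Moran's I against random permutations (an assumed procedure)."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    observed = morans_i(values, W)
    sims = np.array([morans_i(rng.permutation(values), W) for _ in range(n_perm)])
    return float((observed - sims.mean()) / sims.std())
```

Applied to the residuals y − ŷ, a Z-score near zero would suggest little remaining spatial structure in the errors; applied to the fitted values, a large positive Z-score would suggest strongly clustered predictions.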
It is encouraging to find that SRGCNN-GW significantly outperforms LR and SAR regarding both R² and MAPE. Using the basic independent variable set X, LR, the non-spatial linear regression model, can explain only about 56% of the real process; SAR, the most commonly used spatial lag model, increases the goodness of fit to about 62%; the SRGCNN-GW model, however, reaches a much higher goodness of fit of around 83%. With the additional explanatory information in X+, all models exhibit better results: LR increases from 55% to 68%, SAR increases from 62% to 71%, and SRGCNN-GW increases from 83% to 87%. Since SRGCNN-based models consider more complex spatial relationships during modeling, the influence of additional independent variables is smaller than for traditional models such as linear regression and spatial lag models. With respect to MAPE, the conclusions are the same: SRGCNN-GW+ reaches the lowest fitting error, at only 3.76%. Judging from the Z-scored autocorrelations, SAR and SAR+ are better at handling the spatial errors, while the SRGCNN-GW models have lower error autocorrelations than the LR models. The SRGCNN-GW models report higher global autocorrelation of the fitted price than SAR, indicating explicit modeling of spatial structure in their graph convolution layers.
Results are also compared in Fig. 13. The spatial distributions of ln(price) are plotted in the first row for both the original data and the model predictions. The scatter plots and the Pearson correlation coefficients ρ are presented in the second row to further evaluate the models. As shown, SRGCNN-GW+ fits the price data very well, with a modeled spatial pattern closely resembling the original one and the highest Pearson correlation, ρ = 0.9334. The number of input features does influence the modeling accuracy, but it is still unclear how to select informative variables for SRGCNN models. Future work should develop specialized methods for visualizing and analyzing the complex feature parameters in SRGCNN models with regard to regression statistics.
Cite this article
Zhu, D., Liu, Y., Yao, X. et al. Spatial regression graph convolutional neural networks: A deep learning paradigm for spatial multivariate distributions. Geoinformatica 26, 645–676 (2022). https://doi.org/10.1007/s10707-021-00454-x
DOI: https://doi.org/10.1007/s10707-021-00454-x