Abstract
The concept of ‘spatial scale’, or simply ‘scale’, is implicit in any discussion of global versus local models. The raison d’être of local models is that a global scale (where ‘global’ refers to all locations within a predefined area of interest) might be the incorrect scale at which to analyze spatial processes; the alternative is a local scale (where ‘local’ refers to individual locations). Here we explore two well-known scale issues in the context of local modeling: the modifiable areal unit problem (MAUP) and Simpson’s paradox. In doing so, we highlight that scale effects play two very different roles in any consideration of local versus global modeling. First, we examine the sensitivity of global and local models to the MAUP and show that the effects of the MAUP in global models are a function of the degree to which processes vary over space. This generates a new insight into the MAUP: it results from the properties of processes rather than the properties of data. Then we highlight the extreme differences that can result when calibrating global and local models and show how Simpson’s paradox can arise in this context. In the examination of the MAUP, scale is treated as a measure of the degree to which data are aggregated prior to any form of modeling; in the study of Simpson’s paradox, scale refers to the geographical entity for which a model is calibrated.
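The two roles of scale described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' experiment: the grid size, slope surfaces, noise levels, and all function names are assumptions chosen for illustration. The first part aggregates gridded data into square zones of several sizes and compares how much a global slope estimate moves across aggregation levels when the underlying slope is constant versus when it drifts across space. The second part builds two regions whose local slopes are both negative while the pooled (global) slope is positive, which is the pattern known as Simpson's paradox.

```python
import numpy as np

rng = np.random.default_rng(0)

def ols_slope(x, y):
    """Slope of a simple least-squares line y = a + b*x."""
    return np.polyfit(x, y, 1)[0]

def block_means(grid, size):
    """Aggregate an n-by-n grid into square zones of side `size` by averaging."""
    k = grid.shape[0] // size
    return grid.reshape(k, size, k, size).mean(axis=(1, 3))

def slope_spread(beta, x, noise, sizes=(1, 2, 4, 5, 8, 10)):
    """Range of the global slope estimate across several aggregation levels."""
    y = beta * x + noise
    slopes = [
        ols_slope(block_means(x, s).ravel(), block_means(y, s).ravel())
        for s in sizes
    ]
    return max(slopes) - min(slopes)

# --- Scale as degree of aggregation (MAUP) ---------------------------------
n = 40  # hypothetical 40-by-40 grid of locations
x = rng.normal(size=(n, n))
noise = rng.normal(scale=0.1, size=(n, n))
stationary_beta = np.full((n, n), 2.0)                     # same slope everywhere
varying_beta = np.tile(np.linspace(0.0, 4.0, n), (n, 1))   # slope drifts west to east

spread_stationary = slope_spread(stationary_beta, x, noise)
spread_varying = slope_spread(varying_beta, x, noise)
print("slope range across zonings, stationary process:", spread_stationary)
print("slope range across zonings, varying process:   ", spread_varying)

# --- Scale as the entity a model is calibrated for (Simpson's paradox) ------
m = 200
x_a = rng.uniform(0.0, 2.0, m)   # region A: low x, low y
x_b = rng.uniform(4.0, 6.0, m)   # region B: high x, high y
y_a = 2.0 - x_a + rng.normal(scale=0.2, size=m)
y_b = 12.0 - x_b + rng.normal(scale=0.2, size=m)

local_a = ols_slope(x_a, y_a)
local_b = ols_slope(x_b, y_b)
pooled = ols_slope(np.concatenate([x_a, x_b]), np.concatenate([y_a, y_b]))
print("local slopes:", local_a, local_b, "pooled slope:", pooled)
```

Under these assumed settings the slope estimate is far more sensitive to the choice of zoning when the process varies over space, and the pooled slope has the opposite sign to both regional slopes. Only regular square zones of different sizes are tried here; a fuller treatment would also vary zone shape and placement.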
Notes
In this paper, the term ‘process’ describes a sequence of events whereby a change in one variable leads to a measurable change in another. These events are unknown, so in essence the word ‘process’ is shorthand for a conditioned relationship between two variables.
While multiscale geographically weighted regression (MGWR) is a truly local model in that it uses subsets of the data to undertake a series of local calibrations, Bayesian spatially varying coefficient (SVC) models and eigenvector spatial filtering (ESF) are ‘whole-data’ techniques and as such are not true local models. However, all these models rest on the same premise, namely that the relationships producing the observed pattern of the dependent variable might not be stationary over space, and all produce locally varying parameter estimates to reflect this premise. Consequently, we refer to all such models as ‘local’.
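The idea of a series of local calibrations can be sketched as a kernel-weighted least-squares fit repeated at every location. The example below is a minimal single-bandwidth illustration in the spirit of geographically weighted regression; it is not MGWR (which allows a separate bandwidth for each covariate) and it does not use the `mgwr` package. The Gaussian kernel, the fixed bandwidth, the simulated slope surface, and the function name `local_slopes` are all assumptions made for illustration.

```python
import numpy as np

def local_slopes(coords, x, y, bandwidth):
    """Calibrate y = a + b*x once per location with distance-decay weights.

    Returns an (n, 2) array of local [intercept, slope] estimates.
    """
    X = np.column_stack([np.ones_like(x), x])
    estimates = np.empty((len(x), 2))
    for i, c in enumerate(coords):
        d = np.linalg.norm(coords - c, axis=1)       # distance to location i
        w = np.exp(-0.5 * (d / bandwidth) ** 2)      # Gaussian kernel weights
        XtW = X.T * w                                # weight each observation
        estimates[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return estimates

# Simulated data with a slope that increases eastwards (assumed process).
rng = np.random.default_rng(1)
n = 400
coords = rng.uniform(0.0, 10.0, size=(n, 2))
x = rng.normal(size=n)
true_slope = 1.0 + 0.3 * coords[:, 0]
y = 2.0 + true_slope * x + rng.normal(scale=0.2, size=n)

est = local_slopes(coords, x, y, bandwidth=1.5)
global_slope = np.polyfit(x, y, 1)[0]

print("global slope (one value for the whole study area):", global_slope)
print("local slope range:", est[:, 1].min(), "to", est[:, 1].max())
```

The global calibration returns a single slope for the whole study area, whereas the local calibrations return one estimate per location that tracks the simulated east-west drift. In practice the bandwidth would be chosen by an optimization criterion rather than fixed by hand as it is here.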
Funding
Funding was provided by National Science Foundation (Grant No. 1758786).
Cite this article
Fotheringham, A.S., Sachdeva, M. Scale and local modeling: new perspectives on the modifiable areal unit problem and Simpson’s paradox. J Geogr Syst 24, 475–499 (2022). https://doi.org/10.1007/s10109-021-00371-5
DOI: https://doi.org/10.1007/s10109-021-00371-5