Introduction

Measuring inequalities in disease prevalence, incidence, and mortality is an essential prerequisite for effective health targeting and intervention. Spatial variations in disease are an important element in such inequalities (Dummer, 2008; Public Health England, 2018), with the work of Faris and Dunham (1939) highlighting geographic variation in mental ill-health. As well as wide neighbourhood variations in serious mental illness, there are also considerable geographic variations in relativities between population sub-groups.

Prevalence inequalities, in particular, are of considerable importance in health service planning, as Saha et al. (2005) discuss in relation to schizophrenia. Ethnic inequalities in serious mental illness are one of several health outcomes with considerable variations between ethnic groups, others being cardiovascular disease and diabetes (Raleigh & Holmes, 2021). Because of the ethnic patterning in residential location typical in developed nations, such ethnic inequalities may contribute substantially to overall neighbourhood inequalities in the disease concerned. Hence efforts to reduce ethnic inequalities in health will often have a locational focus, and vice versa (Public Health England, 2018). It follows that establishing the ethnic basis of neighbourhood health variations is important for ensuring the effectiveness of locality-based health interventions.

Important aspects to consider in any neighbourhood modelling of ethnic health inequalities include the existence of contextual effects, whereby neighbourhood features influence disease variations between sub-groups. An example is the ethnic density effect on psychosis among non-white groups (Baker et al., 2021). Another central aspect is the spatial concentration (spatial clustering) of serious mental illness (Brown, 2013), and how it can be best represented in data modelling.

Relevance of Health Surveys

In assessing inequalities in disease prevalence, including serious mental illness, health surveys are a commonly used form of evidence. Prevalence surveys, such as the National Comorbidity Survey (NCS) in the US, and the Adult Psychiatric Morbidity Survey (APMS) in the UK, typically identify small numbers of subjects with serious mental illnesses, especially for population sub-groups. Survey estimates of psychosis inequalities are therefore typically imprecise. For the US, Narrow et al. (2002) opine that “current US mental disorder prevalence estimates have limited usefulness for service planning and are often discrepant”. Survey estimates of psychosis also tend to show under-ascertainment, with Perälä et al. (2007) citing a lifetime schizophrenia prevalence of 0.9% based on multiple information sources, compared to 0.4% in a systematic review by Saha et al. (2005).

Survey estimates also provide limited evidence regarding geographic variations in psychosis. For example, the APMS (combined data for 2007 and 2014 surveys) provides annual prevalence estimates of psychotic disorder for nine English regions, but without confidence intervals (NHS Digital, 2016); no further geographic disaggregation is available. On the broader role of population surveys in relation to social indicators, King (1997) opines that “national surveys are insufficient for learning much about spatial variation except for the grossest geographic patterns”.

Neighbourhood estimates of psychosis, and of varying population group relativities in psychosis, may be better based on other sources. For the UK, patient totals with diagnosed psychosis are available from primary care registers (note that comparable data are not available for the US). A scheme known as the Quality and Outcomes Framework (QOF) (NHS England, 2023) records 563 thousand patients with diagnosed psychosis in 2019/20. The QOF register data are not, however, disaggregated by patient group, and ethnicity is not an official component of primary care records. Hence other sources of information, such as ethnicity data from the Census, become relevant.

The Contribution of the Present Study

In this paper we explore how ecological inference methods (Schuessler, 1999) applied to small area psychosis cases, as diagnosed under the QOF scheme, can be combined with Census data on neighbourhood ethnic composition to develop neighbourhood estimates of psychosis prevalence by ethnic group. The study considers neighbourhood variations in psychosis between four ethnic groups, with a spatial framework provided by 32,844 small areas (Lower Super Output Areas, LSOAs) in England.

It is shown that ecological inference can provide a distinct perspective on contextual effects – whereby group relativities (contrasting health risks for individual attributes such as ethnicity) are affected by neighbourhood characteristics. Analysis of such cross-level disease interactions has heretofore been mostly restricted to multilevel studies.

We also discuss the spatial dimension of psychosis prevalence, and the high levels of spatial clustering evident in the England neighbourhood data. Our analysis compares spatial error and spatial lag methods to represent this strong spatial patterning. Appropriate treatment of this feature should be incorporated into an ecological inference framework for neighbourhood estimates.

Major Issues: Context Effects and Spatiality

Context Effects

Context effects occur when individual health status or behaviour is affected by the broader geographic setting, illustrating cross-level interaction. With regard to political science applications, Wakefield (2004) mentions that “[voting] registration of an individual will be associated with both their own race and the race of those around them (that is, we believe that registration will be a function of both demography and geography), particularly if we allow race in both cases to be surrogates for other variables such as income, education level, etc.”.

For mental health outcomes, interaction between individual-level ethnicity and neighbourhood characteristics is illustrated by the ethnic density effect: protective social support provided by higher ethnic concentrations provides a buffer against racism, and is associated with lower rates of serious mental illness (Shaw et al., 2012). In relation to psychosis specifically, the density effect has been summarised as the risk of psychosis in minority group individuals being inversely related to neighbourhood-level proportions of others belonging to the same group (Baker et al., 2021).

One may explicitly measure contextual effects in ecological inference models, as discussed below.

Spatial Clustering

A related consideration, in both methodological and substantive terms, is the likely spatial concentration of elevated psychosis rates (spatial clustering) (Chaix et al., 2006). As mentioned by Merlo et al. (2005), such clustering is also evidence of contextual effects whereby “people from the same neighbourhood are more similar to each other than to people from different neighbourhoods with respect to the health outcome variable”. They emphasize that “clustering of individual health within neighbourhoods is not a statistical nuisance that only needs to be considered for obtaining correct statistical estimations, but a key concept in social epidemiology that yields important information by itself”.

Methodological treatments of spatial clustering, generally termed spatial autoregression, may be classed in terms of their reliance on random effects or on fixed effects (Griffith, 2009; Ma et al., 2012). Bayesian disease mapping most commonly employs spatially configured random effects, specified in terms of conditional autoregressive (CAR) schemes (Besag et al., 1991; Byers & Besag, 2000).

Spatial lag models, by contrast, directly express spatial autocorrelation in the response variable as a fixed effect. In substantive terms, a lag effect might express the overlap in health similarities between the artificial boundaries of administrative areas (Chaix et al., 2006). For Poisson (count) responses, Morales-Otero and Núñez-Antón (2021) mention the spatial conditional Poisson model as a form of spatial lag.

The aim of such methods should be to eliminate, or considerably reduce, spatial correlation in the model residuals (comparing actual and fitted values). Measures such as Moran’s I can therefore be applied after model fitting to assess this (Chen, 2016; Crespo et al., 2022).

Methods

Data Sources

We use 2019/20 data on psychosis prevalence from the Quality and Outcomes Framework (Forbes et al., 2017). Small area totals based on GP practice registers are provided by the UK House of Commons “Local Health Conditions” study, with detailed area data at https://github.com/houseofcommonslibrary/local-health-data-from-QOF. According to Public Health England (PHE, 2016) “the [QOF] register is a cumulative count of all identified cases, so as the register builds it will come to show a primary care-based lifetime prevalence”. Typically, a diagnosis of psychosis in the QOF would be based on referrals to specialists or on psychosis hospitalisations, rather than assessment by GPs alone. However, patient care in the UK is commissioned by GPs in primary care, and ongoing care would be primary care based or commissioned.

Regarding ethnicity, we consider four broad groups, with data from the UK 2021 Census (see https://www.nomisweb.co.uk/): black ethnic groups (Caribbean, African and other black combined); South Asian (Asian or Asian British, specifically Bangladeshi, Indian or Pakistani origin); Mixed or Other Non-White; and all White Ethnicities. While some studies consider White British as a reference group, here we consider all persons of white ethnicity as one of the four groups.

The area framework is provided by 32,844 small areas, known as Lower Super Output Areas (LSOAs). These provide a complete coverage of England, the most populous of the four nations making up the United Kingdom. LSOAs contain on average 625 households, and populations of around 1500.

Ecological Inference Framework

Let pi = yi/ni be the proportion of a neighbourhood population total ni with a particular health condition, with prevalence or incidence total yi. In the study here, the yi are totals of diagnosed psychosis. Let k = 1,…,K denote categories of a demographic or socioeconomic attribute, with neighbourhood population totals {ni1,ni2,…,niK}. In the application here there are K = 4 ethnic groups.

Then let pik = yik/nik be the disease rates for the sub-groups, assuming for the moment that yik, and hence pik, are known. These rates represent interactions between the sub-group and neighbourhood. The ecological inference (EI) approach, widely used in political science applications (King et al., 2004), is based on the identity

$${{\text{p}}}_{{\text{i}}}={{\text{p}}}_{{\text{i}}1}({{\text{n}}}_{{\text{i}}1}/{{\text{n}}}_{{\text{i}}})+{{\text{p}}}_{{\text{i}}2}({{\text{n}}}_{{\text{i}}2}/{{\text{n}}}_{{\text{i}}})+\dots +{{\text{p}}}_{{\text{iK}}}({{\text{n}}}_{{\text{iK}}}/{{\text{n}}}_{{\text{i}}}).$$
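As a quick numerical check of this identity, the following sketch uses made-up counts for one hypothetical neighbourhood with K = 4 groups. The numbers are purely illustrative and are not drawn from the study data.

```python
# Hypothetical counts for one neighbourhood i with K = 4 groups (illustrative only).
n_ik = [120, 200, 80, 1100]   # group populations n_i1..n_i4
y_ik = [3, 3, 1, 11]          # group case totals y_i1..y_i4 (unobserved in practice)

n_i = sum(n_ik)               # neighbourhood population total
y_i = sum(y_ik)               # neighbourhood case total

p_ik = [y / n for y, n in zip(y_ik, n_ik)]   # group-specific rates
x_ik = [n / n_i for n in n_ik]               # population shares n_ik / n_i

# Overall rate computed directly, and as the share-weighted sum of group rates.
p_i_direct = y_i / n_i
p_i_identity = sum(p * x for p, x in zip(p_ik, x_ik))
```

Both routes give the same overall rate because the shares sum to one and each term p_ik × (n_ik / n_i) equals y_ik / n_i.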

Frequently in practice, totals yi are known but not group-specific totals yik. When the yik are unobserved, as in EI applications, the above identity is replaced by a regression model:

$${{\text{p}}}_{{\text{i}}}={\uppi }_{{\text{i}}1}({{\text{n}}}_{{\text{i}}1}/{{\text{n}}}_{{\text{i}}})+{\uppi }_{{\text{i}}2}({{\text{n}}}_{{\text{i}}2}/{{\text{n}}}_{{\text{i}}})+\dots + {\uppi }_{{\text{iK}}}({{\text{n}}}_{{\text{iK}}}/{{\text{n}}}_{{\text{i}}}),$$

where the πik are unknown area-specific prevalence (or incidence) probability parameters. Commonly the known population shares nik/ni are written as xik = nik/ni, and the model is written as

$${{\text{p}}}_{{\text{i}}}={\uppi }_{{\text{i}}1}{{\text{x}}}_{{\text{i}}1}+{\uppi }_{{\text{i}}2}{{\text{x}}}_{{\text{i}}2}+\dots + {\uppi }_{{\text{iK}}}{{\text{x}}}_{{\text{iK}}}.$$
(1)

Assuming we take the yi as Poisson distributed with means μini, we can write

$${\upmu }_{{\text{i}}}={\uppi }_{{\text{i}}1}{{\text{x}}}_{{\text{i}}1}+{\uppi }_{{\text{i}}2}{{\text{x}}}_{{\text{i}}2}+\dots +{\uppi }_{{\text{iK}}}{{\text{x}}}_{{\text{iK}}},$$

with logit links to the model terms for the individual πik. Because the yik are unknown, and similarly the πik, framing models to achieve stable estimation is important.
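The need for stable estimation can be illustrated with a small sketch, under hypothetical values of the shares xik and the probabilities πik. It shows two different sets of group probabilities that imply exactly the same Poisson mean for an area, so the area total alone cannot distinguish between them.

```python
def area_mean(pi_ik, x_ik):
    """mu_i as the share-weighted sum of group probabilities pi_ik."""
    return sum(p * x for p, x in zip(pi_ik, x_ik))

# Hypothetical population shares for one area (K = 4), summing to one.
x_ik = [0.10, 0.10, 0.05, 0.75]
n_i = 1500

# Two different hypothetical sets of group-specific probabilities.
pi_a = [0.020, 0.015, 0.010, 0.010]
pi_b = [0.030, 0.005, 0.010, 0.010]

mu_a = area_mean(pi_a, x_ik)
mu_b = area_mean(pi_b, x_ik)

# Expected case totals n_i * mu_i are identical under both sets.
expected_a = n_i * mu_a
expected_b = n_i * mu_b
```

Information about the πik therefore has to come from variation in the shares across areas, together with the structure imposed by the model and priors.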

In that regard, Byers and Besag (2000) recommend that “Typically, it will be assumed that the πik follow a simple log-linear model, whose parameters are given appropriate prior distributions of their own. Some interactions can be included but, as the model becomes richer, so the dependence of the results on the prior is increased”. Weakened identifiability (and excess dependence on the prior) may occur, for example, if neighbourhood-group unknowns (e.g. group-specific spatial random effects) are specified in the model for the πik which are themselves unknowns. That is, additional unknowns are used in a model to predict unknowns. Analogous issues regarding potential unstable estimation – in modelling zero inflated count data – are discussed by Agarwal et al. (2002).

To represent spatial clustering in the health response, Byers and Besag (2000) instead adopt a simple model with a spatial error si following a CAR prior at the neighbourhood level (i.e. rather than spatial effects sik at the neighbourhood-group level as these may pose identifiability issues). Group-level intercepts αk are, however, included. Byers and Besag (2000) also include neighbourhood random effects ui without a spatial structure, to represent other sources of neighbourhood heterogeneity. This combination of a spatial and unstructured random effect is known as the spatial convolution model (Duncan et al., 2017). This leads to a model

$$\begin{array}{c}{\text{logit}}({\uppi }_{{\text{i}}1})={\mathrm{\alpha }}_{1}+{{\text{s}}}_{{\text{i}}}+{{\text{u}}}_{{\text{i}}},\\ {\text{logit}}({\uppi }_{{\text{i}}2})={\mathrm{\alpha }}_{2}+{{\text{s}}}_{{\text{i}}}+{{\text{u}}}_{{\text{i}}},\\ \begin{array}{c}\dots .\\ {\text{logit}}\left({\uppi }_{{\text{iK}}}\right)={\mathrm{\alpha }}_{{\text{K}}}+{{\text{s}}}_{{\text{i}}}+{{\text{u}}}_{{\text{i}}}.\end{array}\end{array}$$

Specifications Adopted in this Study: Contextual Effects and Spatial Clustering

As mentioned above, there is a potential for contextual effects such as the ethnic density effect, whereby area characteristics, such as the composition proportions xik, influence the πik. For example, the ethnic density effect might imply lower πik at higher levels of xik for non-white groups. Other, possibly nonlinear patterns of association are possible, such as relatively high psychosis rates at low ethnic concentrations, but flat or slowly declining risk above a threshold value of xik.

In the analysis here, and assuming a spatial random effects approach, we assume quadratic effects of xik on logit(πik) for the black, South Asian and mixed/other ethnic categories. So

$${\text{logit}}({\uppi }_{{\text{ik}}})={\mathrm{\alpha }}_{{\text{k}}}+{{\text{s}}}_{{\text{i}}}+{{\text{u}}}_{{\text{i}}}+{\upbeta }_{1{\text{k}}}{{\text{x}}}_{{\text{ik}}}+{\upbeta }_{2{\text{k}}}{{{\text{x}}}^{2}}_{{\text{ik}}},$$
(2.1)

for k = 1,2, and 3, with

$${\text{logit}}({\uppi }_{{\text{i}}4})={\mathrm{\alpha }}_{4}+{{\text{s}}}_{{\text{i}}}+{{\text{u}}}_{{\text{i}}},$$
(2.2)

for the white ethnic group.
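A minimal sketch of how the quadratic specification maps a population share to a prevalence probability is given below. The coefficient values are hypothetical, chosen only to show a rise-then-fall pattern, and are not the fitted estimates; the function names are likewise illustrative.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def group_prevalence(alpha, s=0.0, u=0.0, beta1=0.0, beta2=0.0, x=0.0):
    """pi_ik under Eq. (2.1); with beta1 = beta2 = 0 it reduces to Eq. (2.2)."""
    return inv_logit(alpha + s + u + beta1 * x + beta2 * x * x)

# Hypothetical intercept and contextual coefficients for one non-white group.
alpha_k, beta1_k, beta2_k = -4.5, 2.0, -4.0

# Prevalence at low, moderate and high population shares x_ik.
shares = [0.05, 0.25, 0.60]
pi_k = [group_prevalence(alpha_k, beta1=beta1_k, beta2=beta2_k, x=x) for x in shares]

# White group: intercept and area effects only (hypothetical intercept).
pi_white = group_prevalence(-4.6)
```

With β1k > 0 and β2k < 0 the linear predictor peaks at xik = −β1k/(2β2k), here 0.25, so risk rises at low concentrations and falls at higher ones.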

As mentioned above, alternative specifications of spatial dependence are possible, all aiming to represent unmeasured characteristics shared by neighbouring areas but not present in the included covariates (Ma et al., 2012). We accordingly compare the random effects convolution model in (2) with a spatial lag model, which uses a single fixed effect rather than neighbourhood specific random effects.

These specifications are compared in terms of their effectiveness in removing spatial correlation in residuals, and in terms of their complexity: the spatial random effects model involves twice as many random effects as the spatial lag model, and this additional complexity may affect fit.

Let Ni denote the neighbourhood of area i, the set of areas adjacent to area i. Then the spatial lag effect uses the spatial lag predictor

$${{{\text{p}}}^{{\text{L}}}}_{{\text{i}}}=100\sum\limits_{{\text{j}}\in {{\text{N}}}_{{\text{i}}}}{{\text{y}}}_{{\text{j}}}\Big/\sum\limits_{{\text{j}}\in {{\text{N}}}_{{\text{i}}}}{{\text{n}}}_{{\text{j}}},$$

namely the psychosis rate in neighbourhoods adjacent to neighbourhood i. The lag is expressed in terms of percent psychosis rates for improved scaling of regression coefficients.
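The lag predictor can be computed directly from an adjacency list. The sketch below uses four hypothetical areas arranged in a line; the area labels, counts and the function name are illustrative assumptions.

```python
def spatial_lag_rate(y, n, neighbours):
    """Percent rate pooled over each area's adjacent areas:
    100 * (sum of y_j) / (sum of n_j) for j in N_i."""
    lag = {}
    for area, adjacent in neighbours.items():
        cases = sum(y[j] for j in adjacent)
        population = sum(n[j] for j in adjacent)
        lag[area] = 100.0 * cases / population
    return lag

# Toy example: four hypothetical areas on a line, A-B-C-D.
y = {"A": 10, "B": 20, "C": 30, "D": 15}            # psychosis totals
n = {"A": 1000, "B": 1600, "C": 1500, "D": 1500}    # populations
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

p_lag = spatial_lag_rate(y, n, neighbours)
```

Note that the lag for area i excludes area i itself, and that an area with no neighbours (for example an island) would need separate handling, since the denominator would be zero.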

We adopt an extended spatial lag approach, allowing heteroscedasticity to be also linked to spatial concentration. Thus the variances ϕi of the unstructured random effects ui are also taken to depend on the spatially lagged prevalence, pLi. We also use a Student t (with low degrees of freedom, 4) rather than Normal density for the ui, because this increases flexibility in relation to some neighbourhoods which have unduly high psychosis levels.

Then the alternative spatial lag model is specified as

$${\text{logit}}({\uppi }_{{\text{ik}}})={\upbeta }_{0{\text{k}}}+{{\text{u}}}_{{\text{i}}}+{\upbeta }_{1{\text{k}}}{{\text{x}}}_{{\text{ik}}}+{\upbeta }_{2{\text{k}}}{{{\text{x}}}^{2}}_{{\text{ik}}}+{{\mathrm{\rho p}}^{{\text{L}}}}_{{\text{i}}},$$
(3.1)

for k = 1,2,3, and

$${\text{logit}}({\uppi }_{{\text{i}}4})={\upbeta }_{04}+{{\text{u}}}_{{\text{i}}}+{{\mathrm{\rho p}}^{{\text{L}}}}_{{\text{i}}},$$
(3.2)

while for the random effects we assume a spatial lag model for the variances:

$$\begin{array}{l}{{\text{u}}}_{{\text{i}}} \sim {{\text{t}}}_{4}(0,{\upphi }_{{\text{i}}}),\\ {\text{log}}({\upphi }_{{\text{i}}})={\gamma }_{1}+{\gamma }_{2}{{{\text{p}}}^{{\text{L}}}}_{{\text{i}}}\end{array}$$
(3.3)
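To make the variance model in (3.3) concrete, the sketch below computes ϕi from the lagged prevalence and draws heavy-tailed effects ui. It treats ϕi as the squared scale of a Student t with 4 degrees of freedom, which is an assumption, since the exact parameterisation depends on the software used. The values of γ1 and γ2 are hypothetical.

```python
import math
import random

def lag_dependent_variance(p_lag, gamma1, gamma2):
    """phi_i from log(phi_i) = gamma1 + gamma2 * p_lag_i."""
    return math.exp(gamma1 + gamma2 * p_lag)

def draw_t_effect(phi, df=4, rng=random):
    """One draw of u_i as sqrt(phi) times a standard Student t variate."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)   # chi-square(df) = Gamma(shape df/2, scale 2)
    return math.sqrt(phi) * z / math.sqrt(chi2 / df)

# Hypothetical hyperparameter values, for illustration only.
gamma1, gamma2 = -2.0, 0.5
p_lag_values = [0.5, 1.0, 2.0]   # lagged prevalence, in percent

phi = [lag_dependent_variance(p, gamma1, gamma2) for p in p_lag_values]

rng = random.Random(1)           # fixed seed so the sketch is reproducible
u = [draw_t_effect(v, rng=rng) for v in phi]
```

With γ2 > 0, areas surrounded by higher prevalence receive a larger ϕi and therefore more dispersed unstructured effects.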

Estimation and Model Assessment

We assess the performance of these alternative specifications under a Bayesian inference approach, using Markov chain Monte Carlo (MCMC) sampling implemented in the WinBUGS program (Lunn et al., 2000). The estimation can be adapted to R using the nimble or R2OpenBUGS packages. Moran’s I values are obtained using the R function moran.test, as applied to standardized residuals (yi-niμi)/sqrt(niμi).
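For readers who want to see the calculation itself, the following sketch computes standardized residuals and Moran’s I from first principles. It assumes binary adjacency weights, which may differ from the weighting scheme used in the analysis, and the toy data are hypothetical.

```python
import math

def standardized_residuals(y, n, mu):
    """(y_i - n_i * mu_i) / sqrt(n_i * mu_i) for each area."""
    return [(yi - ni * mi) / math.sqrt(ni * mi) for yi, ni, mi in zip(y, n, mu)]

def morans_i(values, neighbours):
    """Moran's I with binary weights; neighbours[i] lists the indices adjacent to i."""
    count = len(values)
    mean = sum(values) / count
    z = [v - mean for v in values]
    total_weight = sum(len(adj) for adj in neighbours)
    cross = sum(z[i] * z[j] for i, adj in enumerate(neighbours) for j in adj)
    return (count / total_weight) * cross / sum(zi * zi for zi in z)

# Toy example: four hypothetical areas on a line (0-1-2-3).
neighbours = [[1], [0, 2], [1, 3], [2]]
i_trend = morans_i([1.0, 2.0, 3.0, 4.0], neighbours)          # smooth gradient
i_alternating = morans_i([1.0, -1.0, 1.0, -1.0], neighbours)  # checkerboard pattern

# Residuals for hypothetical counts, populations and fitted rates.
resid = standardized_residuals([12, 18, 20, 30],
                               [1000, 1500, 1400, 1600],
                               [0.012, 0.012, 0.015, 0.018])

# Expected value of Moran's I under no spatial correlation, for 32,844 areas.
null_expectation = -1.0 / (32844 - 1)
```

Positive values indicate that similar residuals cluster together, while values near the null expectation −1/(N−1) indicate little remaining spatial correlation.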

Fit and complexity are assessed using the widely applicable information criterion (WAIC) (Watanabe, 2013), which is lower for better fit. To assess predictive fit we also consider deviations between actual totals yi and predicted totals yi,new sampled from the posterior predictive density p(ynew|y). Thus we take the sum of squared deviations, D1 = \(\sum\limits_{\mathrm i}(y_{i}-y_{i,\text{new}})^{2}\), and the sum of absolute deviations, D2 = \(\sum\limits_{\mathrm i}|y_{i}-y_{i,\text{new}}|\). Also considered are total absolute deviations between actual observations and Poisson means, D3 = \(\sum\limits_{\mathrm i}|y_{i}-n_{i}\mu_{i}|\).
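These criteria can be sketched as follows, assuming a matrix of pointwise Poisson log-likelihoods over posterior draws is available. The function names and the tiny data set are hypothetical. The WAIC is written in the common form −2(lppd − pWAIC), with pWAIC taken as the posterior variance of the pointwise log-likelihood.

```python
import math
import statistics

def poisson_loglik(y, mean):
    """Log-likelihood of count y under a Poisson distribution with the given mean."""
    return y * math.log(mean) - mean - math.lgamma(y + 1)

def waic(loglik):
    """WAIC from loglik[s][i]: log-likelihood of observation i under draw s."""
    n_draws = len(loglik)
    total = 0.0
    for i in range(len(loglik[0])):
        column = [loglik[s][i] for s in range(n_draws)]
        top = max(column)
        # Log of the posterior mean likelihood, via log-sum-exp for stability.
        lppd_i = top + math.log(sum(math.exp(v - top) for v in column) / n_draws)
        p_waic_i = statistics.variance(column)   # complexity contribution
        total += lppd_i - p_waic_i
    return -2.0 * total

def fit_measures(y, y_new, expected):
    """D1, D2 against one replicate y_new; D3 against Poisson means n_i * mu_i."""
    d1 = sum((a - b) ** 2 for a, b in zip(y, y_new))
    d2 = sum(abs(a - b) for a, b in zip(y, y_new))
    d3 = sum(abs(a - m) for a, m in zip(y, expected))
    return d1, d2, d3

# Tiny hypothetical example with two areas.
y = [3, 5]
y_new = [4, 3]              # one posterior predictive replicate
expected = [2.5, 5.5]       # Poisson means n_i * mu_i
d1, d2, d3 = fit_measures(y, y_new, expected)

# Three identical "draws", so the complexity penalty is zero in this toy case.
loglik = [[poisson_loglik(a, m) for a, m in zip(y, expected)] for _ in range(3)]
waic_value = waic(loglik)
```

In an MCMC setting, D1 and D2 would typically be evaluated at each iteration against a fresh replicate and then summarised over iterations.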

Fixed effects (the regression coefficients) are assigned N(0,100) priors. Estimates of hyperparameters, and of the πik, are based on iterations 10,000–20,000 of two chain runs, with convergence assessed using the criteria in Brooks and Gelman (1998).
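As a rough illustration of convergence checking across chains, the sketch below computes the basic potential scale reduction factor for one scalar parameter. This is the simple Gelman-Rubin form, not necessarily the exact set of criteria applied in the analysis, and the chains shown are made up.

```python
import math
import statistics

def potential_scale_reduction(chains):
    """Basic Gelman-Rubin statistic for equal-length chains of one scalar parameter."""
    length = len(chains[0])
    # Average within-chain variance.
    within = statistics.mean([statistics.variance(c) for c in chains])
    # Between-chain variance, scaled by chain length.
    between = length * statistics.variance([statistics.mean(c) for c in chains])
    pooled = (length - 1) / length * within + between / length
    return math.sqrt(pooled / within)

# Two hypothetical chains exploring the same region: value close to (or below) 1.
r_mixed = potential_scale_reduction([[0.1, 0.3, 0.2, 0.4], [0.2, 0.4, 0.1, 0.3]])

# Two hypothetical chains stuck in different regions: value far above 1.
r_stuck = potential_scale_reduction([[0, 1, 2, 3], [10, 11, 12, 13]])
```

Values close to 1 suggest that the chains are sampling from the same distribution; values well above 1 indicate that longer runs or reparameterisation may be needed.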

An important additional output is the set of population-weighted psychosis rates by ethnic group, namely

$${{\text{P}}}_{{\text{k}}}= \sum\limits_{\mathrm i}{\uppi }_{{\text{ik}}} \;{{\text{n}}}_{{\text{ik}}}/\sum\limits_{\mathrm i} {{\text{n}}}_{{\text{ik}}}.$$

Of interest in many studies (e.g. Qassem et al., 2015) are relative risks comparing psychosis rates for non-white groups with those for whites, namely

$$\begin{array}{cc}{{\text{RR}}}_{{\text{k}}}={{\text{P}}}_{{\text{k}}}/{{\text{P}}}_{4},& {\text{k}}=1,2,3.\end{array}$$
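The aggregation from neighbourhood probabilities to group rates and relative risks can be sketched as below, using two hypothetical areas and made-up values of πik and nik. In a Bayesian analysis the same calculation would be repeated at each MCMC iteration to obtain posterior summaries.

```python
def weighted_rates(pi, n):
    """P_k = sum_i pi[i][k] * n[i][k] / sum_i n[i][k], for each group k."""
    n_groups = len(n[0])
    rates = []
    for k in range(n_groups):
        cases = sum(pi_i[k] * n_i[k] for pi_i, n_i in zip(pi, n))
        population = sum(n_i[k] for n_i in n)
        rates.append(cases / population)
    return rates

def relative_risks(rates, reference=3):
    """RR_k = P_k / P_reference for every non-reference group."""
    return [r / rates[reference] for k, r in enumerate(rates) if k != reference]

# Hypothetical values for two areas and K = 4 groups (group 4 = white).
n = [[100, 200, 100, 1100],
     [300, 100, 100, 1000]]
pi = [[0.020, 0.012, 0.010, 0.008],
      [0.016, 0.012, 0.010, 0.008]]

P = weighted_rates(pi, n)
RR = relative_risks(P)
```

Because the weights are the group populations nik, areas where a group is more numerous contribute more to that group's overall rate.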

Geographic variations can also be monitored, such as psychosis totals by region and ethnicity; there are nine standard regions subdividing England. LSOAs are also nested within local authorities, local government agencies typically with populations of around a quarter of a million. There are 317 such authorities, and because of the markedly uneven spatial distribution of non-white populations across England, one may anticipate wide disparities in the ethnic composition of their psychosis case load.

Results

Comparisons of Fit

Table 1 shows that the spatial lag model of Eq. (3) has better performance than the spatial convolution model (2). It has better fit (a WAIC of 166,415 compared to 170,188 under the spatial convolution model), and less complexity (a lower effective parameter count, less than half that of the spatial convolution model).

Table 1 Model Comparison, Spatial Errors vs Spatial Lag

The spatial lag model also provides better fit according to the criteria D1, D2 and D3, and is much more effective in eliminating spatial correlation in residuals. The Moran’s I for the original prevalence rates yi/ni is 0.77. The spatial errors model (2) provides a 95% interval for Moran’s I, applied to standardized residuals, of (0.161,0.171). By contrast, the analogous interval for the spatial lag model in (3), namely (-0.025,-0.015), is close to the null Moran’s I value of -0.00003. We therefore focus in the following sections on the results from the spatial lag model regarding neighbourhood ethnic inequalities in psychosis prevalence.

Appendix Table 5 sets out parameter estimates for the spatial lag model. A coefficient is judged as significant if its 95% interval is entirely positive or entirely negative. This table shows a significant positive γ2 coefficient in (3.3), so that the variance of the heterogeneity term increases as neighbourhood levels of psychosis increase. The ρ lag coefficient in (3.1) and (3.2) is also significantly positive, with mean and 95% interval ρ = 0.873 (0.864,0.884), in line with a prevalence “feedback” effect across arbitrary administrative boundaries. Finally, the significant β1 and β2 coefficients imply non-linearity in contextual associations; absence of contextual effects would show in zero β1 and β2 coefficients.
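Because the lag predictor is measured in percentage points and enters on the logit scale, the reported ρ can be read as a multiplicative effect on the odds. The short calculation below is an interpretive sketch using only the values quoted above.

```python
import math

# Posterior mean and 95% interval for rho, as reported in the text.
rho_mean, rho_low, rho_high = 0.873, 0.864, 0.884

# Multiplicative change in the odds of psychosis for a one-percentage-point
# rise in the prevalence of adjacent neighbourhoods, holding other terms fixed.
odds_ratio = math.exp(rho_mean)
odds_ratio_interval = (math.exp(rho_low), math.exp(rho_high))
```

On this reading, a one-point rise in adjacent prevalence is associated with roughly a 2.4-fold increase in the odds. Exponentiating the interval limits is valid because the transformation is monotone, whereas the exponentiated posterior mean is only an approximation to the posterior mean of exp(ρ).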

Estimates of Neighbourhood Psychosis Prevalence by Ethnic Group

Table 2 shows summary statistics for the estimated psychosis rates by ethnic group, both England-wide and by standard region. Table 3 summarizes the regional distribution of psychosis rates by ethnicity.

Table 2 Psychosis Rates by Ethnic Group
Table 3 Regional Disaggregation of Psychosis Cases

It can be seen from Table 2 that black and South Asian groups account for a disproportionate share of total cases: their share of cases (16.4% combined) is higher than their share of population (11.4%). These groups have significantly elevated relative risks, comparing their rates to white rates, respectively 1.73 for black groups and 1.40 for South Asians.

However, regional disparities are also apparent: the highest rates for psychosis among black and South Asian groups are in London and in North West England; these regions also have the highest all-population rates (rates for all ethnic groups combined), namely 1.31% for London and 1.10% for the North West.

The share of cases accounted for by black and South Asian groups varies more markedly at lower geographic scales. This is relevant to assessing the health burden associated with psychosis, and sub-national resourcing for mental health. Thus, Qassem et al. (2015) mention that “whatever the mechanisms, the increased prevalence [of higher ethnic psychosis in some areas] bears on the equitable funding of services: areas with greater numbers of people from these ethnic minorities will merit higher levels of funding”.

Table 4 extracts the five highest and five lowest percentages of total psychosis cases from black and South Asian groups across English local authorities. The highest percentages are for three authorities in east London, for Leicester in the East Midlands, and for Slough, a town just to the west of London.

Table 4 Varying Ethnic Mix of Psychosis Cases by Local Authority

Also of relevance for public health intervention is the varying spatial extent of ethnic inequalities. The most pronounced ethnic inequalities in psychosis are not necessarily in the areas with the highest ethnic densities – a reflection of the ethnic density effect discussed below. Higher inequalities may tend to be associated more with ethnic group isolation.

Accordingly Fig. 1 maps out, for local authorities, the average relative risks comparing psychosis rates in black and South Asian groups (combined) with white groups. The highest relative risks are in less urbanized regions in East Anglia and central England, with generally low ethnic densities. At regional level, the lowest relative risk (1.28) comparing black and South Asian groups with whites is in London, the region with the largest proportion of black and South Asian residents, while the highest (1.38) is in the East of England.

Fig. 1 Psychosis Relative Risk, England Local Authorities, Black & South Asian (combined) vs White Ethnicity

Contextual Effects

The most cited contextual effect in relation to ethnic prevalence variations is the ethnic density effect: that psychosis rates will be lower among non-white ethnic groups when such groups are at higher concentrations (e.g. population shares). Such a protective effect would also imply that relative risks comparing psychosis rates among non-white as against whites would be less marked in neighbourhoods with higher non-white concentrations.

Figure 2 shows the relationship (at LSOA level) between the relative risks, as mapped out in Fig. 1, and the proportion of neighbourhood populations in black or South Asian groups. A considerably lowered relative risk at higher ethnic concentrations is clearly apparent. This suggests another contextual expression of the ethnic density effect, namely a convergence of non-white and white ethnic psychosis rates at higher non-white population concentrations.

Fig. 2 Relative Risk vs Concentration

Figure 3 plots the psychosis rate among black or South Asian groups against the population concentration of these groups; this plot is based on a first order random walk model in the R package INLA. The density effect shows here as a descending risk at higher concentrations, contrasting with ascending risk at relatively low concentrations.

Fig. 3 Psychosis Rate vs Concentration

Discussion

Tackling ethnic inequality in prevalence and incidence of serious mental illness, and inequality in access to suitable care, is of major importance in strategies to improve health equity and reduce the burden of serious mental illness. These inequalities extend to physical illness and excess early mortality (Katsampa et al., 2021). With regard to mental illness in the UK, Lloyd (1998) mentions that “the relative prevalence and treatment of mental illness among different ethnic groups in Britain is probably one of the most controversial issues in the field of health variations”.

These inequalities – relating both to prevalence and treatment – affect other developed societies. With reference to the US, McGuire and Miranda (2008) mention that “mental health care disparities, defined as unfair differences in access to or quality of care according to race and ethnicity, are quite common in mental health. Although some studies question this consensus, the weight of the evidence supports the existence of serious and persistent mental health care disparities”. Coleman et al. (2016) mention that “non-Hispanic blacks, who were still less likely than whites to receive a medication for their schizophrenia even though they were nearly twice as likely as whites to receive this diagnosis”.

Relevance of The Present Study to Health Care Strategy

Ethnic inequalities in psychosis are a substantial source of overall neighbourhood variations in this condition. Geographically targeted interventions to tackle ethnic inequalities in mental illnesses such as psychosis have, however, been impeded by lack of suitably geographically disaggregated profiles of such inequalities. The review by Public Health England (2018) states in this regard that “information on health needs by ethnic group is often inadequate at local level”.

The analysis here has demonstrated the feasibility of a neighbourhood profiling analysis of ethnic contrasts in psychosis rates, from which aggregation to higher geographic scales can be made. The approach readily extends to other conditions where ethnic disparities are a central feature, examples including diabetes. For some conditions, including mental health and related outcomes, white groups have higher risks. Examples include suicide and self-harm (Alothman et al., 2022).

As well as providing evidence of wide neighbourhood variations in serious mental illnesses, this study has highlighted considerable geographic variation in relativities between population sub-groups. These are important for localized profiling and targeting of inequalities by public health agencies, as ethnic inequalities may involve disease areas other than mental ill-health, and so point to systematic health inequalities between ethnic groups. This finding regarding psychosis replicates broader evidence: Public Health England (2018, p. 32) report that “the extent of inequality between ethnic minorities and the White British population varies greatly between areas”.

In summary, two major applications of ecological inference relevant to public health prioritisation have been highlighted. The first is in establishing the location of neighbourhoods with high numbers of psychosis patients in non-white ethnic groups. The second is detecting those locations where ethnic inequalities in psychosis are most pronounced. Regions with highest concentrations of ethnic minority psychosis patients are not necessarily the same as those with highest ethnic inequality. This is likely to be because of contextual effects, in particular the protective ethnic density effect.

Geographic Scope

Several existing UK and international studies provide evidence on contextual effects on mental illness, and on the excess prevalence of psychosis among black and South Asian groups in particular. However, the current study is distinctive in providing a national perspective on variations in neighbourhood psychosis by ethnicity, for one of the UK nations. A small area definition (using LSOAs) has been adopted, with the benefit of internal social and demographic homogeneity (Pinzari et al., 2018).

By contrast, existing UK studies, albeit at small area level, have been geographically confined, for example to a single health district or clinical catchment area. Many UK studies into psychosis outcomes (most commonly of incidence rather than prevalence) which have included a neighbourhood perspective have had a focus within London.

Comparability of Ethnic Relativities

It is important that findings from ecological inference studies have substantive plausibility, based on comparison to independent evidence (Altman et al., 2004). The findings of the present study are consistent with other studies regarding ethnic variations in psychosis in the UK and developed societies more generally. Thus Qassem et al. (2015) report that “higher rates of psychosis in ethnic minorities in general, and in black ethnic groups in particular, have been consistently replicated, and are almost universal in western industrialized countries”.

The England-wide excess relative risk for black groups in the current study, around 1.73, is lower than reported by Qassem et al. (2015) – they report an odds ratio comparing blacks to whites of 2.7 (albeit with a wide confidence interval). The excess relative risk found here is comparable to that reported by the Fourth National Survey of Ethnic Minorities (Nazroo & King, 2002, page 47). Nazroo (1998) notes that relative risks for prevalence – as against incidence – may be lower if ethnic groups differ in the length of the disease course.

Psychosis risk is generally reported as highest for black groups, though Jongsma et al. (2021) and Kirkbride et al. (2012) find elevated risk for Asian ethnic groups. However, the excess risk among South Asians may be less than for other major ethnic groups. Bhavsar et al. (2021) report that “South Asians have lower rates of psychosis than other minority ethnic groups in the UK”. Here we find a relative risk for South Asian groups of 1.4, as compared to white groups.

Existing evidence for mixed ethnicity is less complete, certainly on a national scale. The official site reporting the most recent APMS data, NHS Digital (2016), provides no estimate for mixed ethnic groups. The study by Oduola et al. (2021) covering two London boroughs reports first episode psychosis (FEP) incidence among mixed groups slightly higher than for whites, particularly among males. They also report a decline in FEP incidence for mixed ethnicities. Here we find broadly comparable prevalence for mixed as against white ethnicity.

Methodological Contribution

Regarding contextual density effects, quite striking patterns of association have been found in this study, such as the declining relative risk of non-white to white psychosis as non-white population concentration increases. This suggests an additional contextual expression of the ethnic density effect, namely a convergence of non-white and white ethnic psychosis rates at high non-white ethnic concentrations.

Among the methodological issues that have also received attention in this study is spatial dependence in psychosis rates. It is important that any method to assess psychosis variations between population sub-groups reflects this feature, which shows in a highly elevated Moran’s I for the original psychosis rates (point estimates).
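As an illustration of how such spatial dependence can be summarised, the following minimal sketch computes a global Moran’s I for a set of area rates. The binary adjacency weights, the chain layout of areas, and the rate values are all hypothetical and chosen only for illustration; they are not the weights or data used in the present study.

```python
def morans_i(values, weights):
    """Global Moran's I for area values and an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of all spatial weights (S0 in the usual notation).
    s0 = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between pairs of areas.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den


def chain_adjacency(n):
    """Hypothetical layout: areas on a line, each adjacent to its neighbours."""
    w = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = w[i + 1][i] = 1
    return w


# Illustrative (made-up) rates per 1000: one clustered, one alternating pattern.
clustered = [1, 2, 3, 4, 5, 6, 7, 8]
alternating = [1, 8, 1, 8, 1, 8, 1, 8]
w = chain_adjacency(8)

print("clustered:", round(morans_i(clustered, w), 3))
print("alternating:", round(morans_i(alternating, w), 3))
```

Under these assumptions, rates that rise smoothly across adjacent areas give a clearly positive index, while rates that alternate between neighbours give a negative one. In practice the weights would be derived from actual LSOA boundaries.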

While the conditional autoregressive (CAR) spatial errors model has been the default option in Bayesian disease mapping of neighbourhood health outcomes (e.g. Best et al., 2005), the present study has shown the benefit of considering alternative representations of spatial dependence. For example, as well as showing a spatial lag effect relevant to prevalence variations, the current study has shown significant spatial heteroscedasticity linked to spatial clustering.

Study Limitations

Among the limitations of the present study, as of all ecological inference, is the potential dependence of the prevalence estimates on the geographic framework and modelling assumptions. As in all geographic studies, there may be scale dependence: findings for LSOAs (with an average population of 1500) may not be entirely replicated in a study at middle layer super output area (MSOA) level, areas with an average population of 8200 (Ifcher et al., 2019). Regarding modelling assumptions, specific model features, such as the inclusion of a quadratic contextual effects regression, may affect prevalence estimates even where they improve fit. A sensitivity analysis over different modelling assumptions is a useful and recommended feature of ecological inference studies (Glynn & Wakefield, 2010). Ecological inference, in common with disease mapping studies generally, may also be subject to spatial confounding effects, with correlation between random spatial errors and fixed effects (here intercepts and coefficients in the contextual regression) (Hodges & Reich, 2010).

A research agenda to evaluate approaches such as that used in the current study would likely include assessment against simulated neighbourhood ethnic-specific psychosis data, that is, the ability to reproduce simulated y_ik (Freedman et al., 1998). Simulation is most sensible as an assessment check when it incorporates realistic features that match the intended application; for example, similar small area population sizes, an allowance for spatial clustering, similar overall event rates, similar event rates by age and so on.
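A minimal sketch of such a simulation check is given below. It takes y_ik to be the case count for area i and ethnic group k (an assumption about the notation), and generates group-specific counts with a spatially clustered area effect, retaining only the area totals as the "observed" data. All names and values here are hypothetical: the chain layout of areas, the two-group split, the baseline rates, and the moving-average smoothing used to induce clustering are illustrative choices, not the model of the present study.

```python
import math
import random


def simulate_area_group_counts(n_areas=50, area_pop=1500,
                               base_rates=(0.004, 0.007), seed=1):
    """Simulate hypothetical counts y_ik for areas i and two groups k."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 0.5) for _ in range(n_areas)]
    # Induce spatial clustering: average each area's noise with that of
    # its neighbours on a line (an illustrative stand-in for a spatial prior).
    effect = []
    for i in range(n_areas):
        window = noise[max(0, i - 1): i + 2]
        effect.append(sum(window) / len(window))
    rows = []
    for i in range(n_areas):
        share = rng.uniform(0.05, 0.6)       # share of group 1 in area i
        pop1 = round(area_pop * share)
        pops = [area_pop - pop1, pop1]
        counts = []
        for k, pop in enumerate(pops):
            # Group-specific rate scaled by the shared area effect.
            p = min(1.0, base_rates[k] * math.exp(effect[i]))
            # Binomial draw as a sum of Bernoulli trials.
            counts.append(sum(rng.random() < p for _ in range(pop)))
        rows.append({"pops": pops, "y_ik": counts, "y_i": sum(counts)})
    return rows


def mean_abs_error(rows, estimates):
    """Average absolute gap between estimated and simulated y_ik."""
    gaps = [abs(e - y) for row, est in zip(rows, estimates)
            for e, y in zip(est, row["y_ik"])]
    return sum(gaps) / len(gaps)


rows = simulate_area_group_counts()
# Naive benchmark: split each observed total y_i in proportion to population.
naive = [[r["y_i"] * p / sum(r["pops"]) for p in r["pops"]] for r in rows]
print("naive allocation error:", round(mean_abs_error(rows, naive), 3))
```

An ecological inference method would be given only the populations and the totals y_i, and its estimates of y_ik could then be scored against the simulated values with the same error function. The proportional split serves only as a simple benchmark that ignores group differences in risk.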

Full assessment would, however, ideally use observed data. The neighbourhood data on psychosis available in the UK is more extensive than in many countries, but still partial in that no disaggregation by demographic attribute is available. A complete assessment of ecological inference across an extensive geographic scale might be possible for countries with more detailed health registers, including both small neighbourhood and demographic attribute information – so enabling multilevel analysis (e.g. the Danish register used by Pedersen et al., 2022). This assessment would enable comparison of the contextual effects detected by ecological inference against those detected by multilevel analysis.