Introduction

Policing analyses and intelligence-led policing are moving towards the study of small geographic areas, or micro places, to develop place-based policing strategies to reduce crime and disorder (Hutt et al. 2018; Weisburd 2018). Place-based policing draws from the empirical observation that crime is concentrated at micro geographical units, which are sometimes referred to as ‘hot spots of crime’ (Weisburd 2015, 2018). Sherman et al. (1989) found that only 3.5% of addresses in the city of Minneapolis produced 50% of all annual crime calls to the police. Pierce et al. (1988) found similar results in Boston: 2.6% of addresses produced 50% of police calls. Weisburd et al. (2004) examined the distribution of crime in Seattle from 1989 to 2002, and found that 50% of crimes were located at 4.5% of street segments, which showed that the concentration of crimes in small areas is stable across time. Therefore, Weisburd (2015) argues that there is a law of crime concentration, which states that “for a defined measure of crime at a specific microgeographic unit, the concentration of crime will fall within a narrow bandwidth of percentages for a defined cumulative proportion of crime” (Weisburd 2015:138). Place-based policing interventions target those areas with high levels of crime and are successful in reducing crime and disorder, as shown by Braga et al. (2014) in their meta-analysis of quasi-experimental evaluations of hot spots policing. Braga et al. (2014) also found that the crime control benefits of such strategies diffuse into areas surrounding targeted places. This shows the need for the study of small areas in policing research and practice. However, police effectiveness in reducing crime in places depends heavily on the relationship between the police and the public (Bennett et al. 2014; Jackson et al. 2013; Tyler and Bies 1990; Weisburd 2018). Areas with higher confidence in police work tend to show greater citizen cooperation with the police, which enhances police capacity to prevent crime and deviance. Moreover, government inspections of police forces not only assess their effectiveness in reducing crime, but also expect the police to develop programs to enhance their legitimacy and public confidence in those geographical areas where public cooperation with police services is lower (HMICFRS 2017). Confidence in police work also varies across micro places (Williams et al. 2019), and thus should be taken into account when designing place-based policing strategies.

Police-recorded offences and crime calls are relatively easy to geocode and map, and advanced geographical analyses can be drawn from crime maps with a high level of spatial accuracy (Hutt et al. 2018). However, the confidence in policing cannot be directly observed and is mainly recorded by crime surveys, such as the Crime Survey for England and Wales (CSEW) and the National Crime Victimization Survey (NCVS). Crime surveys are usually designed to record large samples and provide reliable direct estimates only for large geographies, such as regions or cities, and small areas within these are usually unplanned domains and have small or zero sample sizes. This is the reason why more advanced statistical methods are needed to map the confidence in police work. Groves and Cork (2008) argue that model-based small area estimation (SAE) techniques are a potential tool to overcome such limitations and produce reliable small area estimates from crime surveys. SAE seeks to produce reliable estimates for unplanned areas where direct estimates are not precise enough (Rao and Molina 2015). Those estimates allow for advanced geographical analyses and precise maps of the confidence in policing and associated constructs.

In this paper we provide background information, a simulation study and an application to introduce model-based SAE techniques that account for spatially correlated random area effects to place-based policing. This is one of the first papers that evaluates and applies these methods in policing research and practice. Confidence in police work tends to show high levels of spatial clustering (Jackson et al. 2013; Williams et al. 2019), which can be taken into account in SAE models to increase the estimates’ precision. In SAE, spatially correlated random area effects are increasingly used (Chandra et al. 2007; Petrucci and Salvati 2006; Pratesi and Salvati 2008; Salvati et al. 2014). Small area estimators that incorporate the spatial autocorrelation parameter have been shown to reduce the estimates’ mean squared error when the level of spatial autocorrelation (henceforth ρ) is large. ρ measures the correlation of a variable with itself across neighboring areas. Thus, a large positive ρ means that geographically nearby areas tend to have similar values (i.e. high values of a variable in one area are surrounded by high values in neighboring areas and low values of a variable in one area are surrounded by low values in neighboring areas), while a ρ close to zero represents a geographically random phenomenon. Specifically, this paper introduces the Spatial Empirical Best Linear Unbiased Predictor (SEBLUP) to place-based policing. The SEBLUP is an extension of the Empirical Best Linear Unbiased Predictor (EBLUP), which is based on the Fay-Herriot (FH) model (Fay and Herriot 1979), that considers correlated random area effects between neighboring areas through the simultaneous autoregressive (SAR) process (Cressie 1993; Salvati 2004).

The level of ρ of the variable of interest has been shown to be relevant to improving SEBLUP estimates. Less attention has been paid to the effect of the number of areas under study, m, on the SEBLUP’s performance, and particularly to how m interacts with ρ to explain the SEBLUP’s increased precision. m measures the number of geographical areas for which we aim to produce estimates. For example, confidence in police work can be estimated in London at a metropolitan (m = 1), borough (m = 32) or ward level (m = 610), or even at lower geographical scales with a larger number of areas. This is especially relevant for crime analysts and police departments aiming to select appropriate methods to estimate confidence in police work at different geographical scales with different numbers of areas. There are few studies examining the efficiency of the SEBLUP under different geographical conditions and these show contradictory results (Asfar and Sadik 2016; Petrucci and Salvati 2006; Pratesi and Salvati 2008; Salvati 2004). Thus, further examinations and applications of the method are needed.

This paper assesses the SEBLUP’s performance, in terms of bias and mean squared error, under different scenarios with unequal m and ρ, and provides an empirical evaluation and application to confidence in police work in London. Confidence in policing is measured here by the proportion of people who think that the police do a good job (Stanko and Bradford 2009). Thus, we gain evidence about the reliability of SEBLUP estimates under different conditions, to examine the cases in which this estimator provides better estimates than basic model-based estimators when applied to policing data. In the simulation study, quality measures for SEBLUP estimates are compared to those of post-stratified and EBLUP estimates, controlling for m and ρ. In the empirical evaluation, estimates of confidence in police work are produced at ward level in five London sub-regions with different numbers of wards. Furthermore, the application contributes to the increasing criminological research on understanding the geographical distribution of citizens’ confidence in the police (Jackson and Bradford 2010; Jackson et al. 2013; Tankebe 2012).

Section 2 provides background information on the need for accounting for the confidence in police work in policing strategies, and section 3 bridges the gap between SAE and place-based policing. Section 4 describes the SEBLUP and results of previous studies. Section 5 presents the simulation study and its results. Section 6 applies SEBLUP to produce estimates of confidence in police work in London. Section 7 draws final conclusions.

Confidence in the Police and Policing Strategies

Police effectiveness in maintaining order and preventing crime depends on the relationship between the police and the public (Jackson and Bradford 2010; Jackson et al. 2013). Citizens’ willingness to cooperate with and support police officers is essential for an effective policing service, and public cooperation with the police is shaped by citizens’ trust in police work (Bennett et al. 2014; Tyler 2004). Residents’ confidence in police services, which varies between neighborhoods, contributes to the unequal capacity of the police to prevent crime in different areas. Thus, effective policing strategies need to develop measures to enhance public confidence in police work, and inspections into police forces assess the efforts made by the police to increase public confidence in different geographical areas (HMICFRS 2017). This is especially important in the case of place-based policing strategies, which have been criticised for having negative impacts on the perceptions about the police of targeted communities (Rosenbaum 2006).

Confidence in policing and police legitimacy are known to be driven by a series of demographic and social variables that operate at individual, micro and meso levels, and a growing body of research focuses on understanding their predictors at different scales. Several individual characteristics have been associated with decreased confidence in police work and less willingness to cooperate with the police, such as being male and young, belonging to an ethnic minority, low education, poverty, negative perceptions of procedural justice and negative experiences with the police (Jackson et al. 2013; Sampson and Bartusch 1998; Tankebe 2012; Tyler 2004). Particular attention has been given to the study of the relationship between procedural justice and public confidence in police: citizens tend to be more confident in police services and legitimize police activities when police officers are perceived to treat people with respect and dignity (Tyler 2004; Tyler and Bies 1990).

Research has also found that confidence in policing is higher in certain neighborhoods than others, and confidence and trust in the police are known to be influenced by neighborhood-level variables that operate at the scales of small communities (Jackson et al. 2013; Sampson and Bartusch 1998). Some of the variables used to explain the unequal distribution of residents’ confidence in police work and associated constructs are the average income, unemployment rates, social cohesion, residential mobility, concentration of minorities and immigrants, and crime rates (Bradford et al. 2017; Dai and Johnson 2009; Jackson et al. 2013; Kwak and McNeeley 2017; Sampson and Bartusch 1998; Wu et al. 2009). Wu et al. (2009:150) argue that “racial composition, concentrated disadvantage, residential mobility, and violence crime rate are all good neighborhood-level predictors in determining public perception of police”. Sampson and Bartusch (1998) found that the combined effect of concentrated disadvantage, crime and ethnic concentration explains 82% of the variation between small areas in levels of satisfaction with police. Neighborhood poverty and unemployment, as forms of concentrated disadvantage, are known to shape residents’ social identities and to worsen citizens’ attitudes towards and perceptions of policing services (Wu et al. 2009). Confidence in police work tends to be lower in deprived areas, while wealthy neighborhoods have more confidence in the police. While some argue that this is due to the larger police control and the more violent techniques used by the police in deprived areas (Dai and Johnson 2009), others argue that it is explained by differential social identities within cities: “residents of more socially integrated neighborhoods may feel they are connected to larger formal institutions such as the police” (Kwak and McNeeley 2017:10). People living in poor socioeconomic conditions are likely to be dissatisfied not only with the police, but with all government services (Dai and Johnson 2009).

The concentration of minorities and immigrants has also been used to explain neighborhood-level confidence in policing. Areas with larger concentrations of minorities and immigrants are likely to have lower confidence in police work (Sampson and Bartusch 1998; Wu et al. 2009), although research conducted in the United Kingdom has found the opposite: “trust in the police was on average higher among immigrants to the United Kingdom than among the UK-born population” (Bradford et al. 2017:381). Dai and Johnson (2009) argue that the relationship between concentration of minorities and dissatisfaction with the police in the US is likely to be explained by neighborhood concentrated disadvantage, as citizens from minority groups are disproportionately represented in deprived areas. In relation to crime rates, Kwak and McNeeley (2017) and Wu et al. (2009) found that, contrary to what one might expect, these are not significant in predicting confidence in policing and dissatisfaction with the police. We will use this information to select covariates to fit our SAE models of confidence in policing.

Small Area Estimation in Place-Based Policing

Since 2008, when the US Panel to Review the Programs of the Bureau of Justice Statistics suggested the use of model-based SAE to produce estimates from the NCVS (Groves and Cork 2008), there have been several applications of SAE methods to policing data. Buelens and Benschop (2009) used the EBLUP based on the FH model to produce estimates of the victimization rate per police zone in the Netherlands. Fay and Diallo (2012) presented an extension of the temporal model developed by Rao and Yu (1994) and applied it to estimate crime by state in the US. Whitworth (2012) produced regression-based synthetic estimates of fear of crime in England and Wales. Taylor (2013) made use of multilevel models to produce synthetic estimates of perceived antisocial behaviour in England and Wales. Williams et al. (2019) introduced spatially correlated random area effects and produced neighborhood estimates of public confidence in policing from a spatiotemporal Bayesian approach. Wheeler et al. (2017) made use of spatial models to produce synthetic estimates of attitudes towards the police. Regression-based synthetic estimates, however, are known to suffer from a high risk of bias arising from possible misspecification of models (Rao and Molina 2015). Spatial microsimulation approaches have also been used to produce estimates of crime rates (Kongmuang 2006).

Several of these studies have shown the need for incorporating the spatial autocorrelation parameter into SAE models when producing estimates for designing place-based policing strategies. The spatial autocorrelation accounts for the geographical concentration of attitudes towards policing, and estimators that incorporate it tend to provide more precise estimates than basic model-based estimators. The SEBLUP has shown promising results not only in simulation studies (Asfar and Sadik 2016; Chandra et al. 2007; Pratesi and Salvati 2008; Salvati 2004), but also when it has been applied to social science research, such as the estimation of poverty (Salvati et al. 2014). Thus, the SEBLUP is expected to produce promising results in the field of place-based policing, where it has not yet been applied. Hence, we aim to bridge this gap by demonstrating its use for estimating confidence in police work at small area level. In order to gain evidence about cases in which the SEBLUP provides better estimates than basic model-based estimators when applied to policing data, we provide a simulation study and an application.

Model Description: SEBLUP

Let us consider a target population partitioned into m small areas. In our application, estimates of confidence in policing will be produced for London wards, thus, m equals 610. In the traditional EBLUP derived from the FH model (Fay and Herriot 1979), we assume that a linking model linearly relates the quantity of inferential interest (i.e. proportion of citizens who think that police do a good job), which is usually an area mean or total δi, to p area level auxiliary variables xi = (xi1, …, xip)′ with a random effect vi:

$$ {\delta}_i={\boldsymbol{x}}_i^{\prime}\boldsymbol{\beta} +{v}_i,\kern0.5em i=1,\dots, m, $$
(1)

where β is the p × 1 vector of regression parameters and \( {v}_i\sim iid\left(0,{\sigma}_u^2\right) \). In our case, δi represents the confidence in police work and xi denotes the covariates known to be associated to confidence in policing (e.g. unemployment, concentration of minorities, poverty). The model assumes that a design-unbiased direct estimate denoted yi for δi, which is obtained from the observed sample, is available for each area i = 1, …, m:

$$ {y}_i={\delta}_i+{e}_i,\kern0.5em i=1,\dots, m, $$
(2)

where ei ∼ N(0, ψi) denotes the sampling errors, independent of vi, and ψi refers to the sampling variance of the direct estimates (Rao and Molina 2015).

The SEBLUP borrows strength from neighboring areas by adding spatially correlated random area effects (Petrucci and Salvati 2006; Salvati 2004). If we combine (1) with (2) we can write the following model:

$$ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta } +\boldsymbol{v}+\boldsymbol{e}, $$
(3)

where y = (y1, …, ym)′ is the vector of direct estimates of confidence in policing for m areas, X = (x1, …, xm)′ denotes the covariates associated to the outcome measure for m areas, v = (v1, …, vm)′ is a vector of area effects and e = (e1, …, em)′ is a vector of sampling errors independent of v. We assume v to follow a SAR process with unknown autoregression parameter ρ ϵ (−1, 1) and a contiguity matrix W (Cressie 1993):

$$ \boldsymbol{v}=\rho \boldsymbol{Wv}+\boldsymbol{u}, $$
(4)

where ρ represents the spatial autocorrelation coefficient of our outcome measure (i.e. confidence in policing) and W is a standardised matrix that relates each area with all neighboring areas.

We also assume (Im − ρW) to be non-singular, where Im is the m × m identity matrix, so we can express (4) as follows:

$$ \boldsymbol{v}={\left({\boldsymbol{I}}_m-\rho \boldsymbol{W}\right)}^{-1}\boldsymbol{u}, $$
(5)

where u = (u1, …, um)′ satisfies \( \boldsymbol{u}\sim N\left({\mathbf{0}}_m,{\sigma}_u^2{\boldsymbol{I}}_m\right) \). Thus,

$$ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta } +{\left({\boldsymbol{I}}_m-\rho \boldsymbol{W}\right)}^{-\mathbf{1}}\boldsymbol{u}+\boldsymbol{e} $$
(6)
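As a rough illustration of Eqs. (4) and (5), the following Python sketch recovers v from a draw of u without forming the inverse explicitly. It assumes a row-standardised W and |ρ| < 1, so that iterating v ← ρWv + u converges to (Im − ρW)⁻¹u; this fixed-point scheme is a convenience chosen here, not the procedure used in the paper. The function name `sar_effects`, the four-area ring map and the parameter values are hypothetical.

```python
import random

def sar_effects(W, rho, u, tol=1e-12, max_iter=10_000):
    """Solve v = rho * W v + u by fixed-point iteration.

    Assumes W is row-standardised and |rho| < 1, so each pass shrinks
    the error by at least a factor |rho| and the iteration converges.
    """
    m = len(u)
    v = list(u)
    for _ in range(max_iter):
        new_v = [rho * sum(W[i][j] * v[j] for j in range(m)) + u[i]
                 for i in range(m)]
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    raise RuntimeError("SAR iteration did not converge")

# Hypothetical toy map: four areas on a ring, each with two neighbours.
W = [[0.0, 0.5, 0.0, 0.5],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.5, 0.0, 0.5, 0.0]]
rng = random.Random(1)
sigma_u = 1.0  # illustrative standard deviation of u
u = [rng.gauss(0.0, sigma_u) for _ in range(4)]
v = sar_effects(W, rho=0.5, u=u)  # spatially correlated area effects
```

With ρ = 0 the function returns u unchanged, which corresponds to the independent area effects of the basic FH model.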

The vector of variance components is denoted as \( \boldsymbol{\theta} ={\left({\theta}_1,{\theta}_2\right)}^{\prime }=\left({\sigma}_u^2,\rho \right)^{\prime } \). Then, the Spatial Best Linear Unbiased Predictor (SBLUP) of \( {\delta}_i={\boldsymbol{x}}_i^{\prime}\boldsymbol{\beta} +{v}_i \) is given by

$$ {\overset{\sim }{\delta}}_i^{SBLUP}\left(\boldsymbol{\theta} \right)={\boldsymbol{x}}_i^{\prime}\overset{\sim }{\boldsymbol{\beta}}\left(\boldsymbol{\theta} \right)+{\boldsymbol{b}}_i^{\prime}\boldsymbol{G}\left(\boldsymbol{\theta} \right){\boldsymbol{\varSigma}}^{-\mathbf{1}}\left(\boldsymbol{\theta} \right)\left\{\boldsymbol{y}-\boldsymbol{X}\overset{\sim }{\boldsymbol{\beta}}\left(\boldsymbol{\theta} \right)\right\} $$
(7)

where \( {\boldsymbol{b}}_i^{\prime } \) is a 1 × m vector (0, …, 0, 1, 0, …, 0) with 1 in position i. G(θ), the covariance matrix of v, is given by \( \boldsymbol{G}\left(\boldsymbol{\theta} \right)={\sigma}_u^2{\left\{{\left({\boldsymbol{I}}_m-\rho \boldsymbol{W}\right)}^{\prime }\left({\boldsymbol{I}}_m-\rho \boldsymbol{W}\right)\right\}}^{-\mathbf{1}} \). Σ(θ), which is the covariance matrix of y, is defined as Σ(θ) = G(θ) + Ψ, where Ψ = diag(ψ1, …, ψm). And \( \overset{\sim }{\boldsymbol{\beta}}\left(\boldsymbol{\theta} \right) \), the weighted least squares estimator of β, is obtained as \( \overset{\sim }{\boldsymbol{\beta}}\left(\boldsymbol{\theta} \right)={\left\{{\boldsymbol{X}}^{\prime }{\boldsymbol{\varSigma}}^{-\mathbf{1}}\left(\boldsymbol{\theta} \right)\boldsymbol{X}\right\}}^{-\mathbf{1}}{\boldsymbol{X}}^{\prime }{\boldsymbol{\varSigma}}^{-\mathbf{1}}\left(\boldsymbol{\theta} \right)\boldsymbol{y} \).

The SEBLUP is obtained by replacing θ with a consistent estimator \( \hat{\boldsymbol{\theta}}=\left({\hat{\sigma}}_u^2,\hat{\rho}\right)^{\prime } \):

$$ {\hat{\delta}}_i^{SEBLUP}={\overset{\sim }{\delta}}_i^{SBLUP}\left(\hat{\boldsymbol{\theta}}\right)={\boldsymbol{x}}_i^{\prime}\overset{\sim }{\boldsymbol{\beta}}\left(\hat{\boldsymbol{\theta}}\right)+{\boldsymbol{b}}_i^{\prime}\boldsymbol{G}\left(\hat{\boldsymbol{\theta}}\right){\boldsymbol{\varSigma}}^{-\mathbf{1}}\left(\hat{\boldsymbol{\theta}}\right)\left\{\boldsymbol{y}-\boldsymbol{X}\overset{\sim }{\boldsymbol{\beta}}\left(\hat{\boldsymbol{\theta}}\right)\right\}. $$
(8)

If we assume the normality of the random effects, we can estimate \( {\sigma}_u^2 \) and ρ based on different procedures. In this research, we consider the Restricted Maximum Likelihood estimator, which accounts for the loss in degrees of freedom derived from estimating β, while other estimators, such as the Maximum Likelihood estimator, do not (Rao and Molina 2015). The assumption of normality of the random effects is reasonable in those cases in which area-level direct estimates are normally distributed, as tends to be the case in criminological studies looking into the confidence in police work (Williams et al. 2019), emotions about crime (Whitworth 2012) and rates of some crime types at large spatial scales (Fay and Diallo 2012). However, such an assumption may be considered invalid in those cases in which the normality of direct estimates is not met. This may be the case of studies analysing specific crime types at detailed spatial scales, as these may show zero-inflated skewed distributions, and thus robust SAE techniques adjusted to non-normal distributions are needed (Dreassi et al. 2014).
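To make the predictor in Eq. (7) more concrete, the sketch below computes it for all areas at once in plain Python, treating θ = (σu², ρ) as known. Plugging in estimates of σu² and ρ instead would give the SEBLUP of Eq. (8); the estimation step itself (e.g. Restricted Maximum Likelihood) is not shown. The helper names (`matmul`, `transpose`, `inverse`, `sblup`) and the four-area toy data are hypothetical, and the small Gauss-Jordan inverse is only meant for illustration with few areas.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (adequate for small m)."""
    n = len(A)
    M = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def sblup(y, X, W, psi, sigma2_u, rho):
    """Eq. (7) for every area, with theta = (sigma2_u, rho) taken as known."""
    m = len(y)
    # I - rho * W
    A = [[float(i == j) - rho * W[i][j] for j in range(m)] for i in range(m)]
    # G = sigma2_u * {(I - rho W)' (I - rho W)}^{-1}
    G = [[sigma2_u * g for g in row] for row in inverse(matmul(transpose(A), A))]
    # Sigma = G + diag(psi)
    Sigma = [[G[i][j] + (psi[i] if i == j else 0.0) for j in range(m)]
             for i in range(m)]
    Sinv = inverse(Sigma)
    XtS = matmul(transpose(X), Sinv)
    # Weighted least squares estimator of beta
    beta = matmul(inverse(matmul(XtS, X)), matmul(XtS, [[yi] for yi in y]))
    fitted = matmul(X, beta)
    resid = [[y[i] - fitted[i][0]] for i in range(m)]
    shrink = matmul(matmul(G, Sinv), resid)
    return [fitted[i][0] + shrink[i][0] for i in range(m)]

# Hypothetical toy data: four areas on a ring, intercept plus one covariate.
W = [[0.0, 0.5, 0.0, 0.5],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.5, 0.0, 0.5, 0.0]]
X = [[1.0, 0.2], [1.0, 0.5], [1.0, 0.1], [1.0, 0.9]]
y = [0.61, 0.70, 0.58, 0.77]        # direct estimates
psi = [0.004, 0.002, 0.006, 0.003]  # sampling variances
estimates = sblup(y, X, W, psi, sigma2_u=0.01, rho=0.5)
```

Setting ρ = 0 makes G diagonal, so the function reduces to the non-spatial predictor under the FH model; this is a convenient sanity check when experimenting with the sketch.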

Previous Studies Using the SEBLUP

The SEBLUP has not yet been used to estimate crime rates or confidence in the police. However, a series of simulation studies and applications analysing economic and agricultural outcomes have shown that the SEBLUP tends to outperform EBLUP estimators when ρ moves away from zero, especially when it is close to −1 or 1 (Chandra et al. 2007; Petrucci and Salvati 2006; Pratesi and Salvati 2008). There are very few simulation studies that investigate the impact of m, and the interaction between m and ρ, on the SEBLUP’s performance, and these show contradictory results. Salvati (2004) examined the precision of SEBLUP estimates for m equal to 25 and 50, and ρ = {±0.25, ±0.5, ±0.75}, and concluded that the improvement in the estimates’ accuracy is higher when the spatial autoregressive coefficient increases, but also that “benefit is bigger as the number of small areas increase” (Salvati 2004:11). In policing research, the SEBLUP is thus expected to produce more reliable estimates than the EBLUP when the values of the variable of interest geographically cluster together, as observed in many studies on crime and crime perceptions (Baller et al. 2001; Williams et al. 2019), and when the number of areas for which we aim to produce estimates is large. Therefore, in cases like the one encountered by Gemmell et al. (2004), who produced estimates of drug use for ten local authorities in Greater Manchester, the EBLUP is expected to produce better estimates than the SEBLUP due to the small number of areas under study.

Asfar and Sadik (2016) analyzed the SEBLUP’s relative mean squared errors under m equal to 16, 64 and 144, and they found large relative improvement of SEBLUP estimates even when ρ is very small (ρ = 0.05) and small (ρ = 0.25), also in cases of very few areas under study (m = 16). In addition, such improvement was sometimes larger when m was equal to 16 than in cases of m equal to 64 and 144. These results are not consistent with other simulation studies, which show that SEBLUP’s relative performance improves as the number of areas increases (Salvati 2004), and the SEBLUP’s precision is not improved if ρ ≅ 0 in cases of m equal to 25 and 50 (Salvati 2004), 61 (Petrucci and Salvati 2006), 23 (Chandra et al. 2007) and 42 (Pratesi and Salvati 2008). Therefore, further research is needed to understand how both ρ and m affect the SEBLUP’s relative precision, and we assess the performance of the SEBLUP in Section 5.

Simulation Study

In this section we describe the simulation study designed to assess the effect of m and ρ on the SEBLUP’s performance in comparison to EBLUP and post-stratified estimators.

Generating the Population and Simulation Steps

The population is generated based on previous simulation studies such as Petrucci and Salvati (2006) and Pratesi and Salvati (2008). Similar approaches have also been used in Asfar and Sadik (2016), Molina et al. (2009) and Salvati (2004). Simulation parameters are based on previous simulation experiments to allow comparisons and reproducibility. The population is generated following a linear mixed-effect model in which the random effects of neighboring areas are correlated according to the SAR dispersion matrix with a fixed autoregressive coefficient:

$$ {y}_{ij}={x}_{ij}\beta +{v}_i+{e}_{ij},\kern0.5em i=1,\dots, m,\kern0.5em j=1,\dots, {N}_i, $$
(9)

where xij is the value of the covariate x for unit j in area i, vi denotes the area effect and eij is the individual error. The simulation parameters are given as follows: β = 0.74, \( {\sigma}_u^2=90 \), σ2 = 1.5 (Petrucci and Salvati 2006). v = [v1, …, vm] is generated from a \( \mathrm{MVN}\left(0,{\sigma}_u^2{\left[\left(\mathbf{I}-\rho \mathbf{W}\right)\left(\mathbf{I}-\rho {\mathbf{W}}^{\prime}\right)\right]}^{-1}\right) \), and \( \mathbf{e}={\left[{e}_{11},{e}_{12},\dots, {e}_{ij},\dots, {e}_{m{N}_m}\right]}^{\prime } \) from a N(0, σ2). xij values are generated from a uniform distribution between 0 and 1000 and Ni = [N1, …, Nm] is generated from a uniform distribution between 100 and 300. The population size is \( N=\sum \limits_{i=1}^m{N}_i \). Thus, we simulate 42 different populations based on different values of the spatial autoregressive coefficient, ρ = {0, ±0.25, ±0.5, ±0.75}, and the number of areas, m = {16, 25, 36, 64, 144, 225}. yij is then produced as a continuous and normally distributed variable with random area effects of contiguous areas. As a result, area-level aggregates and estimates are continuous, normally distributed and geographically aggregated, as is usually the case of many criminological variables such as confidence in police services, fear of crime or general crime rates at large scales (Fay and Diallo 2012; Williams et al. 2019; Whitworth 2012). Future research should also examine different simulation parameters with smaller intra-class correlations.
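The unit-level part of Eq. (9) can be sketched as follows. This is only an illustration under stated assumptions: the area effects v are passed in as an argument (here drawn independently, which corresponds to ρ = 0), area sizes are drawn as integers between 100 and 300, and the function name `generate_population` and the seed are hypothetical. The default values of β and σ2 are those given in the text.

```python
import random

def generate_population(v, beta=0.74, sigma2_e=1.5, seed=0):
    """Generate y_ij = x_ij * beta + v_i + e_ij for every unit in every area.

    v holds one area effect per area. Returns a list of areas, each a list
    of (x_ij, y_ij) pairs.
    """
    rng = random.Random(seed)
    sd_e = sigma2_e ** 0.5  # the text gives a variance; gauss() expects an sd
    population = []
    for v_i in v:
        N_i = rng.randint(100, 300)  # assumed integer-valued uniform draw
        units = []
        for _ in range(N_i):
            x = rng.uniform(0.0, 1000.0)
            units.append((x, x * beta + v_i + rng.gauss(0.0, sd_e)))
        population.append(units)
    return population

# Illustrative area effects for m = 16 with rho = 0 (independent draws).
rng = random.Random(42)
v = [rng.gauss(0.0, 90 ** 0.5) for _ in range(16)]
population = generate_population(v)

# True area means Y_i, against which estimates would later be compared.
true_means = [sum(y for _, y in area) / len(area) for area in population]
```

For ρ ≠ 0, v would instead be drawn from the multivariate normal distribution with the SAR dispersion matrix described above.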

All maps used are hypothetical maps based on perfect squares divided into m areas, where the maximum number of neighbors is 8 and the minimum is 3 (at the corners) (see Fig. 1). Future research should conduct similar studies using more realistic maps. Neighboring areas are defined based on a ‘Queen Contiguity’ matrix, a structure commonly used in simulation studies, which defines as neighbors all areas that share a border or at least one vertex. The W matrix is standardised by rows, so that every row adds up to 1.
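A row-standardised Queen contiguity matrix for such a square map can be built along the following lines. This is a minimal sketch rather than the code used in the study; the function name `queen_weights` and the row-major numbering of areas are assumptions made for illustration.

```python
def queen_weights(k):
    """Row-standardised Queen contiguity matrix for a k x k grid of areas.

    Areas are numbered row by row; two areas are neighbours if they share
    a border or a vertex.
    """
    m = k * k
    W = [[0.0] * m for _ in range(m)]
    for r in range(k):
        for c in range(k):
            nbrs = [(r + dr, c + dc)
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                    if (dr, dc) != (0, 0)
                    and 0 <= r + dr < k and 0 <= c + dc < k]
            for nr, nc in nbrs:
                # Each neighbour gets equal weight, so the row sums to 1.
                W[r * k + c][nr * k + nc] = 1.0 / len(nbrs)
    return W

W = queen_weights(4)  # m = 16 areas
```

In this 4 × 4 example a corner area has three neighbours, each with weight 1/3, while an interior area has eight neighbours, each with weight 1/8.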

Fig. 1 Three examples of hypothetical maps used in the simulation study

The simulation consists of the following steps for each simulated population:

  1. Selection of t = 1, …, T (T = 1000) simple random samples without replacement. Sample sizes are drawn with the only constraint of a minimum of two units selected in each area (Salvati 2004). The average sample size per area is \( \overline{n}=48.8 \).

  2. In each sample, post-stratified, EBLUP and SEBLUP estimates are computed and compared based on Pratesi and Salvati (2008). The post-stratified estimator is given by the following:

$$ {\hat{Y}}_i(pst)={\sum}_{j\in {s}_i}\frac{y_{ij}}{n_i}, $$
(10)

where si is the set of ni sample units falling in area i.

  3. The results are evaluated by the absolute relative bias, absolute relative error, relative root mean squared error, and mean squared error averaged through the samples and small areas (Petrucci and Salvati 2006). These are denoted by \( \overline{ARB} \), \( \overline{ARE} \), \( \overline{RRMSE} \), and \( \overline{MSE} \), and given by the following formulas, respectively:

$$ \overline{ARB}=\frac{1}{m}\sum \limits_i^m\left|\frac{1}{T}\sum \limits_{t=1}^T\left(\frac{{\hat{Y}}_{it}}{Y_i}-1\right)\right| $$
(11)
$$ \overline{ARE}=\frac{1}{m}\sum \limits_i^m\frac{1}{T}\sum \limits_{t=1}^T\left(\left|\frac{{\hat{Y}}_{it}}{Y_i}-1\right|\right) $$
(12)
$$ \overline{RRMSE}=\frac{1}{m}\sum \limits_i^m\frac{\left[\overline{MSE}{\left({\hat{Y}}_i\right)}^{1/2}\right]}{Y_i} $$
(13)

with

$$ \overline{MSE}=\frac{1}{m}\sum \limits_i^m\frac{1}{T}\sum \limits_{t=1}^T{\left({\hat{Y}}_{it}-{Y}_i\right)}^2, $$
(14)

where \( {\hat{Y}}_{it} \) denotes the estimate (post-stratified, EBLUP or SEBLUP) for small area i in sample t and Yi the true value observed in the population for area i.
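The post-stratified estimator of Eq. (10) and the quality measures of Eqs. (11) to (14) can be sketched in Python as below. The function names and the tiny two-area, two-sample example are hypothetical. One reading is assumed for Eq. (13): the square root is taken of the area-specific mean squared error over the T samples before dividing by Yi and averaging over areas.

```python
def post_stratified(sample):
    """Eq. (10): mean of the sampled y values in each area.

    sample[i] is the list of sampled y_ij values for area i.
    """
    return [sum(ys) / len(ys) for ys in sample]

def quality_measures(estimates, truth):
    """Eqs. (11)-(14). estimates[t][i] is the estimate for area i in
    sample t; truth[i] is the true population value Y_i."""
    T, m = len(estimates), len(truth)
    arb = are = rrmse = mse = 0.0
    for i, Y in enumerate(truth):
        rel = [estimates[t][i] / Y - 1.0 for t in range(T)]
        mse_i = sum((estimates[t][i] - Y) ** 2 for t in range(T)) / T
        arb += abs(sum(rel) / T)          # bias: average first, then |.|
        are += sum(abs(r) for r in rel) / T  # error: |.| first, then average
        rrmse += mse_i ** 0.5 / Y
        mse += mse_i
    return {"ARB": arb / m, "ARE": are / m, "RRMSE": rrmse / m, "MSE": mse / m}

# Hypothetical example: two areas, two samples.
truth = [10.0, 20.0]
estimates = [[11.0, 18.0],
             [9.0, 22.0]]
measures = quality_measures(estimates, truth)
```

In this example the over- and underestimates cancel within each area, so the absolute relative bias is (numerically) zero while the absolute relative error is 0.1, which illustrates why the two measures are reported separately.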

The simulation study has been coded in R software (Molina and Marhuenda 2015) and results are detailed in Tables 1, 2, 3 and 4.

Table 1 Estimates’ relative root mean squared error, absolute relative bias and absolute relative error (×100)
Table 2 Relative difference between EBLUP and spatial EBLUP’s RRMSE (×100)
Table 3 Relative difference between EBLUP and spatial EBLUP’s ARB (×100)
Table 4 Relative difference between EBLUP and spatial EBLUP’s ARE (×100)

Results: Comparison of EBLUP and SEBLUP Estimates

Table 1 shows the \( \overline{RRMSE} \), \( \overline{ARB} \) and \( \overline{ARE} \) of post-stratified, EBLUP and SEBLUP estimates from each simulated population. Both EBLUP and SEBLUP estimators outperform post-stratified estimators in all cases, in terms of \( \overline{RRMSE} \) and \( \overline{ARE} \), regardless of the spatial correlation parameter and the number of areas under study. The post-stratified estimator performs better in terms of \( \overline{ARB} \), as expected. ρ and m do not affect the EBLUP or SEBLUP’s relative difference towards post-stratified estimates regardless of the quality measure selected. The relative difference between post-stratified and SEBLUP estimates’ \( \overline{RRMSE} \), which expresses the absolute percentage change of the estimate quality measure, has been calculated as follows:

$$ RD\%=\frac{\overline{RRMSE}\left[{\hat{\delta}}^{SEBLUP}\right]-\overline{RRMSE}\ \left[\hat{Y}(pst)\right]\ }{\overline{RRMSE}\ \left[\hat{\mathrm{Y}}(pst)\right]}\times 100 $$
(15)

Equation (15) gives the measure of efficiency of \( {\hat{\delta}}^{SEBLUP} \) over \( \hat{Y}(pst) \) estimates.
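As a small worked example of Eq. (15), the following sketch computes the relative difference for two made-up RRMSE values; the numbers and the function name `relative_difference` are hypothetical and are not taken from the tables.

```python
def relative_difference(rrmse_new, rrmse_ref):
    """Eq. (15): percentage change of rrmse_new relative to rrmse_ref."""
    return (rrmse_new - rrmse_ref) / rrmse_ref * 100.0

# Hypothetical values: the model-based estimator has an RRMSE of 0.90
# and the reference (post-stratified) estimator an RRMSE of 1.00.
rd = relative_difference(rrmse_new=0.90, rrmse_ref=1.00)
```

A negative value, as in this example (about −10%), indicates that the first estimator has the lower RRMSE.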

The relative difference between post-stratified and SEBLUP estimates’ \( \overline{RRMSE} \) varies between a maximum of −5.83% in the case of m = 64 and ρ = 0.75 and a minimum of −14.29% in the case of m = 16 and ρ = 0, with similarly low values such as −13.99% in the case of m = 25 and ρ = 0.25, −13.40% in the case of m = 144 and ρ = 0, and −13.00% in the case of m = 144 and ρ = −0.5. In other words, neither ρ nor m explains the increased precision, in terms of \( \overline{RRMSE} \) and \( \overline{ARE} \), of EBLUP and SEBLUP estimates when compared to post-stratified estimates. However, both ρ and m have a large impact on the improvement of the SEBLUP estimates, which perform substantially better than EBLUP estimates for those cases with a medium and large spatial correlation parameter (especially ρ = {±0.50, ±0.75}) and a large number of areas (notably m = {144, 225}) (see Tables 2, 3 and 4).

Table 2 shows the relative difference between EBLUP and SEBLUP estimates’ \( \overline{RRMSE} \), as shown in Eq. (15), formatting the cells based on a black-to-white colour scale. Darker scales represent positive values, meaning a better performance of EBLUP estimates with respect to their quality measure, and white scales refer to negative values, which show that SEBLUP estimates improve their quality measure when compared to EBLUP estimates. First, it is clear from Table 2 that SEBLUP estimates outperform EBLUP estimates, in terms of \( \overline{RRMSE} \), when the spatial correlation parameter is large, while EBLUP estimates tend to be more precise than the SEBLUP when ρ is close to 0. The SEBLUP is thus preferred over the EBLUP to examine social issues that spatially cluster together, as is the case of crime rates (Baller et al. 2001) and perceptions about crime and the police (Jackson et al. 2013; Williams et al. 2019). Second, the relative difference between EBLUP and SEBLUP estimates’ \( \overline{RRMSE} \) shows that the benefit obtained by borrowing strength from neighboring areas is larger as the number of areas increases. For example, for m = 25 the relative difference of \( \overline{RRMSE} \) shows that SEBLUP estimates are more precise than the EBLUPs only when the spatial correlation parameter is very large (ρ = 0.75), while the SEBLUP outperforms the EBLUP in all cases for m = 255, even when ρ = 0. In other words, the EBLUP is expected to outperform the SEBLUP in studies producing estimates for a small number of areas (e.g. estimates of drug use for ten local authorities; Gemmell et al. 2004); while the SEBLUP produces more reliable estimates when the number of areas under study is large (e.g. estimates of perceived disorder for 282 neighborhoods; Buil-Gil et al. 2019). 
Therefore, both ρ and m need to be taken into account to explain the increased precision of SEBLUP estimates in terms of \( \overline{RRMSE} \), and SEBLUP estimates perform better as the number of areas under study increases.
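
As a minimal sketch of how the cells of Tables 2, 3 and 4 can be read, the snippet below computes a relative difference between two averaged quality measures. It assumes that the relative difference takes the form 100 × (SEBLUP − EBLUP) / EBLUP, which agrees with the sign convention described above (negative values favour the SEBLUP). The function name and the input values are hypothetical and are not taken from the simulation results.

```python
def relative_difference(measure_seblup: float, measure_eblup: float) -> float:
    """Relative difference (%) of a quality measure, SEBLUP versus EBLUP.

    Assumed form: 100 * (SEBLUP - EBLUP) / EBLUP. Negative values mean that
    the SEBLUP has the smaller (better) quality measure.
    """
    if measure_eblup <= 0:
        raise ValueError("the EBLUP quality measure must be positive")
    return 100.0 * (measure_seblup - measure_eblup) / measure_eblup


# Hypothetical averaged RRMSE values, for illustration only.
rd = relative_difference(measure_seblup=0.18, measure_eblup=0.20)
print(f"RD% = {rd:.2f}")
```

Under this convention a value below zero indicates a gain from borrowing strength from neighboring areas, and a value above zero indicates that the simpler EBLUP is more precise.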

Table 3 shows the relative difference between EBLUP and SEBLUP estimates’ \( \overline{ARB} \) and Table 4 shows the relative difference between their \( \overline{ARE} \). Looking at Table 3, it is clear that SEBLUP estimates perform better than EBLUPs, in terms of \( \overline{ARB} \), when the number of areas is large (especially m = {144, 255}), but not in cases of m = {16, 25, 36}. For m = 64, SEBLUP estimates’ \( \overline{ARB} \) is only improved when ρ = −0.5 and ρ = −0.75. Again, while the \( \overline{ARB} \) of SEBLUP estimates was not improved in any case for m = {16, 25, 36}, such quality measure shows that SEBLUP estimates outperform EBLUPs, in terms of \( \overline{ARB} \), in all simulations performed for m = 255.

Table 4 also shows that both ρ and m have a large impact on the precision of SEBLUP estimates, now in terms of \( \overline{ARE} \). For example, for m = 25 the relative difference between EBLUP and SEBLUP’s \( \overline{ARE} \) shows that EBLUP estimates outperform SEBLUPs in all cases except for ρ = 0.75, while for m = 144 it shows a better precision of SEBLUP estimates except when ρ = 0, and for m = 255 the SEBLUP estimator produces better estimates than the EBLUP in every single case.

Empirical Evaluation and Application: Confidence in Police Work in London

In this section we assess and apply the SEBLUP in a real-case scenario. We produce direct, EBLUP and SEBLUP estimates of confidence in police work at ward level in Greater London from Metropolitan Police Service Public Attitudes Survey (MPSPAS) 2012 data. Such an application provides further evidence about the SEBLUP performance when applied to policing data. Moreover, this application produces a reliable map of confidence in police work in London and deepens our understanding of the meso-level explanatory mechanisms of confidence in policing, by which we mean the proportion of citizens who think the police do a good job (Jackson and Bradford 2010; Stanko and Bradford 2009).

There are various reasons why this research has been conducted using London survey data instead of data from any other city. First, London is one of the few cities with an available local survey designed to measure the confidence in police work. Second, the Greater London Authority website provides information about many auxiliary variables that are relevant for this research and may be used as covariates. Third, London is a well-researched city (Hutt et al. 2018; Jackson et al. 2013; Stanko and Bradford 2009) and thus it is easier to exclude the possibility of drawing spurious associations due to uncontrolled variables. And fourth, during preliminary conversations with Greater London Authority officers it was acknowledged that this research’s potential insights may be of great value for decision-making purposes.

Data and Methods

Data from the MPSPAS 2012 have been used to produce estimates of confidence in police work. MPSPAS is an annual survey conducted by the Metropolitan Police Service since 1983, which records information about perceptions of policing needs, worry about victimization and perceived security and disorder. It consists of a face-to-face questionnaire conducted at the homes of respondents, and it obtains responses from a random probability sample of residents in each of the 32 boroughs in Greater London. Household addresses are selected randomly in each borough, and then the person in each household whose next birthday is closest to the date of the interview is asked to answer the questionnaire. The sample is representative of residents aged 15 or over and it should be large enough to allow analyses at borough level but not at smaller scales. Access to the low-level geographies of the MPSPAS was only granted for the 2012 edition, and thus small area estimates of confidence in policing are only produced for this year.

Small area estimates will be produced at the ward level for the five London sub-regions. Each sub-region contains a different number of wards: Central London is composed of 114 wards in six boroughs, North London is composed of 61 wards in three boroughs, South London is composed of 120 wards in six boroughs, East London is composed of 192 wards in ten boroughs, and West London is composed of 140 wards within seven boroughs. The average sample size per borough is \( \overline{n}=401.03\ \left( sd=3.82\right) \) and the average sample size per ward is similar in all sub-regions: in Central London \( \overline{n}=20.23 \), in the North \( \overline{n}=19.02 \), in the South \( \overline{n}=19.37 \), in the East \( \overline{n}=20.6 \), and in West London \( \overline{n}=19.44 \). On average, there are 19.85 citizens sampled per ward. Note that three wards in Central London and fourteen in East London suffered from zero sample sizes, and thus were not included in our analyses. Regression-based synthetic estimates are used in these seventeen areas.
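
The ward and borough counts in this paragraph can be checked with simple arithmetic. The sketch below uses only the figures reported above; the variable names are illustrative.

```python
# Wards and boroughs per sub-region, as reported in the text.
wards = {"Central": 114, "North": 61, "South": 120, "East": 192, "West": 140}
boroughs = {"Central": 6, "North": 3, "South": 6, "East": 10, "West": 7}

# Wards with zero sample size, which receive synthetic estimates instead.
zero_sample = {"Central": 3, "East": 14}

# Number of areas (m) with at least one respondent in each sub-region.
m_by_region = {name: total - zero_sample.get(name, 0) for name, total in wards.items()}

total_wards = sum(wards.values())
total_boroughs = sum(boroughs.values())
m_all = sum(m_by_region.values())

print(m_by_region)
print(total_wards, total_boroughs, m_all)
```

The resulting numbers of sampled areas (111 in Central London, 178 in East London and 610 overall) are the values of m that appear later in the discussion of Table 5.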

The variable used to measure confidence in police work has been obtained from the question “Taking everything into account, how good a job do you think the police in this area are doing?”, as suggested by Stanko and Bradford (2009). In order to produce more easily interpretable results, responses were dichotomised to a 0–1 measure, where 1 refers to “Excellent” or “Good”, while “Very poor”, “Poor” and “Fair” responses were recoded as 0. “Don’t know” answers were coded as missing data. We then produce estimates of the proportion of people who think the police are doing a good or excellent job in their local area (defined in the survey as the area within about 15 minutes’ walk from home). Based on the literature review, we fitted EBLUP and SEBLUP models using the following area-level covariates: proportion of black and minority ethnic groups 2011, mean household income 2011–12, crime rate 2011–12, proportion of residents born outside the UK 2011, and proportion of citizens unemployed 2011. All covariates are recorded by the Greater London Authority’s Ward Profiles and Atlas (https://data.london.gov.uk/dataset/ward-profiles-and-atlas). We found no available or reliable estimates at the ward level of other covariates explored by previous literature, such as residential instability, perceived disorder and collective efficacy, and thus these remain a subject for future research.
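
The recoding rule described above can be sketched as follows. The function name and the example responses are hypothetical, and treating any unrecognised answer as missing is an assumption of this sketch (the text only states that “Don’t know” answers are missing).

```python
from typing import Optional

GOOD = {"Excellent", "Good"}
NOT_GOOD = {"Fair", "Poor", "Very poor"}


def recode_confidence(response: str) -> Optional[int]:
    """Return 1 for a positive rating, 0 for any other rating, None if missing."""
    answer = response.strip()
    if answer in GOOD:
        return 1
    if answer in NOT_GOOD:
        return 0
    # "Don't know" and (by assumption) any unrecognised answer are missing.
    return None


# Hypothetical responses, for illustration only.
responses = ["Good", "Fair", "Don't know", "Excellent", "Very poor"]
recoded = [recode_confidence(r) for r in responses]
print(recoded)  # [1, 0, None, 1, 0]
```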

Direct estimates of the proportion of residents who think that police services do a good or excellent job are produced from the following estimator (Horvitz and Thompson 1952):

$$ {\hat{Y}}_i(dir)={N}_i^{-1}{\sum}_{j\in {s}_i}{w}_{ij}{y}_{ij}, $$
(16)

where \( {w}_{ij} \) corresponds to the survey weight of unit j from area i (provided by the original survey), and \( {y}_{ij} \) is the score of unit j from area i. Original survey weights are computed as the proportional distribution by borough of all citizens aged 15 or more across London (derived from Census data) divided by the proportional distribution of the unweighted sample by borough. In order to produce the SEBLUP estimates, a first-order ‘Queen Contiguity’ structure is used to define neighboring areas.
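
A minimal sketch of the direct estimator in Eq. (16) for a single area is given below. The function name, its arguments and the example data are hypothetical. In particular, using the sum of the retained weights in place of \( {N}_i \) when the population size is not supplied is an assumption of this sketch and not something stated in the text.

```python
from typing import Optional, Sequence


def direct_estimate(
    y: Sequence[Optional[int]],
    w: Sequence[float],
    population_size: Optional[float] = None,
) -> float:
    """Weighted direct estimate for one area, following the form of Eq. (16).

    y holds the 0/1 scores of the sampled units and w their survey weights.
    Units with a missing score (None) are dropped. If population_size (N_i)
    is not supplied, the sum of the retained weights is used in its place,
    which is an assumption of this sketch.
    """
    if len(y) != len(w):
        raise ValueError("y and w must have the same length")
    pairs = [(yj, wj) for yj, wj in zip(y, w) if yj is not None]
    if not pairs:
        raise ValueError("no valid responses in this area")
    weighted_total = sum(wj * yj for yj, wj in pairs)
    if population_size is None:
        population_size = sum(wj for _, wj in pairs)
    return weighted_total / population_size


# Hypothetical ward with five respondents, one of them with a missing score.
scores = [1, 0, None, 1, 1]
weights = [1.2, 0.8, 1.0, 1.0, 1.0]
print(round(direct_estimate(scores, weights), 3))
```

With roughly twenty respondents per ward, estimates of this kind are based on very few observations, which is why model-based estimators are considered in the remainder of the section.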

Estimates Reliability Measures

In order to assess the estimates produced in each sub-region, Table 5 shows direct, EBLUP and SEBLUP estimates’ average RRMSE, as well as the average Relative Difference (\( \overline{RD}\% \)) between the \( \overline{RRMSE} \) of EBLUP and SEBLUP estimates. The direct estimates’ RRMSE is the Coefficient of Variation (Rao and Molina 2015), while the EBLUP estimates’ RRMSE is obtained from the Prasad-Rao analytical approximation (Prasad and Rao 1990) and the SEBLUP estimates’ RRMSE from the analytical approximation in Molina et al. (2009).

Table 5 Estimates’ quality measures

Table 5 shows that direct estimates are the least precise (largest \( \overline{RRMSE} \)) in all cases, as expected. SEBLUP estimates are more reliable than EBLUPs, in terms of \( \overline{RRMSE} \), in all six scenarios. The \( \overline{RD}\% \) shows that the average gain in precision of SEBLUP estimates over EBLUPs is larger as both ρ and m increase. First, although ρ is similar in North (ρ = 0.03) and East London (ρ = 0.06), the \( \overline{RD}\% \) shows better results in the East (\( \overline{RD}\%=-1.89 \)) than in the North (\( \overline{RD}\%=-0.28 \)), partly due to the larger m in East London (m = 178). Thus, even though the low ρ partly explains the small gain in precision of SEBLUP estimates over EBLUPs, the spatial autocorrelation parameter cannot be used on its own to explain why this gain is larger for m = 178 than for m = 61. Second, although m is slightly larger in South London (m = 120) than in Central London (m = 111), the \( \overline{RD}\% \) is larger in absolute value in Central London (\( \overline{RD}\%=-7.25 \)) due to its high spatial autocorrelation parameter (ρ = 0.74). Finally, the best relative results of the SEBLUP estimator have been obtained in Central London, where both m (111) and ρ (0.74) are large, and in West London for the same reason (m = 140 and ρ = 0.60). When all areas are considered together, m is large (610) and ρ is equal to 0.46, and thus the average Relative Difference between the \( \overline{RRMSE} \) of EBLUP and SEBLUP estimates is quite large (\( \overline{RD}\%=-4.76 \)). These results provide empirical evidence to support the simulation study results: the SEBLUP should be used in those studies producing estimates of geographically concentrated phenomena (Baller et al. 2001) for a large number of areas, while the EBLUP is preferred when producing estimates of non-geographically concentrated phenomena with a small spatial autocorrelation coefficient for a small number of domains.

Table 5 also shows that the level of spatial clustering of the public confidence in police work is much larger in Central and Western London than in the North and East, and there is a medium level of spatial concentration in the South. In other words, while neighboring areas tend to show similar values of confidence in the police in Central and Western London, and thus policing interventions may be planned for groups of areas, in the North and East place-based policing strategies should be adjusted to the characteristics and needs of each small area.

Mapping the Confidence in Police Work

Goodness-of-fit indices are analyzed to assess the models used in this application. Log-likelihood, AIC and BIC measures show that the SEBLUP model has a better goodness of fit than the EBLUP, and thus we focus on its results (see Table 6).
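
For reference, the two information criteria can be computed from a fitted model’s log-likelihood as sketched below, using the standard definitions AIC = 2k − 2 log L and BIC = k log(n) − 2 log L. The log-likelihood values and parameter counts in the example are hypothetical and are not the figures reported in Table 6; only n = 610 (the number of areas) comes from the text.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 log L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k log(n) - 2 log L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical log-likelihoods and parameter counts, for illustration only.
# The spatial model is given one extra parameter (the autocorrelation rho).
n_areas = 610
models = {"EBLUP": (410.0, 7), "SEBLUP": (425.0, 8)}
for name, (loglik, k) in models.items():
    print(name, round(aic(loglik, k), 2), round(bic(loglik, k, n_areas), 2))
```

A model with a higher log-likelihood is preferred only if the gain outweighs the penalty for its additional parameters, which is what the lower AIC and BIC values indicate.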

Table 6 Goodness-of-fit indices of EBLUP and SEBLUP models of confidence in police work

Table 7 shows the results of the EBLUP and SEBLUP models fitted to produce estimates of confidence in police work for all London wards. All covariates but the crime rate show significant relations with the confidence in police work (Kwak and McNeeley 2017; Wu et al. 2009). The proportion of citizens unemployed is the most important covariate introduced in our area-level SEBLUP model, followed by the concentration of ethnic minorities and the proportion of immigrants (Dai and Johnson 2009; Kwak and McNeeley 2017; Sampson and Bartusch 1998; Wu et al. 2009). The mean income also shows a significant but smaller positive relation with the confidence in the police.

Table 7 EBLUP and SEBLUP models of confidence in police work (all areas)

Figure 2 shows the geographical distribution of SEBLUP estimates of confidence in police work at ward level in Greater London, where lighter shades of grey indicate a lower proportion of citizens who think that police do a good or excellent job, and darker shades of grey show higher confidence in police work. The highest estimates of confidence in police work have been found in eight wards located in Central London, six of which are in Kensington and Chelsea (Chelsea Riverside (97.3%), Campden (89.99%), Earl’s Court (86.66%), Courtfield (86.28%), Queen’s Gate (85.47%) and Brompton and Hans Town (84.62%)) and two in Westminster (Lancaster Gate (88.46%) and Marylebone High Street (88.38%)). There are also high proportions of citizens who think that police do a good job in some western areas of Harrow, Richmond upon Thames and Hammersmith and Fulham. The lowest proportion has been estimated in Alexandra, located in Haringey (43.79%), followed by 27 eastern wards distributed among Lewisham, Newham, Barking and Dagenham, Redbridge, Tower Hamlets and Greenwich. From a broader perspective, these results add evidence to the estimates produced by the London Mayor’s Office for Policing and Crime (https://maps.london.gov.uk/NCC/) at a larger geographical scale, which show the highest levels of trust in policing in Central and Southwest London and lower trust in the police in East and North London.

Fig. 2 Proportion of citizens who think the police do a good or excellent job (SEBLUP estimates). Division based on quartiles

Model Diagnostics

We provide diagnostics of our spatial models by analysing the normality of the standardised residuals of SEBLUP estimates. Residuals are produced as suggested by Petrucci and Salvati (2006:178) and normal q-q plots are shown in Fig. 3. Most residuals show no important deviations from normality. The Shapiro-Wilk test for normality fails to reject the null hypothesis of normal distribution in all five cases: W = 0.984 and p-value = 0.204 in the case of Central London, W = 0.969 and p-value = 0.128 in the model fitted for North London, W = 0.967 and p-value = 0.089 for South London, W = 0.939 and p-value = 0.079 in the case of East London, and W = 0.975 and p-value = 0.098 for West London. We also fail to reject the null hypothesis of normal distribution for the model fitted with all areas: W = 0.964 and p-value = 0.121.
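
The coordinates of a normal q-q plot can be obtained as sketched below. The function name and the example residuals are hypothetical, and the plotting positions (i − 0.5)/n are one common choice among several, assumed here for illustration. The sketch covers the q-q plot only; the Shapiro-Wilk statistic is not computed.

```python
from statistics import NormalDist
from typing import List, Sequence, Tuple


def qq_points(residuals: Sequence[float]) -> List[Tuple[float, float]]:
    """Pair ordered residuals with standard normal theoretical quantiles."""
    n = len(residuals)
    if n < 2:
        raise ValueError("at least two residuals are needed")
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n for i = 1..n (an assumed convention).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical standardised residuals, for illustration only.
points = qq_points([-1.2, 0.3, 0.1, -0.4, 1.5])
for theoretical_q, sample_q in points:
    print(f"{theoretical_q:6.3f}  {sample_q:6.3f}")
```

If the residuals are approximately normal, the plotted pairs lie close to a straight line.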

Fig. 3 Normal q-q plots of standardised residuals of SEBLUP estimates

Discussion and Conclusions

Place-based policing requires the incorporation of SAE when producing maps of confidence in police work at small geographical levels. By producing reliable small area estimates of confidence in policing, we allow for advanced spatial analyses to explain its distribution and provide precise maps to develop place-based interventions to enhance confidence in police work and reduce crime and disorder. While police records are easily geocoded and mapped, advanced statistical analyses are required to produce reliable estimates of survey-recorded confidence in the police. Small geographical areas are unplanned domains in most crime surveys, and thus model-based SAE is needed to produce estimates of adequate precision (Rao and Molina 2015). Due to the typically high levels of spatial autocorrelation of confidence in policing, we propose making use of the SEBLUP to increase the reliability of estimates produced from crime surveys. The simulation study and application results make it possible to identify the cases in which the SEBLUP produces better estimates than traditional model-based estimators when applied to policing data. Our estimates of confidence in police work not only have tactical and strategic value for designing place-based policing interventions, but are also important from an accountability point of view: government and auditors’ inspections into the police expect that police forces enhance their public confidence and legitimacy (HMICFRS 2017).

We have assessed the SEBLUP performance under different scenarios with different numbers of areas and spatial correlation parameters. Our results show that the SEBLUP tends to outperform the EBLUP not only when ρ moves away from zero and is close to 1 or −1, but also when m is large. The SEBLUP performs better as the number of areas under study increases, while the EBLUP estimator outperforms the SEBLUP when ρ ≅ 0 and m is small. Future work will investigate the SEBLUP using different simulation parameters with smaller intra-class correlations and more complex contiguity matrices, such as second-order ‘Queen Contiguity’ and distance-weighted matrices. Furthermore, future research will examine whether small area estimators that borrow strength from temporal series, such as the Rao and Yu (1994) model, provide more reliable estimates in policing research, since confidence in policing is known to be quite stable over time and thus temporally correlated random effects can be used in this field.

From a substantive perspective, our estimates show that citizens are more confident in policing in most Central and Southwestern London neighborhoods, while estimates show a lower confidence in the police in East and North London. Unlike previous research, our estimates are produced at ward level and thus not only allow mapping the distribution of confidence in police work at a large scale, but also bring to light internal heterogeneity in the levels of confidence at neighborhood level. In Central London, for example, estimates are significantly higher north of the River Thames, where Westminster and Kensington and Chelsea are located, than south of the river. Although crime rates are higher north of the river, these do not appear to be as significant as the unemployment rate and concentration of minorities, which are more prominent in the southern part of Central London, to explain the distribution of confidence in policing. Our estimates also reveal clear differences within West London, where confidence in police is clearly higher in most Hounslow wards than in the majority of Ealing neighborhoods, where unemployment and deprivation are more common. These estimates are useful to develop more accurate explanations of the distribution of confidence in police work and to design place-based policing strategies to increase the public confidence in policing and their cooperation with police services.

Unemployment rates, the concentration of minorities and immigrants, and average income have been shown to be good area-level predictors of confidence in police work (Bradford et al. 2017; Dai and Johnson 2009; Jackson et al. 2013; Kwak and McNeeley 2017; Sampson and Bartusch 1998; Wu et al. 2009). The two most important covariates (among those included in our models) to explain the geographies of confidence in police work in London are the unemployment rate and the concentration of ethnic minorities. As argued by Sampson and Bartusch (1998:801): “perhaps we should not be surprised that those most exposed to the numbing reality of pervasive segregation and economic subjugation become cynical about human nature and legal systems of justice”. High levels of unemployment and ethnic segregation, as forms of deprivation, might explain why residents whose local identities are shaped by deprivation are less willing to trust and cooperate with police services (Kwak and McNeeley 2017), and also with other government services (Dai and Johnson 2009). Other researchers argue that this might also be due to excessive police control and use of force in communities with a larger concentration of minorities (Dai and Johnson 2009). Open access to Metropolitan Police stop and search data became available only after 2015, and spatial information about police use of stop and search only from mid-2016, and thus we could not include this covariate in our analyses (based on survey data from 2012). However, our area-level estimates of confidence in police work from 2012 show a significant negative Spearman correlation with the proxy measure of stop and search in 2017 (stop and search count: Spearman’s ρ = −0.22, p-value < 0.01; stop and search per resident: Spearman’s ρ = −0.16, p-value < 0.01). Thus, future research with newer survey data should incorporate this covariate to explore the effect of stop and search on the confidence in police work.
Similar mechanisms are used to explain the effect of the concentration of immigrants and average income on confidence in police work, although these show smaller coefficients in our study. Immigrants and citizens with low income tend to cluster in areas with high levels of concentrated disadvantage (and possibly greater police control and use of force), where social attitudes of distrust towards the police are likely to emerge.
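
The Spearman correlation between area-level estimates of confidence and stop and search counts can be computed as the Pearson correlation of their ranks. The sketch below illustrates this with five hypothetical wards; the function names and data are illustrative, and no p-value is computed.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical ward-level values, for illustration only.
confidence = [0.81, 0.62, 0.55, 0.70, 0.48]
stop_search = [12, 40, 35, 20, 60]
print(round(spearman(confidence, stop_search), 2))  # -0.9
```

A negative coefficient, as in the results reported above, indicates that wards with more stop and search tend to rank lower in confidence; it does not by itself establish a causal effect.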

Future research with newer survey data will focus on identifying other available covariates (e.g. residential instability, collective efficacy, stop and search) to estimate confidence in the police at ward or smaller spatial levels, and on examining causal mechanisms between economic deprivation, ethnic segregation and confidence in police work (Dai and Johnson 2009). Further research will also replicate similar analyses in other cities and countries with different social and demographic characteristics (and available survey data) to assess the generalizability of the current study’s findings. In addition, new SAE methods are needed that deal with semicontinuous zero-inflated skewed data in policing research (see Dreassi et al. 2014). By expanding the body of research that makes use of SAE techniques in policing research and practice, these methods may become a core tool in survey-based crime analysis and place-based policing.