Abstract
In this paper we address the following question: “Is it possible to obtain reliable estimates of the prevalence of anemia in children under five years of age in the districts of Peru?” Specifically, we aim to understand to what extent the basic and the spatial Fay–Herriot models can compensate for the inadequate sample sizes found in most of the sampled districts, and whether the choice of spatial neighbors affects the resulting inference. We also explore how to choose an optimal definition of the neighbors. Our research therefore focuses on the prediction accuracy of these models and on the sensitivity of the results to the definition of “neighbor”. We use data from the 2019 Demographic and Family Health Survey and from the National Census carried out in 2017.
References
Alcázar L (2012) Impacto económico de la anemia en el Perú. Grupo de Análisis para el Desarrollo (GRADE)
Anselin L (1992) Spatial econometrics. Methods and models. Kluwer, Boston
Banerjee S, Carlin B, Gelfand A (2004) Hierarchical modeling and analysis for spatial data. Chapman and Hall, New York
Bivand RS, Pebesma E, Gomez-Rubio V (2013) Applied spatial data analysis with R, 2nd edn. Springer, New York
Chen S, Lahiri P (2003) A comparison of different MSPE estimators of EBLUP for the Fay-Herriot model. In: Proceedings of the section on survey research methods. Washington, DC, American Statistical Association, pp 903–911
Cressie N (1993) Statistics for spatial data. Wiley, New York
Cressie N, Chan NH (1989) Spatial modeling of regional variables. J Am Stat Assoc 84:393–401
Datta GS, Lahiri PS (2000) A unified measure of uncertainty of estimated best linear unbiased predictors in small area estimation problems. Stat Sinica 10:613–627
Datta GS, Rao JNK, Smith DD (2005) On measuring the variability of small area estimators under a basic area level model. Biometrika 92:183–196
Fay RE, Herriot RA (1979) Estimates of income for small places: an application of James-Stein procedures to census data. J Am Stat Assoc 74:269–277
Hall P, Maiti T (2006) On parametric bootstrap methods for small area prediction. J Roy Stat Soc B 68:221–238
Harville D, Jeske D (1992) Mean squared error of estimation or prediction under a general linear model. J Am Stat Assoc 87:724–731
INEI, Perú (2019) Encuesta Demográfica y de Salud Familiar-ENDES. https://proyectos.inei.gob.pe/microdatos/
Jiang J, Lahiri PS, Wan SM (2002) A unified jackknife theory for empirical best prediction with M-estimation. Ann Stat 30:1782–1810
Kackar RN, Harville DA (1984) Approximations for standard errors of estimators for fixed and random effects in mixed models. J Am Stat Assoc 79:853–862
Kreutzmann A, Pannier S, Rojas-Perilla N, Schmid T, Templ M, Tzavidis N (2019) The R Package EMDI for estimating and mapping regionally disaggregated indicators. J Stat Softw 91(7):1–33. https://doi.org/10.18637/jss.v091.i07
Marhuenda Y, Molina I, Morales D (2013) Small area estimation with spatio-temporal Fay–Herriot models. Comput Stat Data Anal 58:308–325
Ministerio de Salud (2014) Plan Nacional para la Reducción de la Desnutrición Crónica Infantil y la Prevención de la Anemia en el país: 2014-2016. RM No 258-2014 Lima, Minsa
Ministerio de Salud (2017) Plan Nacional para la Reducción y Control de la Anemia Materno Infantil y la Desnutrición Crónica Infantil en el Perú: 2017-2021. RM No 249-2017 Lima, Minsa
Molina I, Salvati N, Pratesi M (2009) Bootstrap for estimating the MSE of the spatial EBLUP. Comput Stat 24:441–458
Moran PAP (1950) Notes on continuous stochastic phenomena. Biometrika 37(1):17–23
Organización Mundial de la Salud (2011) Concentraciones de hemoglobina para diagnosticar la anemia y evaluar su gravedad. Ginebra, OMS. (WHO/NMH/NHD/MNM/11.1)
Pebesma E, Bivand R (2023) Spatial data science with applications. Chapman & Hall
Petrucci A, Salvati N (2006) Small area estimation for spatial correlation in watershed erosion assessment. J Agric Biol Environ Stat 11(2):169–182
Pfeffermann D (2002) Small area estimation- new developments and directions. Int Stat Rev 70:125–143
Pfeffermann D, Tiller RB (2005) Bootstrap approximation to prediction MSE for state-space models with estimated parameters. J Time Ser Anal 26:893–916
Prasad NGN, Rao JNK (1990) The estimation of the mean squared error of small-area estimators. J Am Stat Assoc 85(409):163–171
Pratesi M, Salvati N (2008) Small area estimation: the EBLUP estimator based on spatially correlated random area effects. Stat Method Appl 17(1):113–141
Pratesi M, Salvati N (2009) Small area estimation in the presence of correlated random area effects. J Off Stat 25(1):37–53
R Core Team (2020) R: language and environment for statistical computing. R foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
Rao JNK (2003) Small area estimation. Wiley, London
Rao JNK, Molina I (2015) Small area estimation. Wiley series in survey methodology, 2nd edn. Wiley, Hoboken
Singh BB, Shukla K, Kundu D (2005) Spatial-temporal models in small area estimation. Surv Methodol 31(2):183–195
World Health Organization (2004) Centers for Disease Control and Prevention. Assessing the Iron Status of Populations, Ginebra, WHO
Acknowledgments
The authors thank the reviewers for very thoughtful and helpful comments.
Funding
This research is supported by a grant from the Unidad de Investigación de la FIEECS-UNI.
Ethics declarations
Conflict of interest
We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome. We confirm that we have given due consideration to the protection of intellectual property associated with this work and that there are no impediments to publication, including the timing of publication, with respect to intellectual property. In so doing we confirm that we have followed the regulations of our institutions concerning intellectual property.
The authors have no conflicts of interest to disclose.
Consent to publication
This manuscript has not been published anywhere and is not being considered for publication elsewhere.
Appendices
Appendix A
For this study, we used R (see R Core Team 2020). The first step was to create a SpatialPointsDataFrame object that holds both the geographic information and the survey data. To do so, we employed the SpatialPointsDataFrame function from the sp package (see Bivand et al. 2013), as follows.
library(sp)
proj <- CRS("+proj=longlat +datum=WGS84 +no_defs")
# sp expects coordinates as (x, y), i.e. (longitude, latitude)
data.sp <- SpatialPointsDataFrame(
  coords = cbind(data$Longitude, data$Latitude),
  data = data, proj4string = proj)
To identify neighboring districts from their geographical coordinates, we employed the spdep package (see Pebesma and Bivand 2023). More precisely, we used the knearneigh function, which returns, for each district, its K nearest neighbors.
library(spdep)
neighbors <- knearneigh(coordinates(data.sp), k = K)
We then use the list of neighbors from this output to build the matrix of spatial weights, W, by assigning a weight of 1/K to each of a district's K neighbors, so that every row of W sums to one. Finally, the fh function from the emdi package (see Kreutzmann et al. 2019) was used to fit both the basic and the spatial Fay–Herriot models.
spatial.fh <- fh(fixed = formula, vardir = "VR",
  combined_data = data, domains = "District", method = "reml",
  correlation = "spatial", corMatrix = W,
  MSE = TRUE, mse_type = "spatialparbootbc")
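Here, the variables "VR" and "District" refer to the variance of the direct estimator and the district identification code, respectively. Note that omitting the third line (the correlation and corMatrix arguments) would result in fitting the basic Fay–Herriot model; in that case, mse_type should presumably also be switched to a non-spatial option such as "analytical", since "spatialparbootbc" is a bootstrap for the spatial model.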
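The construction of the weight matrix W described above is not shown in code. The following is a minimal sketch of one way it might be done, not necessarily the procedure used for the results in this paper. It assumes the spdep helpers knn2nb and nb2mat, and it uses hypothetical toy coordinates (coords) in place of the district data; W_manual is an illustrative name for an equivalent base-R construction.

```r
library(spdep)

set.seed(1)
# Hypothetical toy centroids as (longitude, latitude); NOT the survey data
n <- 20
coords <- cbind(lon = runif(n, -81, -69), lat = runif(n, -18, 0))
K <- 3

# knn$nn is an n x K matrix: row i holds the indices of the K nearest
# neighbors of point i
knn <- knearneigh(coords, k = K)

# Convert to a neighbors list, then to a row-standardized weight matrix:
# style = "W" divides each row by its number of neighbors, i.e. 1/K here
nb <- knn2nb(knn)
W <- nb2mat(nb, style = "W")

# Equivalent construction directly from the index matrix
W_manual <- matrix(0, n, n)
for (i in seq_len(n)) {
  W_manual[i, knn$nn[i, ]] <- 1 / K
}
```

Note that a K-nearest-neighbor matrix of this kind is generally not symmetric: district j can be among the K nearest neighbors of district i without the converse being true. The rows of W must also be in the same order as the domains in the data passed to fh.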
Appendix B
About this article
Cite this article
Sikov, A., Cerda-Hernandez, J. Estimating the prevalence of anemia rates among children under five in Peruvian districts with a small sample size. Stat Methods Appl 32, 1779–1804 (2023). https://doi.org/10.1007/s10260-023-00698-x
DOI: https://doi.org/10.1007/s10260-023-00698-x