On the choice of the number of Monte Carlo iterations and bootstrap replicates in Empirical Best Prediction

Adam Chwila; Tomasz Żądło

doi:10.21307/stattrans-2020-013


Adam Chwila, University of Economics in Katowice, Katowice, Poland. ORCID: https://orcid.org/0000-0003-4671-4298
Tomasz Żądło, University of Economics in Katowice, Katowice, Poland. ORCID: https://orcid.org/0000-0003-0638-0748

Statistics in Transition new series, vol. 21, no. 2, 2020, pages 35-60. Published online: 5 June 2020. DOI: 10.21307/stattrans-2020-013


ABSTRACT

Empirical Best Predictors (EBPs) are widely used for small area estimation purposes. In the case of longitudinal surveys, this class of predictors can be used to predict any given population or subpopulation characteristic for any time period, including future periods. Generally, the value of an EBP is computed by means of Monte Carlo algorithms, while its MSE is usually estimated using the parametric bootstrap method. Model-based simulation studies of the properties of the predictors require numerous repetitions of the random generation of population data. This leads to a question about the dependence between the number of iterations in all the procedures and the stability of the results. The aim of the paper is to show this dependence and to propose methods of choosing the appropriate number of iterations in practice, using a set of real economic longitudinal data available at the United States Census Bureau website.
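The dependence between the number of Monte Carlo iterations and the stability of the result can be illustrated with a minimal sketch. It is not the procedure used in the paper: the toy target E[exp(Y)] with Y ~ N(mu, sigma^2), the function names `mc_predictor` and `stability`, and all parameter values are assumptions chosen only for illustration. The sketch repeats the Monte Carlo approximation many times for each number of iterations L and reports the relative spread of the results across runs.

```python
import numpy as np


def mc_predictor(n_iter, rng, mu=1.0, sigma=0.5):
    """Monte Carlo approximation of E[exp(Y)] for Y ~ N(mu, sigma^2).

    Toy stand-in for a predictor that has to be approximated by simulation;
    mu and sigma are illustrative assumptions, not values from the paper.
    """
    draws = rng.normal(mu, sigma, size=n_iter)
    return np.exp(draws).mean()


def stability(n_iter, n_runs=200, seed=0):
    """Relative spread (coefficient of variation) of the approximation
    across n_runs independent Monte Carlo runs of length n_iter."""
    rng = np.random.default_rng(seed)
    values = np.array([mc_predictor(n_iter, rng) for _ in range(n_runs)])
    return values.std(ddof=1) / values.mean()


# Compare the run-to-run spread for increasing numbers of iterations L.
spreads = {L: stability(L) for L in (50, 500, 5000)}
for L, cv in spreads.items():
    print(f"L = {L:5d}: relative spread across runs = {cv:.4f}")
```

In this toy setting the spread shrinks roughly in proportion to 1/sqrt(L), so one possible rule of thumb is to increase L until the spread falls below a tolerance fixed in advance.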

KEYWORDS

survey sampling, economic longitudinal data, prediction for future periods
