There are many models in the current statistical literature for making inferences based on samples selected from a finite population. Parametric models may be problematic because statistical inference is sensitive to parametric assumptions. The Dirichlet process (DP) prior is very flexible and determines the complexity of the model. It is indexed by two hyperparameters: the baseline distribution and the concentration parameter. We address two distinct problems in this article. First, we review the current sampling methods for the concentration parameter, which assume a continuous baseline distribution. We compare three methods: the adaptive rejection method, the mixture of Gammas method and the grid method. We also propose a new method based on the ratio of uniforms. Second, in practice, some survey responses are known to be discrete. If a continuous distribution is adopted as the baseline distribution, the model is misspecified and standard inference may be invalid. We propose a discrete baseline approach to the DP prior and sample the unobserved responses from the finite population using both a Polya urn scheme and a Multinomial distribution. We apply our discrete baseline approach to a Phytophthora data set.
concentration parameter, discrete baseline, empirical study, grid method, nonparametric Bayesian statistics
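As a rough illustration of two ideas named in the abstract, the following Python sketch shows a generic grid sampler for the concentration parameter and a Polya urn draw of the unobserved responses. It is a minimal sketch under stated assumptions, not the authors' implementation. The grid sampler uses the standard continuous-baseline likelihood for the number of distinct values, proportional to alpha^k Γ(alpha)/Γ(alpha + n), with an assumed Gamma(a, b) prior. The function names, the grid bounds, the prior settings and the uniform discrete baseline on {0, ..., 5} are all hypothetical choices made for illustration.

```python
import math
import random


def log_posterior_alpha(alpha, k, n, a=1.0, b=1.0):
    """Unnormalized log posterior of alpha given k distinct values among n.

    Likelihood: alpha^k * Gamma(alpha) / Gamma(alpha + n) (continuous baseline).
    Prior: Gamma(shape=a, rate=b); this prior is an assumption of the sketch.
    """
    return ((a - 1.0 + k) * math.log(alpha) - b * alpha
            + math.lgamma(alpha) - math.lgamma(alpha + n))


def sample_alpha_grid(k, n, rng, upper=50.0, m=1000):
    """Draw alpha by discretizing (0, upper) into m equal cells."""
    step = upper / m
    grid = [(i + 0.5) * step for i in range(m)]          # cell midpoints
    logp = [log_posterior_alpha(g, k, n) for g in grid]
    mx = max(logp)                                        # guard against underflow
    weights = [math.exp(lp - mx) for lp in logp]
    centre = rng.choices(grid, weights=weights, k=1)[0]   # pick a cell
    return centre + (rng.random() - 0.5) * step           # jitter within the cell


def polya_urn_fill(sample, pop_size, alpha, baseline_draw, rng):
    """Draw the unobserved responses one at a time with a Polya urn.

    Each new value comes from the baseline with probability
    alpha / (alpha + current urn size); otherwise it copies a value
    already in the urn, chosen uniformly at random.
    """
    urn = list(sample)
    while len(urn) < pop_size:
        if rng.random() < alpha / (alpha + len(urn)):
            urn.append(baseline_draw(rng))
        else:
            urn.append(rng.choice(urn))
    return urn[len(sample):]                              # nonsampled units only


if __name__ == "__main__":
    rng = random.Random(1)
    # Illustrative values only: k = 4 distinct values among n = 10 responses.
    alpha = sample_alpha_grid(k=4, n=10, rng=rng)
    observed = [0, 0, 1, 3, 3, 3]
    # Hypothetical discrete baseline: uniform on the integers 0..5.
    unobserved = polya_urn_fill(observed, 20, alpha,
                                lambda r: r.randint(0, 5), rng)
    print(alpha, unobserved)
```

With a discrete baseline, ties can also arise from the baseline itself, so the likelihood used in `log_posterior_alpha` would need to be adapted; here it only illustrates the grid idea for the continuous-baseline case.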