Probability vs. Nonprobability Sampling: From the Birth of Survey Sampling to the Present Day

Graham Kalton

doi:https://doi.org/10.59170/stattrans-2023-029

Graham Kalton, Joint Program in Survey Methodology, University of Maryland, College Park, MD, USA. ORCID: https://orcid.org/0000-0002-9685-2616

Statistics in Transition new series, vol. 24, no. 3, 2023, pp. 1–22. Published online: 13 June 2023. https://doi.org/10.59170/stattrans-2023-029

ABSTRACT

At the beginning of the 20th century, there was an active debate about random selection of units versus purposive selection of groups of units for survey samples. Neyman’s (1934) paper tilted the balance strongly towards varieties of probability sampling combined with design-based inference, and most national statistical offices have adopted this method for their major surveys. However, nonprobability sampling has remained in widespread use in many areas of application, and over time there have been challenges to the Neyman paradigm. In recent years, the balance has tilted towards greater use of nonprobability sampling for several reasons, including: the growing imperfections and costs in applying probability sample designs; the emergence of the internet and other sources for obtaining survey data from very large samples at low cost and at high speed; and the current ability to apply advanced methods for calibrating nonprobability samples to conform to external population controls. This paper presents an overview of the history of the use of probability and nonprobability sampling from the birth of survey sampling at the time of A. N. Kiær (1895) to the present day.

KEYWORDS

Anders Kiær, Jerzy Neyman, representative sampling, quota sampling, hard-to-survey populations, model-dependent inference, internet surveys, big data, administrative records.

REFERENCES

Aldrich, J., (2008). Professor A. L. Bowley’s theory of the representative method. (Discussion Papers in Economics and Econometrics, 801) University of Southampton. https://eprints.soton.ac.uk/150493.

Baker, R., Blumberg, S. J., Brick, J. M., Couper, M. P., Courtright, M., Dennis, J. M., Dillman, D., Frankel, M. R., Garland, G., Groves, R. M., Kennedy, C., Krosnick, J., Lavrakas, P. J., (2010). AAPOR Report on Online Panels. Public Opinion Quarterly, 74(4), pp. 711–781.

Bauer, J. J., (2014). Selection errors of random route samples. Sociological Methods and Research, 43(3), pp. 519–544.

Bauer, J. J., (2016). Biases in random route surveys. Journal of Survey Statistics and Methodology, 4(2), pp. 263–287.

Beaumont, J.-F., Rao, J. N. K., (2021). Pitfalls of making inferences from non-probability samples: Can data integration through probability samples provide remedies? The Survey Statistician, 83, pp. 11–22.

Bennett, S., (1993). Cluster sampling to assess immunization: a critical appraisal. Bulletin of the International Statistical Institute, 49th Session, 55(2), pp. 21–35.

Bowley, A. L., (1913). Working-class households in Reading. Journal of the Royal Statistical Society, 76(7), pp. 672–701.

Bradley, V. C., Kuriwaki, S., Isakov, M., Sejdinovic, D., Meng, X.-L., Flaxman, S., (2021). Unrepresentative big surveys significantly overestimated US vaccine uptake. Nature, 600, pp. 695–700.

Brewer, K. R. W., (1963). Ratio estimation in finite populations: some results deducible from the assumption of an underlying stochastic process. Australian Journal of Statistics, 5, pp. 93–105.

Caradog Jones, D., (1949). Social Surveys. Hutchinson’s University Library, London.

CDC, (2019). Community Assessment for Public Health Emergency Response (CASPER) Toolkit. 3rd ed., CDC, Atlanta. https://www.cdc.gov/nceh/casper/.

Chambers, R., Clark, R., (2012). An Introduction to Model-Based Survey Sampling with Applications. Oxford University Press, Oxford.

Chambers, R. L., Skinner, C. J., Eds., (2003). Analysis of Survey Data. Wiley, Chichester.

Cochran, W. G., (1953). Sampling Techniques. Wiley, New York.

Converse, J. M., (2017). Survey Research in the United States: Roots and Emergence 1890-1960. Routledge, New York.

DuMouchel, W. H., Duncan, G. J., (1983). Using sample survey weights in multiple regression analyses of stratified samples. Journal of the American Statistical Association, 78, pp. 535–543.

Ghosh, M., (2020). Small area estimation: its evolution in five decades (with discussion). Statistics in Transition, 21(4), pp. 1–67.

Gile, K. J., Hancock, M. S., (2010). Respondent-driven sampling: an assessment of current methodology. Sociological Methodology, 40(1), pp. 285–327.

Gini, C., Galvani, L., (1929). Di una applicazione del metodo rappresentativo. Annali di Statistica, 6(4), pp. 1–107.

Hand, D. J., (2018). Statistical challenges of administrative and transaction data (with discussion). Journal of the Royal Statistical Society, A, 181(3), pp. 555–605.

Hansen, M. H., Hurwitz, W. N., Madow, W. G., (1953). Sample Survey Methods and Theory. Volume I: Methods and Applications. Volume II: Theory. Wiley, New York.

Heckathorn, D. D., (1997). Respondent-driven sampling: a new approach to the study of hidden populations. Social Problems, 44(2), pp. 174–199.

Heeringa, S. G., West, B. T., Berglund, P. A., (2017). Applied Survey Data Analysis. Chapman & Hall/CRC, Boca Raton, FL.

Jensen, A., (1926). The report on the representative method in statistics. Bulletin of the International Statistical Institute, 22, pp. 355–376.

Kalton, G., (1983). Models in the practice of survey sampling. International Statistical Review, 51, pp. 175–188.

Kalton, G., (1991). Sampling flows of mobile human populations. Survey Methodology, 17(2), pp. 183–194.

Kalton, G., (2002). Models in the practice of survey sampling (revisited). Journal of Official Statistics, 18, pp. 129–154.

Kalton, G., (2019). Developments in survey research over the past 60 years: A personal perspective. International Statistical Review, 87(S1), pp. S10–S30.

Kalton, G., (2021). Introduction to Survey Sampling. 2nd ed. SAGE Publications, Thousand Oaks, California.

Kiær, A. N., (1976). The Representative Method of Statistical Surveys. English translation, Statistisk Sentralbyrå, Oslo.

Kim, J. K., Wang, Z., (2019). Sampling techniques for big data analysis. International Statistical Review, 87(S1), pp. S177–S191.

Kish, L., (1995). The hundred years’ war of survey sampling. Statistics in Transition, 2(5), pp. 813–830.

Korn, E. L., Graubard, B. I., (1999). Analysis of Health Surveys. Wiley, New York.

Kruskal, W., Mosteller, F., (1980). Representative sampling, IV: The history of the concept in statistics, 1895-1939. International Statistical Review, 48(2), pp. 169–195.

Lazer, D., Kennedy, R., King, G., Vespignani, A., (2014). The parable of Google flu: Traps in big data analysis. Science, 343, pp. 1203–1205.

Levy, P. S., Lemeshow, S., (2008). Sampling of Populations. Methods and Applications. 4th ed. Wiley, Hoboken, NJ.

Lie, E., (2002). The rise and fall of sampling surveys in Norway, 1875–1906. Science in Context, 15(3), pp. 385–409.

Lohr, S. L., Brick, J. M., (2017). Roosevelt predicted to win: Revisiting the 1936 Literary Digest Poll. Statistics, Politics, and Policy, 8(1), pp. 65–84.

MacKellar, D. A., Gallagher, K. M., Findlayson, T., Sanchez, T., Lansky, A., Sullivan, P. S., (2007). Surveillance of HIV risk and prevention behaviors of men who have sex with men—a national application of venue-based, time-space sampling. Public Health Reports, 122 (1), Supplement 1, pp. 39–47.

Mahalanobis, P. C., (1946). Recent experiments in statistical sampling in the Indian Statistical Institute (with discussion). Journal of the Royal Statistical Society, 109, pp. 325–378.

Meng, X.-L., (2018). Statistical paradises and paradoxes in big data (I): Law of large populations, big data paradox, and the 2016 US presidential election. Annals of Applied Statistics, 12(2), pp. 685–726.

Moser, C. A., Kalton, G., (1971). Survey Methods in Social Investigation. 2nd ed. Heinemann, London.

Moser, C. A., Stuart, A., (1953). An experimental study of quota sampling. Journal of the Royal Statistical Society, A, 116, pp. 349–405.

Neyman, J., (1934). On two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society, 97, pp. 558–625.

Rao, J. N. K., (2021). On making valid inferences by integrating data from surveys and other sources. Sankhya B, 83, pp. 242–272.

Rao, J. N. K., Molina, I., (2015). Small Area Estimation. 2nd ed. Wiley, Hoboken, NJ.

Royall, R. M., (1970). On finite population sampling theory under certain regression models. Biometrika, 57, pp. 377–387.

Royall, R. M., (1976). The linear least squares prediction approach to two-stage sampling. Journal of the American Statistical Association, 71, pp. 657–664.

Särndal, C.-E., Swensson, B., Wretman, J., (1992). Model Assisted Survey Sampling. Springer-Verlag, New York.

Skinner, C. J., Holt, D., Smith, T. M. F., Eds., (1989). Analysis of Complex Surveys. Wiley, Chichester.

Smith, T. M. F., (1976). The foundations of survey sampling: a review. Journal of the Royal Statistical Society, A, 139, pp. 183–204.

Smith, T. M. F., (1994). Sample surveys 1975-90; an age of reconciliation? International Statistical Review, 62, pp. 5–34.

Stephan, F. F., (1948). History of the uses of modern sampling procedures. Journal of the American Statistical Association, 43(241), pp. 12–39.

Stephan, F. F., McCarthy, P. J., (1958). Sampling Opinions. An Analysis of Survey Procedures. Wiley, New York.

Stephenson, C. B., (1979). Probability sampling with quotas: An experiment. Public Opinion Quarterly, 43(4), pp. 477–497.

Sudman, S., (1966). Probability sampling with quotas. Journal of the American Statistical Association, 61, pp. 749–771.

Tourangeau, R., Edwards, B., Johnson, T. P., Wolter, K. M., Bates, N., Eds., (2014). Hard-to-Survey Populations. Cambridge University Press, Cambridge, UK.

Valliant, R., (2020). Comparing alternatives for estimation from nonprobability samples. Journal of Survey Statistics and Methodology, 8(2), pp. 231–263.

Valliant, R., Dorfman, A. H., Royall, R. M., (2000). Finite Population Sampling and Inference. A Prediction Approach. Wiley, New York.

Yates, F., (1949). Sampling Methods for Censuses and Surveys. Griffin, London.