Jacek Białek https://orcid.org/0000-0002-0952-5327, Tomasz Panek https://orcid.org/0000-0002-1034-7222, Jan Zwierzchowski

© Jacek Białek, Tomasz Panek, Jan Zwierzchowski. Article available under the CC BY-SA 4.0 licence




One of the greatest challenges facing official statistics in the 21st century is the use of alternative sources of price data (scanner and scraped data) in the analysis of price dynamics, which also involves selecting an appropriate price index formula at the elementary group (5-digit) level. When consumer price indices of goods and services are constructed, a number of subjective decisions are made at different stages, e.g. regarding the choice of data sources and the types of indices used for estimation. All of these decisions can bias consumer price indices and thus contribute to the overall uncertainty about the resulting index values. Measuring how robust consumer price indices are makes it possible to assess the impact that decisions made at the different stages of index construction have on the index values. This assessment involves uncertainty analysis and sensitivity analysis. The purpose of the study described in the article was to determine by how much and in which direction the consumer price index changes when scanner and scraped data are included in the analysis in addition to the price data collected by enumerators. The impact of these new data sources was assessed by analysing uncertainty and sensitivity under the deterministic approach. To the best of the authors’ knowledge, this is a novel application of robustness analysis to the measurement of inflation using new data sources. The empirical study was based on data for February and March 2021; the scanner and scraped data, which covered selected categories of food products, were obtained from one retail chain that operates hundreds of points of sale in Poland and also sells products online. It was found that the choice of data source has the most significant impact on the final value of the index at the elementary group level, while the choice of the aggregation formula used to consolidate different data sources is of secondary importance.
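As a rough illustration of what a deterministic robustness analysis of this kind can look like, the sketch below recomputes a toy index under every combination of two decisions: which data sources are used and which formula consolidates them. It is not the authors' procedure. The prices are invented, the Jevons index stands in for whichever elementary formula the study applied, and the names `PRICES`, `SOURCE_SETS`, `AGGREGATORS`, `run_scenarios` and `factor_spread` are hypothetical.

```python
from itertools import product
from math import exp, log
from statistics import mean

# Invented (February, March) prices of matched products per data source.
# These numbers are assumptions for illustration, NOT data from the study.
PRICES = {
    "enumerator": [(4.00, 4.10), (2.50, 2.50), (7.20, 7.35)],
    "scanner": [(3.90, 3.80), (2.40, 2.45), (7.00, 6.90)],
    "scraped": [(4.20, 4.25), (2.60, 2.55), (7.50, 7.60)],
}

# Decision 1 (assumed levels): which sources enter the index.
SOURCE_SETS = {
    "enumerator only": ["enumerator"],
    "enumerator + scanner": ["enumerator", "scanner"],
    "enumerator + scraped": ["enumerator", "scraped"],
    "all three": ["enumerator", "scanner", "scraped"],
}

# Decision 2 (assumed levels): how per-source indices are consolidated.
AGGREGATORS = {
    "arithmetic": lambda xs: mean(xs),
    "geometric": lambda xs: exp(mean(log(x) for x in xs)),
    "harmonic": lambda xs: len(xs) / sum(1 / x for x in xs),
}


def jevons(pairs):
    """Jevons index: unweighted geometric mean of price relatives."""
    return exp(mean(log(p1 / p0) for p0, p1 in pairs))


def run_scenarios():
    """Uncertainty step: index value for every combination of decisions."""
    per_source = {name: jevons(pairs) for name, pairs in PRICES.items()}
    results = {}
    for (s_name, sources), (a_name, agg) in product(
        SOURCE_SETS.items(), AGGREGATORS.items()
    ):
        results[(s_name, a_name)] = agg([per_source[s] for s in sources])
    return results


def factor_spread(results, position):
    """Sensitivity step: mean range of the index when only the factor at
    `position` (0 = source set, 1 = aggregator) varies and the other
    factor is held fixed."""
    other = 1 - position
    groups = {}
    for key, value in results.items():
        groups.setdefault(key[other], []).append(value)
    return mean(max(v) - min(v) for v in groups.values())


if __name__ == "__main__":
    results = run_scenarios()
    print("overall range:", max(results.values()) - min(results.values()))
    print("spread due to source choice:", factor_spread(results, 0))
    print("spread due to aggregation formula:", factor_spread(results, 1))
```

With these invented prices the spread attributable to the source choice is larger than the spread attributable to the aggregation formula, but that follows from how the toy data were chosen and is not evidence for the study's finding.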


Keywords: price indices, scraped data, scanner data, robustness analysis, inflation


JEL classification: C43, E31


© 2019–2024 Copyright by Statistics Poland, some rights reserved. Creative Commons Attribution-ShareAlike 4.0 International Public License (CC BY-SA 4.0)