The choice of normalization method and rankings of the set of objects based on composite indicator values

Marek Walesiak

doi:https://doi.org/10.21307/stattrans-2018-036

Marek Walesiak, Wroclaw University of Economics, Department of Econometrics and Computer Science, Jelenia Góra, Poland. Statistics in Transition new series, vol. 19, no. 4, 2018, pp. 693-710. Published online: 3 December 2018. https://doi.org/10.21307/stattrans-2018-036

ABSTRACT

The choice of the normalization method is one of the steps in constructing a composite indicator for metric data (see, e.g., Nardo et al., 2008, pp. 19-21). Different normalization methods can lead to different rankings of the set of objects based on composite indicator values. The article takes into account 18 normalization methods and 5 aggregation measures (composite indicators). In the first step, the groups of normalization methods leading to identical rankings of the set of objects were identified; the considerations included in Table 3 reduce the number of methods to 10. Next, the article discusses a procedure for separating groups of normalization methods that lead to similar rankings of the set of objects, separately for each composite indicator formula. The proposal, based on Kendall’s tau correlation coefficient (Kendall, 1955) and cluster analysis, can reduce the problem of choosing the normalization method. Simulation results for five composite indicators and ten normalization methods were obtained with the suggested research procedure, and the approach was also illustrated by an empirical example. Based on the analysis of the dendrograms, three groups of normalization methods were separated. The biggest differences in the results of linear ordering are between methods n2 and n9a on the one hand and the remaining normalization methods on the other.
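The procedure summarized above can be illustrated with a minimal Python sketch. It is not the author's implementation (the reference list points to the R package clusterSim), and everything in it is an assumption made for illustration: the four normalization formulas and their names, the unweighted mean used as the composite indicator, the distance (1 - tau) / 2, the single-linkage clustering, and the toy data set. It only shows the general idea: rank objects under each normalization, compare rankings with Kendall's tau, and group methods that give similar rankings.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical normalization formulas (columns must not be constant).
def standardization(col):
    m, s = mean(col), pstdev(col)
    return [(x - m) / s for x in col]

def unitization(col):
    m, r = mean(col), max(col) - min(col)
    return [(x - m) / r for x in col]

def unitization_zero_min(col):
    lo, r = min(col), max(col) - min(col)
    return [(x - lo) / r for x in col]

def quotient_max(col):
    hi = max(col)
    return [x / hi for x in col]

METHODS = {f.__name__: f for f in
           (standardization, unitization, unitization_zero_min, quotient_max)}

# Toy data (made up): rows are objects, columns are stimulant variables.
DATA = [[10, 200, 3.0], [12, 150, 4.5], [9, 300, 2.0],
        [15, 180, 5.0], [11, 260, 3.5]]

def composite(data, normalize):
    """Unweighted mean of the normalized variables for each object."""
    cols = [normalize(list(c)) for c in zip(*data)]
    return [mean(row) for row in zip(*cols)]

def kendall_tau(a, b):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n, s = len(a), 0
    for i, j in combinations(range(n), 2):
        prod = (a[i] - a[j]) * (b[i] - b[j])
        s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

def tau_distances(data):
    """Distance (1 - tau) / 2 between every pair of normalization methods."""
    scores = {name: composite(data, f) for name, f in METHODS.items()}
    return {frozenset((p, q)): (1 - kendall_tau(scores[p], scores[q])) / 2
            for p, q in combinations(scores, 2)}

def single_linkage(names, dist):
    """Agglomerative single-linkage clustering; returns the merge history."""
    clusters = [frozenset([n]) for n in names]
    merges = []
    def d(a, b):
        return min(dist[frozenset((p, q))] for p in a for q in b)
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: d(*pair))
        merges.append((sorted(a), sorted(b), d(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

if __name__ == "__main__":
    for left, right, height in single_linkage(list(METHODS), tau_distances(DATA)):
        print(left, "+", right, "at distance", round(height, 3))
```

In this sketch the two unitization variants differ only by a constant shift per variable, so under a mean-based composite they yield identical rankings and merge at distance zero. This is the kind of equivalence that the first step of the article uses to reduce the number of methods.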

KEYWORDS

variables normalization, rankings, composite indicators, Kendall’s tau correlation coefficient, cluster analysis.

REFERENCES

BĄK, A., (1999). Modelowanie symulacyjne wybranych algorytmów wielowymiarowej analizy porównawczej w języku C++ [Simulation modeling of selected algorithms of multivariate comparative analysis with C++ language], Wrocław: Wydawnictwo Akademii Ekonomicznej we Wrocławiu, ISBN: 8370114016.

BORG, I., GROENEN, P. J. F., (2005). Modern multidimensional scaling, New York: Springer. ISBN: 978-0387-25150-9, http://dx.doi.org/10.1007/0-387-28981-X.

BORYS, T., (1984). Kategoria jakości w statystycznej analizie porównawczej [Category of quality in statistical comparative analysis], Prace Naukowe Akademii Ekonomicznej we Wrocławiu No. 284, Series: Monografie i opracowania No. 23. ISBN: 83-7011-000-0.

EVERITT, B. S., LANDAU, S., LEESE, M., STAHL, D., (2011). Cluster analysis, Chichester: Wiley, ISBN: 978-0-470-74991-3.

GENZ, A., AZZALINI, A., (2016). mnormt: The Multivariate Normal and t Distributions. R package, version 1.5-5, https://CRAN.R-project.org/package=mnormt.

GRABIŃSKI, T., (1984). Wielowymiarowa analiza porównawcza w badaniach dynamiki zjawisk ekonomicznych [Multivariate comparative analysis in research over the dynamics of economic phenomena], Zeszyty Naukowe Akademii Ekonomicznej w Krakowie, Special series: Monografie No. 61, ISSN: 0209-1674.

GRABIŃSKI, T., (1992). Metody taksonometrii [Taxonometric methods], Kraków: Wydawnictwo Akademii Ekonomicznej w Krakowie.

GRABIŃSKI, T., WYDYMUS, S., ZELIAŚ, A., (1989). Metody taksonomii numerycznej w modelowaniu zjawisk społeczno-gospodarczych [Numerical taxonomy methods in modeling socioeconomic phenomena], Warszawa: PWN, ISBN: 83-208-0042-0.

GRYSZEL, P., WALESIAK, M., (2018). The application of selected multivariate statistical methods for the evaluation of tourism competitiveness of the Sudety communes, Argumenta Oeconomica, No. 1 (40), pp. 147–166, https://doi.org/10.15611/aoe.2018.1.06.

HELLWIG, Z., (1968). Zastosowanie metody taksonomicznej do typologicznego podziału krajów ze względu na poziom ich rozwoju i strukturę wykwalifikowanych kadr [Procedure of evaluating high level manpower data and typology of countries by means of the taxonomic method], Przegląd Statystyczny, Tom 15, z. 4, pp. 307–327.

HELLWIG, Z., (1972). Procedure of Evaluating High-Level Manpower Data and Typology of Countries by Means of the Taxonomic Method, In: Gostkowski, Z. (ed.), Towards a system of Human Resources Indicators for Less Developed Countries, Papers Prepared for UNESCO Research Project, Ossolineum, The Polish Academy of Sciences Press, Wrocław, pp. 115–134.

HELLWIG, Z., (1976). Przechodniość relacji skorelowania zmiennych losowych i płynące stąd wnioski ekonometryczne [Transitivity of correlation and some econometric implications], Przegląd Statystyczny, Tom 23, z. 1, pp. 3–20.

HELLWIG, Z., (1981). Wielowymiarowa analiza porównawcza i jej zastosowanie w badaniach wielocechowych obiektów gospodarczych [Multivariate comparative analysis and applications in research of multifeature economic objects], In: Welfe, W. (ed.), Metody i modele ekonomiczno-matematyczne w doskonaleniu zarządzania gospodarką socjalistyczną [Economic and mathematical methods and models in the improvement of socialist economy management], Warszawa: PWE, pp. 46–68, ISBN: 83-208-0042-0.

HUBERT, L., ARABIE, P., (1985). Comparing partitions, Journal of Classification, Vol. 2, No. 1, pp. 193–218.

HWANG, C. L., YOON, K., (1981). Multiple attribute decision making – methods and applications. A state-of-the-art survey, New York: Springer-Verlag, ISBN: 978-3-540-10558-9, http://dx.doi.org/10.1007/978-3-642-48318-9.

JAJUGA, K., WALESIAK, M., (2000). Standardisation of Data Set under Different Measurement Scales, In: Decker, R., Gaul, W., (Eds.), Classification and Information Processing at the Turn of the Millennium, pp. 105–112, Springer Verlag, Berlin, Heidelberg, http://dx.doi.org/10.1007/978-3-642-57280-7_11.

JAJUGA, K., WALESIAK, M., BĄK, A., (2003). On the General Distance Measure, In: Schwaiger, M., Opitz, O., (Eds.), Exploratory Data Analysis in Empirical Research, Berlin, Heidelberg: Springer-Verlag, pp. 104–109, https://dx.doi.org/10.1007/978-3-642-55721-7_12.

KENDALL, M. G., (1955). Rank correlation methods, London: Griffin.

KENDALL, M. G., BUCKLAND, W. R., (1986). Słownik terminów statystycznych [A dictionary of statistical terms], Warszawa: PWE, ISBN: 83-208-0504-X.

MILLIGAN, G. W., COOPER, M. C., (1988). A study of standardization of variables in cluster analysis, Journal of Classification, Vol. 5, No. 2, pp. 181–204.

NARDO, M., SAISANA, M., SALTELLI, A., TARANTOLA, S., HOFFMANN, A., GIOVANNINI, E., (2008). Handbook on Constructing Composite Indicators. Methodology and User Guide, Paris: OECD Publishing, ISBN: 978-92-64-04345-9.

PAWEŁEK, B., (2008). Metody normalizacji zmiennych w badaniach porównawczych złożonych zjawisk ekonomicznych [Normalization of variables methods in comparative research on complex economic phenomena], Kraków: Wydawnictwo Uniwersytetu Ekonomicznego w Krakowie, ISBN: 978-83-7252-398-3.

R CORE TEAM, (2018). R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria, https://cran.r-project.org.

STEVENS, S. S., (1946). On the theory of scales of measurement, Science, Vol. 103, No. 2684, pp. 677–680.

WALESIAK, M., (1995). The analysis of factors influencing the choice of the methods in the statistical analysis of marketing data, Statistics in Transition, June, Vol. 2, No. 2, pp. 185–194.

WALESIAK, M., (2002). Uogólniona miara odległości w statystycznej analizie wielowymiarowej [The Generalized distance measure in multivariate statistical analysis], Wrocław: Wydawnictwo Akademii Ekonomicznej we Wrocławiu, ISBN: 83-7011-583-7.

WALESIAK, M., (2011). Uogólniona miara odległości GDM w statystycznej analizie wielowymiarowej z wykorzystaniem programu R [The Generalized distance measure GDM in multivariate statistical analysis with R], Wrocław: Wydawnictwo Uniwersytetu Ekonomicznego we Wrocławiu, ISBN: 978-83-7695-132-4.

WALESIAK, M., (2014a). Wzmacnianie skali pomiaru w statystycznej analizie wielowymiarowej [Reinforcing measurement scale for ordinal data in multivariate statistical analysis], Taksonomia 22, Prace Naukowe Uniwersytetu Ekonomicznego we Wrocławiu, No. 327, pp. 60–68.

WALESIAK, M., (2014b). Przegląd formuł normalizacji wartości zmiennych oraz ich własności w statystycznej analizie wielowymiarowej [Data normalization in multivariate data analysis. An overview and properties], Przegląd Statystyczny, Tom 61, z. 4, pp. 363–372.

WALESIAK, M., DUDEK, A., (2018). clusterSim: Searching for Optimal Clustering Procedure for a Data Set. R package, version 0.47-2, http://CRAN.R-project.org/package=clusterSim.

ZELIAŚ, A., (2002). Some notes on the selection of normalization of diagnostic variables, Statistics in Transition, Vol. 5, No. 5, pp. 787–802.