This article proposes the application of regression trees for analysing income polarization. Using an approach to polarization based on the analysis of variance, we show that regression trees can uncover groups of homogeneous income receivers in a data-driven way. The regression tree can handle nonlinear relationships between income and the characteristics of income receivers, and it can detect which characteristics, and which of their interactions, actually play a role in explaining income polarization. These features make the regression tree a flexible statistical tool for exploring whether income receivers concentrate around local poles. An application to Italian individual income data illustrates the resulting partition of income receivers.
Keywords: polarization, regression trees, recursive partitioning, ANOVA.
JEL classification: D31, D63, C14.
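The idea summarised in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the toy data (age, years of education, income) are invented and do not come from the Italian survey, and the function names `sse`, `best_split`, `grow` and `r_squared` are hypothetical. It assumes that the ANOVA-based reading of polarization amounts to the between-group share of income variance (the R² of the partition), with groups found by recursively choosing the binary split that most reduces the within-group sum of squares.

```python
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, incomes, min_leaf):
    """Return (feature index, threshold, within-group SSE) of the best split, or None."""
    best = None
    for j in range(len(rows[0])):
        # Candidate thresholds: every observed value except the largest.
        for t in sorted({r[j] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, incomes) if r[j] <= t]
            right = [y for r, y in zip(rows, incomes) if r[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            within = sse(left) + sse(right)
            if best is None or within < best[2]:
                best = (j, t, within)
    return best

def grow(rows, incomes, min_leaf=2, depth=2):
    """Recursively partition the sample; return the leaves as lists of incomes."""
    split = best_split(rows, incomes, min_leaf) if depth > 0 else None
    # Stop when no admissible split exists or the split does not reduce the SSE.
    if split is None or split[2] >= sse(incomes):
        return [incomes]
    j, t, _ = split
    left = [(r, y) for r, y in zip(rows, incomes) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, incomes) if r[j] > t]
    leaves = []
    for part in (left, right):
        leaves += grow([r for r, _ in part], [y for _, y in part], min_leaf, depth - 1)
    return leaves

def r_squared(leaves):
    """Between-group share of total income variation for a given partition."""
    total = sse([y for leaf in leaves for y in leaf])
    within = sum(sse(leaf) for leaf in leaves)
    return 1.0 - within / total

# Invented toy data: (age, years of education) and income for eight receivers.
rows = [(25, 8), (30, 8), (45, 8), (50, 8), (28, 16), (33, 16), (48, 16), (52, 16)]
incomes = [10, 12, 10, 12, 20, 21, 40, 41]

leaves = grow(rows, incomes)
r2 = r_squared(leaves)
print(leaves)
print(round(r2, 4))
```

In this toy sample, age separates incomes only among the more educated receivers, so the tree leaves the less educated group unsplit. That is the kind of interaction between characteristics that a recursive partition can pick up without being specified in advance.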