In recent years, machine learning has emerged as a powerful tool for credit scoring, producing high-quality results compared with traditional statistical methods. However, the literature shows that statistical methods remain in use because they still perform well and are interpretable, whereas neural network models are considered black boxes. This study compares the predictive power of logistic regression and multilayer perceptron algorithms on two credit-risk datasets and applies the Local Interpretable Model-Agnostic Explanations (LIME) technique to explain their predictions. Our results show that the multilayer perceptron outperforms logistic regression in terms of balanced accuracy, Matthews correlation coefficient, and F1 score. Our findings from LIME indicate that models built on imbalanced datasets produce predictions biased towards the majority class. Model developers in finance could consider explanation methods such as LIME to extend the use of deep learning models and to support well-informed decisions.
credit score, logistic regression, multilayer perceptron, explainability, LIME
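The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the study's code: the labels and both sets of predictions below are hypothetical, chosen only to show why balanced accuracy, the Matthews correlation coefficient, and the F1 score are more informative than plain accuracy when one class dominates.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def accuracy(tp, tn, fp, fn):
    return safe_div(tp + tn, tp + tn + fp + fn)

def balanced_accuracy(tp, tn, fp, fn):
    # Mean of recall on the positive class and recall on the negative class.
    return 0.5 * (safe_div(tp, tp + fn) + safe_div(tn, tn + fp))

def f1_score(tp, tn, fp, fn):
    # Harmonic mean of precision and recall; true negatives are not used.
    return safe_div(2 * tp, 2 * tp + fp + fn)

def mcc(tp, tn, fp, fn):
    # Matthews correlation coefficient; defined as 0.0 here when undefined.
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return safe_div(tp * tn - fp * fn, den)

# Hypothetical imbalanced sample: 90 non-defaults (0) and 10 defaults (1).
y_true = [0] * 90 + [1] * 10

# Hypothetical model A always predicts the majority class.
pred_majority = [0] * 100
# Hypothetical model B raises 5 false alarms and catches 6 of the 10 defaults.
pred_mixed = [0] * 85 + [1] * 5 + [1] * 6 + [0] * 4

results = {}
for name, y_pred in [("majority", pred_majority), ("mixed", pred_mixed)]:
    counts = confusion_counts(y_true, y_pred)
    results[name] = {
        "accuracy": accuracy(*counts),
        "balanced_accuracy": balanced_accuracy(*counts),
        "f1": f1_score(*counts),
        "mcc": mcc(*counts),
    }
    print(name, {k: round(v, 3) for k, v in results[name].items()})
```

In this toy sample the two hypothetical models have almost the same plain accuracy, yet the majority-only model scores 0.5 on balanced accuracy and 0 on both F1 and MCC, which is the kind of majority-class bias the abstract attributes to imbalanced data.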