journal article Open Access Jan 04, 2024

Machine Learning for an Enhanced Credit Risk Analysis: A Comparative Study of Loan Approval Prediction Models Integrating Mental Health Data

DOI: 10.3390/make6010004
Abstract
The number of loan requests is growing rapidly worldwide, and credit approval has become a multi-billion-dollar business. Large volumes of data extracted from banking transactions, which reflect customers’ behavior, are available, but processing loan applications remains a complex and time-consuming task for banking institutions. In 2022, over 20 million Americans had open loans, totaling USD 178 billion in debt, although over 20% of loan applications were rejected. Numerous statistical methods have been deployed to estimate loan risk, which raises the question of whether machine learning techniques can predict potential risks more accurately. To study the machine learning paradigm in this sector, a mental health dataset and a loan approval dataset presenting survey results from 1991 individuals are used as inputs to test the credit risk prediction ability of the chosen machine learning algorithms. Through a comprehensive comparative analysis, this paper shows how the chosen algorithms can distinguish between normal loan customers and risky ones who might never repay their debts. The results show that XGBoost achieves the highest accuracy on the first dataset, 84%, surpassing gradient boosting (83%) and KNN (83%). On the second dataset, random forest achieved the highest accuracy, 85%, followed by decision tree and KNN at 83%. Alongside accuracy, the precision, recall, and overall performance of the algorithms were tested, and a confusion matrix analysis was performed; the numerical results emphasized the superior performance of XGBoost and random forest in the classification tasks on the first dataset, and of XGBoost and decision tree on the second dataset. Researchers and practitioners can rely on these findings to inform their model selection process and to enhance the accuracy and precision of their classification models.
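The evaluation metrics named in the abstract (accuracy, precision, recall, and the confusion matrix) can be illustrated with a minimal sketch. This is not the authors' code: the function names `confusion_counts` and `summarize`, the label encoding (1 = risky customer, 0 = normal customer), and the ten example labels are all hypothetical and chosen only to show how the four quantities relate.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def summarize(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and a 2x2 confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        # Share of all predictions that were correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted-risky customers who were truly risky.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of truly risky customers the model caught.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Rows are true classes (normal, risky); columns are predictions.
        "confusion_matrix": [[tn, fp], [fn, tp]],
    }


# Hypothetical labels (assumption): 1 = risky customer, 0 = normal customer.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

report = summarize(y_true, y_pred)
print(report)
```

Running the same `summarize` function on each model's held-out predictions puts the models on a common footing. Accuracy alone can hide a model that rarely flags risky customers, which is why precision and recall are reported next to it.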