Combining Data Envelopment Analysis with Machine Learning Algorithms for Predictions

Date
2020-09
Publisher
KNUST
Abstract
Compared with other methods, Data Envelopment Analysis (DEA) is an improved method for organizing and analyzing data. However, DEA alone is poorly suited to predicting the efficiency and performance of other or new Decision Making Units (DMUs). The main objective of this study is to build high-accuracy predictive models of bank efficiency by combining DEA with Machine Learning algorithms. The study built four Machine Learning models, namely DEA-DT, DEA-RF, DEA-NN and DEA-LR, to predict the efficiencies of banks. The sample covered 33% of all bank branches in Ghana, largely in the nine regions. A two-stage DEA was used to determine the efficiencies of all bank branches, and the branches were then grouped using a proposed algorithm, the Bank Classification Algorithm (BC Algorithm). In building the predictive models, 70% of the bank dataset was used to train and validate the models. The developed models were then used to predict the efficiencies of the remaining 30% of banks. 10-fold cross-validation was applied to check the performance of every predictive model on each case dataset. All experiments were executed within a simulation environment and conducted in RStudio using the R programming language. Standard Machine Learning evaluation metrics were used to compare the models. The results suggested very good performance from all the machine learning models proposed by the study. However, a comparison among them clearly indicated much better performance by DEA-RF for predicting banks’ efficiency in collecting deposits and by DEA-DT for predicting banks’ efficiency in investing deposits. This study has demonstrated that combining the two kinds of model improves on the performance, prediction and classification accuracies reported by previous studies. In conclusion, the study proposes the BC Algorithm for classifying banks based on their efficiencies in the deposit stage and the investment stage.
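The data-partitioning protocol described in the abstract (a 70/30 train/hold-out split, with 10-fold cross-validation on the training portion) can be sketched roughly as follows. This is an illustrative Python sketch only, not the thesis code, which was written in R. The helper names `train_test_split` and `k_fold`, the fixed seed, and the branch identifiers are all hypothetical, and the abstract does not say whether the split was random or stratified.

```python
import random


def train_test_split(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of items and split it into train and hold-out parts."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def k_fold(items, k=10):
    """Yield (train, validation) pairs; each item is validated exactly once."""
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, validation


# Hypothetical branch identifiers; the thesis dataset is not reproduced here.
branches = [f"branch_{i:03d}" for i in range(200)]

# 70% for training and validation, 30% held out for final prediction.
train_set, holdout_set = train_test_split(branches, train_fraction=0.7)

# 10-fold cross-validation over the training portion only.
splits = list(k_fold(train_set, k=10))
```

In a full pipeline, each candidate model would be fitted on the `train` part of every split and scored on the matching `validation` part, and the chosen model would then be evaluated once on `holdout_set`.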
Description
A doctoral thesis submitted to the College of Science in partial fulfillment of the requirements for the award of a degree of Doctor of Philosophy in Computer Science.