Comparing Supervised Classification Methods for Financial Domain Problems

Victor Pugliese, Celso Hirata, Renato Costa

2020

Abstract

Classification is key to the success of the financial business. It is used to analyze risk, the occurrence of fraud, and credit-granting problems. Supervised classification methods support these analyses by 'learning' patterns in data to predict an associated class. The most common methods include Naive Bayes, Logistic Regression, K-Nearest Neighbors, Decision Tree, Random Forest, Gradient Boosting, XGBoost, and Multilayer Perceptron. We conduct a comparative study to identify which methods perform best on problems of analyzing risk, the occurrence of fraud, and credit granting. Our motivation is to determine whether any method systematically outperforms the others on these problems. We also consider applying Optuna, a next-generation hyperparameter optimization framework, to the methods to achieve better results. We applied the non-parametric Friedman test to test our hypotheses and the Nemenyi test as a post hoc test to validate the results obtained on five datasets in the finance domain. We adopted the F1 score and AUROC as performance metrics. Applying Optuna yielded better results in most of the evaluations, and XGBoost was the best method. We conclude that XGBoost is the recommended machine learning classification method to beat when proposing new methods for problems of analyzing risk, fraud, and credit.
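The statistical comparison mentioned in the abstract (a Friedman test over several methods and datasets, followed by a Nemenyi post hoc test) can be sketched as below. This is a minimal illustration using the textbook rank-based formulas, not the authors' code. The function names, the three-method layout, and the score values are hypothetical placeholders and are not results from the paper. The Nemenyi critical value `q_alpha` is assumed to be looked up by the caller from a published table, and no tie correction is applied to the statistic.

```python
from math import sqrt


def average_ranks(scores):
    """scores[i][j] is the score of method j on dataset i (higher is better).

    Returns the mean rank of each method across datasets (rank 1 = best).
    Tied scores within a dataset share the average of their ranks.
    """
    n, k = len(scores), len(scores[0])
    totals = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: -row[j])
        pos = 0
        while pos < k:
            end = pos
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            tied_rank = (pos + end) / 2 + 1  # mean of 1-based ranks pos+1..end+1
            for j in order[pos:end + 1]:
                totals[j] += tied_rank
            pos = end + 1
    return [t / n for t in totals]


def friedman_statistic(scores):
    """Friedman chi-square statistic computed from the average ranks."""
    n, k = len(scores), len(scores[0])
    ranks = average_ranks(scores)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in ranks) - k * (k + 1) ** 2 / 4)


def nemenyi_critical_difference(q_alpha, k, n):
    """Two methods differ significantly if their average ranks differ by more than this."""
    return q_alpha * sqrt(k * (k + 1) / (6 * n))


# Made-up placeholder scores: 5 datasets (rows) x 3 hypothetical methods (columns).
placeholder_scores = [
    [0.70, 0.80, 0.90],
    [0.65, 0.75, 0.85],
    [0.60, 0.72, 0.88],
    [0.71, 0.79, 0.91],
    [0.68, 0.77, 0.86],
]

ranks = average_ranks(placeholder_scores)
statistic = friedman_statistic(placeholder_scores)
```

With these placeholder values the third column ranks first on every dataset, so the average ranks are fully separated and the statistic reaches its maximum for five datasets and three methods. Whether a given pair of methods differs significantly then depends on comparing their rank gap with `nemenyi_critical_difference` for the chosen significance level.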

Paper Citation


in Harvard Style

Pugliese V., Hirata C. and Costa R. (2020). Comparing Supervised Classification Methods for Financial Domain Problems. In Proceedings of the 22nd International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-423-7, pages 440-451. DOI: 10.5220/0009426204400451


in Bibtex Style

@conference{iceis20,
author={Victor Pugliese and Celso Hirata and Renato Costa},
title={Comparing Supervised Classification Methods for Financial Domain Problems},
booktitle={Proceedings of the 22nd International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2020},
pages={440-451},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009426204400451},
isbn={978-989-758-423-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 22nd International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Comparing Supervised Classification Methods for Financial Domain Problems
SN - 978-989-758-423-7
AU - Pugliese V.
AU - Hirata C.
AU - Costa R.
PY - 2020
SP - 440
EP - 451
DO - 10.5220/0009426204400451