A Region-based Training Data Segmentation Strategy to Credit Scoring

Roberto Saia, Salvatore Carta, Gianni Fenu, Livio Pompianu

2022

Abstract

Rating users who request financial services is a growing task, especially during the COVID-19 pandemic, a period characterized by a dramatic increase in online activities, mainly related to e-commerce. This assessment was performed manually in the past; today, given the enormous number of requests to process, it must be carried out by automatic credit scoring systems. Such systems therefore play a crucial role for financial operators, as their effectiveness directly affects monetary gains and losses. Despite the huge investments of financial and human resources devoted to their development, state-of-the-art solutions are all affected by some well-known problems that make developing credit scoring systems challenging: chiefly the imbalance and heterogeneity of the data involved, compounded by the scarcity of public datasets. The Region-based Training Data Segmentation (RTDS) strategy proposed in this work follows a divide-and-conquer approach in which the classification of a user depends on the results of several sub-classifications. In more detail, the training data is divided into regions that bound different users and features; these regions are used to train several classification models, whose outputs are combined into the final classification through a majority voting rule. The strategy relies on the consideration that analyzing different users and features independently can lead to a more accurate classification than that offered by a single evaluation model trained on the entire dataset. A validation process carried out on three public real-world datasets, which differ in number of features, number of samples, and degree of data imbalance, demonstrates the effectiveness of the proposed strategy, which outperforms canonical training on all the datasets.
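The divide-and-conquer idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how regions are drawn or which classifier is trained on each, so everything specific here is an assumption made for illustration: regions are taken as a grid of contiguous row blocks (users) crossed with contiguous column blocks (features), the per-region model is a toy nearest-centroid classifier, and the names `split_ranges`, `CentroidModel`, `train_regions`, and `predict_majority` are hypothetical. Only the overall shape (one model per region, final label by majority vote) comes from the text.

```python
from collections import Counter
from statistics import mean


def split_ranges(n, parts):
    """Split range(n) into `parts` contiguous, near-equal index lists."""
    bounds = [i * n // parts for i in range(parts + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(parts)]


class CentroidModel:
    """Toy nearest-centroid classifier that only looks at the columns in `cols`.

    Assumption: stands in for whatever classifier is trained on each region.
    """

    def __init__(self, cols):
        self.cols = cols
        self.centroids = {}

    def fit(self, X, y):
        # One centroid per class, computed over this region's rows and columns.
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(r[c] for r in rows) for c in self.cols]
        return self

    def predict(self, x):
        def dist(label):
            centroid = self.centroids[label]
            return sum((x[c] - m) ** 2 for c, m in zip(self.cols, centroid))

        # Sorting the labels makes tie-breaking deterministic.
        return min(sorted(self.centroids), key=dist)


def train_regions(X, y, row_parts, col_parts):
    """Train one model per (row block, column block) region of the training data."""
    models = []
    for rows in split_ranges(len(X), row_parts):
        Xr = [X[i] for i in rows]
        yr = [y[i] for i in rows]
        for cols in split_ranges(len(X[0]), col_parts):
            models.append(CentroidModel(cols).fit(Xr, yr))
    return models


def predict_majority(models, x):
    """Final label: the class predicted by the largest number of region models."""
    votes = Counter(m.predict(x) for m in models)
    return votes.most_common(1)[0][0]


# Made-up toy data: 6 users, 3 features, labels 0 (reliable) and 1 (unreliable).
X = [[1.0, 1.0, 1.0], [9.0, 9.0, 9.0], [2.0, 1.0, 2.0],
     [8.0, 9.0, 8.0], [1.0, 2.0, 1.0], [9.0, 8.0, 9.0]]
y = [0, 1, 0, 1, 0, 1]

models = train_regions(X, y, row_parts=3, col_parts=3)  # 3 x 3 = 9 regions
label = predict_majority(models, [1.5, 1.5, 1.5])
```

Using an odd number of regions, as in this example, avoids tied votes in a two-class setting; with an even number, a tie-breaking rule would have to be chosen explicitly.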

Paper Citation


in Harvard Style

Saia R., Carta S., Fenu G. and Pompianu L. (2022). A Region-based Training Data Segmentation Strategy to Credit Scoring. In Proceedings of the 19th International Conference on Security and Cryptography - Volume 1: SECRYPT, ISBN 978-989-758-590-6, pages 275-282. DOI: 10.5220/0011137400003283


in Bibtex Style

@conference{secrypt22,
author={Roberto Saia and Salvatore Carta and Gianni Fenu and Livio Pompianu},
title={A Region-based Training Data Segmentation Strategy to Credit Scoring},
booktitle={Proceedings of the 19th International Conference on Security and Cryptography - Volume 1: SECRYPT},
year={2022},
pages={275-282},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011137400003283},
isbn={978-989-758-590-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 19th International Conference on Security and Cryptography - Volume 1: SECRYPT
TI - A Region-based Training Data Segmentation Strategy to Credit Scoring
SN - 978-989-758-590-6
AU - Saia R.
AU - Carta S.
AU - Fenu G.
AU - Pompianu L.
PY - 2022
SP - 275
EP - 282
DO - 10.5220/0011137400003283