A Discretized Enriched Technique to Enhance Machine Learning Performance in Credit Scoring

Roberto Saia, Salvatore Carta, Diego Reforgiato Recupero, Gianni Fenu, Marco Saia

2019

Abstract

Automated credit scoring tools play a crucial role in many financial environments, since they can evaluate a user (e.g., a loan applicant) in real time on the basis of several solvency criteria, without the aid of human operators. Such automation allows those who work and offer services in the financial sector to make quick decisions about different services, first and foremost those concerning consumer credit, requests for which have increased exponentially in recent years. To address some well-known problems of state-of-the-art credit scoring approaches, this paper formalizes a novel data model that we call Discretized Enriched Data (DED), which transforms the original feature space in order to improve the performance of credit scoring machine learning algorithms. The proposed DED model revolves around two processes: the first reduces the number of feature patterns through data discretization, and the second enriches the discretized data by adding several meta-features. The data discretization addresses the problem of heterogeneity that characterizes this domain, whereas the data enrichment compensates for the resulting loss of information by adding meta-features that improve the data characterization. Our model has been evaluated on real-world datasets of different sizes and levels of class imbalance, which are considered benchmarks in the credit scoring literature. The results indicate that it improves the performance of one of the best-performing machine learning algorithms widely used in this field, opening up new perspectives for the definition of more effective credit scoring solutions.
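The two-step idea described in the abstract (discretize, then enrich) can be illustrated with a minimal Python sketch. The abstract does not state which discretization scheme or which meta-features the authors use, so everything specific here is an assumption made for illustration: equal-width binning into `n_bins` integer levels, and four per-row summary statistics (minimum, maximum, mean, population standard deviation) as meta-features. The function names `discretize`, `enrich`, and `ded_transform` are hypothetical and do not come from the paper.

```python
from statistics import mean, pstdev


def discretize(rows, n_bins):
    """Map every feature value to an integer level in 0..n_bins-1.

    Assumption: equal-width binning per feature column. The paper's
    actual discretization procedure is not given in the abstract.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    # A constant column has zero range; use 1.0 so it maps to level 0.
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        [
            # The column maximum would land on n_bins, so clamp it.
            min(int((value - low) / span * n_bins), n_bins - 1)
            for value, low, span in zip(row, lows, spans)
        ]
        for row in rows
    ]


def enrich(rows):
    """Append per-row summary statistics as meta-features.

    Assumption: min, max, mean and population standard deviation are
    illustrative choices, not necessarily the paper's meta-features.
    """
    return [row + [min(row), max(row), mean(row), pstdev(row)] for row in rows]


def ded_transform(rows, n_bins=10):
    """Discretize the feature space, then enrich it with meta-features."""
    return enrich(discretize(rows, n_bins))
```

Calling `ded_transform([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]], n_bins=5)` returns one row per input sample, with the two discretized features followed by the four meta-features. The transformed rows could then be passed to any classifier in place of the raw features.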

Paper Citation


in Harvard Style

Saia R., Carta S., Recupero D., Fenu G. and Saia M. (2019). A Discretized Enriched Technique to Enhance Machine Learning Performance in Credit Scoring. In Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2019) - Volume 1: KDIR; ISBN 978-989-758-382-7, SciTePress, pages 202-213. DOI: 10.5220/0008377702020213


in Bibtex Style

@conference{kdir19,
author={Roberto Saia and Salvatore Carta and Diego Reforgiato Recupero and Gianni Fenu and Marco Saia},
title={A Discretized Enriched Technique to Enhance Machine Learning Performance in Credit Scoring},
booktitle={Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2019) - Volume 1: KDIR},
year={2019},
pages={202-213},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008377702020213},
isbn={978-989-758-382-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2019) - Volume 1: KDIR
TI - A Discretized Enriched Technique to Enhance Machine Learning Performance in Credit Scoring
SN - 978-989-758-382-7
AU - Saia R.
AU - Carta S.
AU - Recupero D.
AU - Fenu G.
AU - Saia M.
PY - 2019
SP - 202
EP - 213
DO - 10.5220/0008377702020213
PB - SciTePress