Paper: Variable Importance Analysis in Default Prediction using Machine Learning Techniques

Authors: Başak Gültekin and Betül Erdoğdu Şakar

Affiliation: Faculty of Engineering and Natural Sciences, Bahçeşehir University, Beşiktaş, Turkey

ISBN: 978-989-758-318-6

Keyword(s): Credit Scoring, Default Prediction, Feature Selection, Classification, Boruta, Logistic Regression, Random Forest, Artificial Neural Network.

Related Ontology Subjects/Areas/Topics: Applications ; Artificial Intelligence ; Biomedical Engineering ; Biomedical Signal Processing ; Business Analytics ; Business Intelligence ; Cardiovascular Technologies ; Computing and Telecommunications in Cardiology ; Data Analytics ; Data Engineering ; Data Manipulation ; Data Mining ; Databases and Information Systems Integration ; Datamining ; Decision Support Systems ; Decision Support Systems, Remote Data Analysis ; Enterprise Information Systems ; Health Engineering and Technology Applications ; Health Information Systems ; Human-Computer Interaction ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Methodologies and Methods ; Neurocomputing ; Neurotechnology, Electronics and Informatics ; Pattern Recognition ; Physiological Computing Systems ; Sensor Networks ; Signal Processing ; Soft Computing ; Software Engineering ; Statistics Exploratory Data Analysis ; Symbolic Systems

Abstract: In this study, different data mining techniques were applied to a real credit data set from a public bank to provide automated and objective credit scoring. A two-step methodology was used: determining the variables to be included in the model, and deciding on the model to classify a potential credit application as “bad credit (default)” or “good credit (not default)”. The phrases “bad credit” and “good credit” are used as class labels because this is how they are used in banking jargon in Turkey. For this two-step procedure, variable selection algorithms such as Random Forest and Boruta, and machine learning algorithms such as Logistic Regression, Random Forest, and Artificial Neural Network, were tried. At the end of the feature selection phase, the CRA_Score and III_Score variables were determined to be the most important variables. Occupation and bank product number were also found to be predictor variables. In the classification phase, the Neural Network model was the best model, with higher accuracy and a low average square error, and the Random Forest model performed better than the Logistic Regression model.
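The abstract compares classifiers by accuracy and average square error. As a rough illustration only, the sketch below computes both metrics in plain Python. It assumes that average square error means the mean squared difference between the 0/1 class label and the predicted default probability; the paper's exact definition is not given on this page. The function names, the 0.5 threshold, the labels, and the two sets of predicted probabilities are all hypothetical and are not taken from the study.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], p_default: Sequence[float],
             threshold: float = 0.5) -> float:
    """Share of applications whose thresholded prediction matches the label."""
    predictions = [1 if p >= threshold else 0 for p in p_default]
    hits = sum(1 for y, y_hat in zip(y_true, predictions) if y == y_hat)
    return hits / len(y_true)


def average_squared_error(y_true: Sequence[int],
                          p_default: Sequence[float]) -> float:
    """Mean of (label - predicted probability)^2 (assumed definition)."""
    return sum((y - p) ** 2 for y, p in zip(y_true, p_default)) / len(y_true)


# Hypothetical labels: 1 = "bad credit (default)", 0 = "good credit (not default)".
labels = [1, 0, 0, 1, 0]

# Hypothetical predicted default probabilities from two made-up models.
model_a = [0.9, 0.2, 0.1, 0.7, 0.4]
model_b = [0.6, 0.4, 0.6, 0.4, 0.3]

for name, probs in (("model_a", model_a), ("model_b", model_b)):
    print(name, accuracy(labels, probs), average_squared_error(labels, probs))
```

Under these assumptions, a model is preferred when its accuracy is higher and its average squared error is lower, which is the sense in which the abstract ranks the Neural Network above the other two models.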

CC BY-NC-ND 4.0


Paper citation in several formats:
Gültekin, B. and Erdoğdu Şakar, B. (2018). Variable Importance Analysis in Default Prediction using Machine Learning Techniques. In Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-318-6, pages 56-62. DOI: 10.5220/0006872400560062

@conference{data18,
author={Başak Gültekin and Betül Erdoğdu Şakar},
title={Variable Importance Analysis in Default Prediction using Machine Learning Techniques},
booktitle={Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2018},
pages={56-62},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006872400560062},
isbn={978-989-758-318-6},
}

TY - CONF

JO - Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Variable Importance Analysis in Default Prediction using Machine Learning Techniques
SN - 978-989-758-318-6
AU - Gültekin, B.
AU - Erdoğdu Şakar, B.
PY - 2018
SP - 56
EP - 62
DO - 10.5220/0006872400560062
