(LR) follows closely with an F1-score of 95.11%.
Finally, comparing F1-scores summarizes each model's
overall classification performance: because the
F1-score combines precision and recall, a model
scores highly only when both precision and recall
are high.
5 CONCLUSIONS AND FUTURE SCOPE
Advances in technology and the growing adoption of
electronic financial transactions have significantly
increased the risk of fraudulent activity, in part
because verification processes have been simplified.
In this study,
we analyzed a dataset comprising 284,807 credit card
transactions from European users. For fraud
detection, the dataset was split into 80% training and
20% testing data to build and evaluate models.
Preprocessing steps included Z-score normalization
for standardization, one-hot encoding for categorical
variables, and handling missing values through
appropriate techniques.
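The split and standardization steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the pipeline used in the study: the helper names, the toy records, and the `channel` column are hypothetical, and missing-value handling is left out because the specific technique is not stated. One assumption made here is that the mean and standard deviation are estimated on the training portion only and then reused for the test portion.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_zscore(values):
    """Estimate (mean, std) from the given values; guard against zero spread."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        std = 1.0
    return mean, std


def apply_zscore(values, mean, std):
    """Standardize values with a previously fitted mean and std."""
    return [(v - mean) / std for v in values]


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over the known categories."""
    return [1 if value == c else 0 for c in categories]


# Hypothetical toy records: (amount, channel, is_fraud). Not the study's data.
rows = [(float(i), "online" if i % 2 else "pos", int(i % 10 == 0))
        for i in range(1, 101)]

train, test = train_test_split(rows, test_fraction=0.2)

# Fit the scaler on the training amounts only, then reuse it for the test set.
mean, std = fit_zscore([r[0] for r in train])
train_amounts = apply_zscore([r[0] for r in train], mean, std)
test_amounts = apply_zscore([r[0] for r in test], mean, std)

# Category list is also taken from the training data.
channels = sorted({r[1] for r in train})
train_channels = [one_hot(r[1], channels) for r in train]
```

Fitting the scaler on the training portion alone keeps information from the test transactions out of the preprocessing step, so the test metrics are a fairer estimate of performance on unseen data.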
To assess the performance of various machine
learning models, we utilized key evaluation metrics:
accuracy, precision, recall, F1-score, and the confusion
matrix. Among the models tested, the XGBoost
Classifier demonstrated superior performance.
Although the XGBoost model reached an accuracy of
99%, accuracy on its own is dominated by the majority
class. Precision and recall were therefore critical
for assessing misclassification of the minority
class (fraudulent transactions).
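To illustrate why accuracy alone can mislead on imbalanced data, the sketch below computes the four metrics from confusion-matrix counts. The counts are hypothetical and chosen only for illustration; they are not results from this study, and the function name is likewise an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    The positive class is the fraudulent class. Ratios with a zero
    denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical example: 10,000 transactions, 20 of them fraudulent.
# The model catches half of the fraud cases and raises 5 false alarms.
metrics = classification_metrics(tp=10, fp=5, fn=10, tn=9975)

# Hypothetical baseline that labels every transaction as genuine.
baseline = classification_metrics(tp=0, fp=0, fn=20, tn=9980)

for name, result in (("model", metrics), ("all-genuine baseline", baseline)):
    print(name, {k: round(v, 4) for k, v in result.items()})
```

In this toy case both classifiers exceed 99% accuracy, yet the baseline detects no fraud at all (recall of 0) and the model misses half of the fraudulent transactions, which only recall and the F1-score make visible.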
The confusion matrix highlighted the model's
ability to correctly classify the majority of genuine
transactions while maintaining a reasonable balance
in detecting fraudulent cases. However, the results
underscore the importance of selecting the most
appropriate evaluation criterion, such as recall or
F1-score, to ensure effective fraud detection,
especially in imbalanced datasets like this one.
One limitation of this study is that the dataset was
collected over only two trading days, which may not
fully capture long-term trends or variations in
fraudulent behavior. Future research could address
this by incorporating a more extensive and diverse
collection of fraudulent transactions and exploring
advanced deep-learning algorithms to enhance fraud
detection rates and improve resistance to emerging
fraud techniques.