Exploring the Impact of Data Heterogeneity in Federated Learning for Fraud Detection
Zhiqiu Wang
2024
Abstract
As credit card use grows, credit card fraud is also increasing and has become a pressing problem. This study evaluates the overall effectiveness of three Machine Learning (ML) methods, proposes a federated learning algorithm that integrates each of the three methods separately, and examines how the algorithms perform under varying degrees of data heterogeneity. The study uses a Kaggle dataset containing about 550,000 credit card transactions made by cardholders across Europe. The K-means algorithm is used to simulate different degrees of data heterogeneity, and Logistic Regression, Decision Tree, and Random Forest are each embedded in the federated learning framework. Each model is then applied to the data at each degree of heterogeneity to identify fraudulent credit card transactions. The results show that federated learning algorithms still face challenges when the data are strongly heterogeneous. Logistic Regression and Decision Tree perform more stably, while the performance of Random Forest is more volatile.
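The pipeline described in the abstract (K-means to create heterogeneous client data, then a model trained inside a federated loop) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic two-feature data stands in for the Kaggle transactions, the mixing parameter `alpha` is one hypothetical way to dial heterogeneity up or down, and plain federated averaging of Logistic Regression weights stands in for the paper's algorithm. All function names are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def assign_clients(labels, k, alpha, seed=0):
    """Assumed heterogeneity knob: alpha=1.0 gives one cluster per client
    (strongly non-IID); alpha=0.0 assigns samples uniformly at random (IID)."""
    rng = random.Random(seed)
    return [lab if rng.random() < alpha else rng.randrange(k) for lab in labels]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def local_train(w, X, y, lr=0.1, epochs=5):
    """A few full-batch gradient steps of logistic regression on one client."""
    w = list(w)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(a * b for a, b in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj / len(X)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def fedavg(clients, dim, rounds=20):
    """Federated averaging: size-weighted mean of locally trained weights."""
    w = [0.0] * dim
    total = sum(len(X) for X, _ in clients)
    for _ in range(rounds):
        updates = [(local_train(w, X, y), len(X)) for X, y in clients]
        w = [sum(u[j] * n for u, n in updates) / total for j in range(dim)]
    return w


# Synthetic stand-in data (assumption): two Gaussian blobs, label 1 = "fraud".
_rng = random.Random(42)
points, y = [], []
for i in range(300):
    label = i % 2
    mu = 2.0 if label else -2.0
    points.append((_rng.gauss(mu, 1.0), _rng.gauss(mu, 1.0)))
    y.append(label)
X = [(px, py, 1.0) for px, py in points]  # constant 1.0 acts as the bias term


def build_clients(alpha, k=3):
    """Split (X, y) into up to k client datasets using K-means plus mixing."""
    owner = assign_clients(kmeans(points, k), k, alpha)
    groups = [([], []) for _ in range(k)]
    for xi, yi, o in zip(X, y, owner):
        groups[o][0].append(xi)
        groups[o][1].append(yi)
    return [g for g in groups if g[0]]  # drop empty clients


def accuracy(w):
    hits = sum((sigmoid(sum(a * b for a, b in zip(w, xi))) >= 0.5) == bool(yi)
               for xi, yi in zip(X, y))
    return hits / len(X)


for alpha in (0.0, 0.5, 1.0):
    w = fedavg(build_clients(alpha), dim=3)
    print(f"alpha={alpha}: accuracy={accuracy(w):.3f}")
```

On data this cleanly separable the accuracy may barely move with `alpha`; the sketch only shows the mechanics of partitioning and aggregation and does not reproduce the paper's results. Tree-based models such as Decision Tree and Random Forest have no weight vector to average, so they would need a different aggregation scheme than the one assumed here.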
Paper Citation
in Harvard Style
Wang Z. (2024). Exploring the Impact of Data Heterogeneity in Federated Learning for Fraud Detection. In Proceedings of the 2nd International Conference on Data Analysis and Machine Learning - Volume 1: DAML; ISBN 978-989-758-754-2, SciTePress, pages 418-422. DOI: 10.5220/0013525200004619
in Bibtex Style
@conference{daml24,
author={Zhiqiu Wang},
title={Exploring the Impact of Data Heterogeneity in Federated Learning for Fraud Detection},
booktitle={Proceedings of the 2nd International Conference on Data Analysis and Machine Learning - Volume 1: DAML},
year={2024},
pages={418-422},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013525200004619},
isbn={978-989-758-754-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 2nd International Conference on Data Analysis and Machine Learning - Volume 1: DAML
TI - Exploring the Impact of Data Heterogeneity in Federated Learning for Fraud Detection
SN - 978-989-758-754-2
AU - Wang Z.
PY - 2024
SP - 418
EP - 422
DO - 10.5220/0013525200004619
PB - SciTePress