Utilizing Generative Adversarial Networks for Preserving Privacy in Developing Machine Learning Models for the Healthcare Industry

Shahnawaz Khan, Bharavi Mishra, Sultan Alamri, Philippe Pringuet

2025

Abstract

Privacy preservation is a critical challenge when developing machine learning models on medical data. This research investigates the use of Generative Adversarial Networks (GANs) to generate synthetic medical datasets that preserve the properties of real-world datasets. It covers both tabular and image-based medical datasets, employing Conditional Tabular GANs (CTGANs) for tabular data (the Heart Disease Cleveland dataset) and Deep Convolutional GANs (DCGANs) for image data (a chest X-ray dataset). The primary purpose is to synthesize healthcare datasets that closely mimic the statistical properties and diagnostic relevance of their real-world counterparts while safeguarding patient privacy. The proposed research focuses on training GANs to learn complex patterns and dependencies within the data, enabling them to generate realistic synthetic samples that can be used for training machine learning models. The generated datasets have been evaluated using various classifiers. The results show that models trained on synthetic data achieve performance comparable to those trained on real data, demonstrating the efficacy of the approach in balancing data utility and privacy. Furthermore, this research explores techniques for further strengthening privacy preservation, including parameter tuning, differential privacy, and layer-wise perturbation. The findings suggest that GAN-based synthetic data generation offers a robust and versatile solution for privacy-preserving machine learning in medical applications.
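
The classifier-based evaluation described in the abstract can be illustrated with a minimal "train on synthetic, test on real" sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy two-feature dataset, the helper names, the nearest-centroid classifier, and above all the stand-in generator, which samples from per-class Gaussians fitted to the real data in place of a trained CTGAN or DCGAN. The stand-in offers no privacy guarantee; it only shows the shape of the comparison.

```python
import random
import statistics

def make_real_data(n_per_class, rng):
    """Toy stand-in for a real tabular dataset: two features, binary label."""
    rows = []
    for label, (mu0, mu1) in enumerate([(0.0, 0.0), (3.0, 3.0)]):
        for _ in range(n_per_class):
            rows.append(([rng.gauss(mu0, 1.0), rng.gauss(mu1, 1.0)], label))
    return rows

def fit_stand_in_generator(rows):
    """Fit per-class, per-feature mean and stdev (placeholder for a trained GAN)."""
    params = {}
    for label in {y for _, y in rows}:
        feats = [x for x, y in rows if y == label]
        params[label] = [(statistics.mean(c), statistics.stdev(c)) for c in zip(*feats)]
    return params

def sample_synthetic(params, n_per_class, rng):
    """Draw labeled synthetic rows from the fitted per-class Gaussians."""
    rows = []
    for label, col_params in sorted(params.items()):
        for _ in range(n_per_class):
            rows.append(([rng.gauss(m, s) for m, s in col_params], label))
    return rows

def fit_centroids(rows):
    """Nearest-centroid classifier: store the mean feature vector of each class."""
    centroids = {}
    for label in {y for _, y in rows}:
        feats = [x for x, y in rows if y == label]
        centroids[label] = [statistics.mean(c) for c in zip(*feats)]
    return centroids

def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(centroids, key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])))

def accuracy(centroids, rows):
    return sum(predict(centroids, x) == y for x, y in rows) / len(rows)

rng = random.Random(0)  # fixed seed so the comparison is repeatable
real_train = make_real_data(200, rng)
real_test = make_real_data(200, rng)  # held-out real data, used only for scoring
synthetic_train = sample_synthetic(fit_stand_in_generator(real_train), 200, rng)

acc_real = accuracy(fit_centroids(real_train), real_test)
acc_synth = accuracy(fit_centroids(synthetic_train), real_test)
print(f"trained on real:      {acc_real:.3f}")
print(f"trained on synthetic: {acc_synth:.3f}")
```

Both models are scored on the same held-out real rows, so the gap between the two accuracies is a rough proxy for how much utility the synthetic data retains. Measuring privacy requires a separate analysis that this sketch does not attempt.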


Paper Citation


in Harvard Style

Khan S., Mishra B., Alamri S. and Pringuet P. (2025). Utilizing Generative Adversarial Networks for Preserving Privacy in Developing Machine Learning Models for the Healthcare Industry. In Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA; ISBN 978-989-758-758-0, SciTePress, pages 544-551. DOI: 10.5220/0013567600003967


in Bibtex Style

@conference{data25,
author={Shahnawaz Khan and Bharavi Mishra and Sultan Alamri and Philippe Pringuet},
title={Utilizing Generative Adversarial Networks for Preserving Privacy in Developing Machine Learning Models for the Healthcare Industry},
booktitle={Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2025},
pages={544-551},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013567600003967},
isbn={978-989-758-758-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Utilizing Generative Adversarial Networks for Preserving Privacy in Developing Machine Learning Models for the Healthcare Industry
SN - 978-989-758-758-0
AU - Khan S.
AU - Mishra B.
AU - Alamri S.
AU - Pringuet P.
PY - 2025
SP - 544
EP - 551
DO - 10.5220/0013567600003967
PB - SciTePress