Utilizing Generative Adversarial Networks for Preserving Privacy in Developing Machine Learning Models for the Healthcare Industry
Shahnawaz Khan, Bharavi Mishra, Sultan Alamri, Philippe Pringuet
2025
Abstract
Privacy preservation is a critical challenge when developing machine learning models on medical data. This research investigates the application of Generative Adversarial Networks (GANs) to generate synthetic medical datasets that preserve the properties of the real-world datasets. It covers both tabular and image-based medical data, employing Conditional Tabular GANs (CTGANs) for tabular data (the Heart Disease Cleveland dataset) and Deep Convolutional GANs (DCGANs) for image data (a chest X-ray dataset). The primary purpose is to synthesize healthcare datasets that closely mimic the statistical properties and diagnostic relevance of their real-world counterparts while safeguarding patient privacy. The proposed research focuses on training GANs to learn complex patterns and dependencies within the data, enabling them to generate realistic synthetic samples that can be used for training machine learning models. The generated datasets have been evaluated using various classifiers. The results demonstrate that models trained on synthetic data achieve performance comparable to those trained on real data, indicating that the approach balances data utility and privacy. Furthermore, this research explores techniques for privacy enhancement, including parameter tuning, differential privacy, and layer-wise perturbation, to further strengthen privacy preservation. The findings suggest that GAN-based synthetic data generation offers a robust and versatile solution for privacy-preserving machine learning in medical applications.
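The evaluation idea described in the abstract (train a classifier on synthetic data, then compare it with one trained on real data) can be illustrated with a minimal sketch. Everything below is an assumption for illustration and not the authors' pipeline: the toy two-feature dataset, the nearest-centroid classifier, and especially `sample_synthetic`, which is a simple per-class Gaussian sampler standing in for a trained CTGAN so that the example runs with the Python standard library alone.

```python
import random
import statistics


def make_real_data(n, rng):
    """Toy stand-in for a real tabular dataset: two features, binary label."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label == 1 else -2.0
        rows.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return rows


def fit_centroids(rows):
    """Nearest-centroid classifier: store the per-class feature means."""
    centroids = {}
    for label in {y for _, y in rows}:
        feats = [x for x, y in rows if y == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*feats)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, rows):
    return sum(predict(centroids, x) == y for x, y in rows) / len(rows)


def sample_synthetic(rows, n, rng):
    """Placeholder generator (NOT a GAN): independent per-class Gaussians
    fitted to the real rows. A trained CTGAN would take this role."""
    by_label = {}
    for x, y in rows:
        by_label.setdefault(y, []).append(x)
    labels = sorted(by_label)
    stats = {
        y: [(statistics.fmean(col), statistics.pstdev(col)) for col in zip(*xs)]
        for y, xs in by_label.items()
    }
    out = []
    for _ in range(n):
        y = rng.choice(labels)
        out.append(([rng.gauss(m, s) for m, s in stats[y]], y))
    return out


rng = random.Random(0)
real_train = make_real_data(400, rng)
real_test = make_real_data(200, rng)  # held-out real data, never seen by the generator
synthetic_train = sample_synthetic(real_train, 400, rng)

# Both models are scored on the same held-out real test set.
acc_real = accuracy(fit_centroids(real_train), real_test)
acc_synth = accuracy(fit_centroids(synthetic_train), real_test)
print(f"train on real:      {acc_real:.3f}")
print(f"train on synthetic: {acc_synth:.3f}")
```

Scoring both models on held-out real data is the point of the comparison: a small gap between the two accuracies suggests the synthetic data retains the signal a classifier needs. The sketch says nothing about privacy on its own, which would need a separate assessment.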
Paper Citation
in Harvard Style
Khan S., Mishra B., Alamri S. and Pringuet P. (2025). Utilizing Generative Adversarial Networks for Preserving Privacy in Developing Machine Learning Models for the Healthcare Industry. In Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA; ISBN 978-989-758-758-0, SciTePress, pages 544-551. DOI: 10.5220/0013567600003967
in Bibtex Style
@conference{data25,
author={Shahnawaz Khan and Bharavi Mishra and Sultan Alamri and Philippe Pringuet},
title={Utilizing Generative Adversarial Networks for Preserving Privacy in Developing Machine Learning Models for the Healthcare Industry},
booktitle={Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2025},
pages={544-551},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013567600003967},
isbn={978-989-758-758-0},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Utilizing Generative Adversarial Networks for Preserving Privacy in Developing Machine Learning Models for the Healthcare Industry
SN - 978-989-758-758-0
AU - Khan S.
AU - Mishra B.
AU - Alamri S.
AU - Pringuet P.
PY - 2025
SP - 544
EP - 551
DO - 10.5220/0013567600003967
PB - SciTePress