Impact of Using GAN Generated Synthetic Data for the Classification of Chemical Foam in Low Data Availability Environments

Toon Stuyck, Eric Demeester

2024

Abstract

One of the main challenges of using machine learning in the chemical sector is a lack of high-quality labeled data. Data on certain events, e.g. an anomaly during a production process, can be extremely rare or very costly to generate. Even when data is available, it often requires highly trained observers to annotate it correctly. The performance of supervised classification algorithms can drop drastically when only limited training data is available. Data augmentation is typically used to increase the amount of available training data, but it carries a risk of overfitting or loss of information. In recent years, Generative Adversarial Networks (GANs) have been able to generate realistic-looking synthetic data, even from small amounts of training data. This paper demonstrates the feasibility of using GAN-generated synthetic data to improve classification results, via a comparison with and without standard augmentation methods such as scaling and rotation. It also proposes a methodology for combining original and synthetic data to achieve the best classifier result and for quantitatively verifying the generalization of the classifier using an explainable AI method. The proposed methodology compares favourably to using no augmentation or standard augmentation methods in the case of classification of chemical foam.
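
The abstract does not spell out how original and synthetic data are combined, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes the images and labels are already available as NumPy arrays and that the mix is controlled by a single ratio of synthetic to real samples. The function name `mix_real_and_synthetic` and the parameter `synth_ratio` are hypothetical.

```python
import numpy as np


def mix_real_and_synthetic(real_x, real_y, synth_x, synth_y, synth_ratio, seed=0):
    """Build a shuffled training set from real and synthetic samples.

    Keeps every real sample and adds round(synth_ratio * n_real) synthetic
    samples drawn without replacement (capped at the number available).
    This mixing rule is an illustrative assumption, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n_synth = min(len(synth_x), int(round(synth_ratio * len(real_x))))
    picked = rng.choice(len(synth_x), size=n_synth, replace=False)

    # Stack the real samples with the selected synthetic samples.
    x = np.concatenate([real_x, synth_x[picked]], axis=0)
    y = np.concatenate([real_y, synth_y[picked]], axis=0)

    # Shuffle so that real and synthetic samples are interleaved.
    order = rng.permutation(len(x))
    return x[order], y[order]


# Placeholder arrays standing in for 8x8 grayscale images (not real data).
real_x = np.zeros((20, 8, 8))
real_y = np.zeros(20, dtype=int)
synth_x = np.ones((50, 8, 8))
synth_y = np.ones(50, dtype=int)

train_x, train_y = mix_real_and_synthetic(real_x, real_y, synth_x, synth_y, synth_ratio=0.5)
```

Sweeping `synth_ratio` over a few values and scoring each resulting classifier on a held-out set of real images only would be one way to look for the best-performing mix under these assumptions.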


Paper Citation


in Harvard Style

Stuyck T. and Demeester E. (2024). Impact of Using GAN Generated Synthetic Data for the Classification of Chemical Foam in Low Data Availability Environments. In Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM; ISBN 978-989-758-684-2, SciTePress, pages 620-627. DOI: 10.5220/0012305300003654


in Bibtex Style

@conference{icpram24,
author={Toon Stuyck and Eric Demeester},
title={Impact of Using GAN Generated Synthetic Data for the Classification of Chemical Foam in Low Data Availability Environments},
booktitle={Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2024},
pages={620-627},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012305300003654},
isbn={978-989-758-684-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Impact of Using GAN Generated Synthetic Data for the Classification of Chemical Foam in Low Data Availability Environments
SN - 978-989-758-684-2
AU - Stuyck T.
AU - Demeester E.
PY - 2024
SP - 620
EP - 627
DO - 10.5220/0012305300003654
PB - SciTePress