Variational Autoencoders for Pedestrian Synthetic Data Augmentation of Existing Datasets: A Preliminary Investigation

Ivan Nikolov

2024

Abstract

The growing demand for training data for deep learning surveillance and object detection models has slowed deployment and raised the costs of dataset gathering, annotation, and testing. Synthetic data can help by providing more varied scenarios without requiring manual annotation. We present initial exploratory work on generating synthetic pedestrian augmentations for an existing dataset using variational autoencoders. Our method creates a large number of backgrounds and trains a variational autoencoder on a small subset of annotated pedestrians. We then interpolate in the latent space of the autoencoder to generate variations of these pedestrians, calculate their positions on the backgrounds, and blend them in to create new images. Although the results do not match those obtained by simply adding more real images, we show that a YOLOv5 model trained on a mix of real images and a small amount of synthetic images gains in performance and robustness. We also propose next steps to extend this approach and make it useful for a wider array of datasets.
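As a rough illustration of two steps named in the abstract, latent-space interpolation and blending a generated pedestrian onto a background, the following is a minimal NumPy sketch. It is not the paper's implementation: the function names `interpolate_latents` and `alpha_blend`, the choice of linear interpolation, and the soft-mask alpha blend are all assumptions made for this example, and the trained VAE decoder that would turn latent vectors into pedestrian patches is left out.

```python
import numpy as np


def interpolate_latents(z_a, z_b, steps):
    """Linearly interpolate between two latent vectors, endpoints included.

    Hypothetical helper: the abstract only says the latent space is
    interpolated; linear interpolation is an assumption.
    """
    t = np.linspace(0.0, 1.0, steps).reshape(-1, 1)
    return (1.0 - t) * z_a + t * z_b


def alpha_blend(background, patch, mask, top, left):
    """Blend `patch` onto a copy of `background` at (top, left).

    `mask` holds per-pixel weights in [0, 1] with the same height and
    width as `patch`. Hypothetical helper: the blending method used in
    the paper is not described in the abstract.
    """
    out = background.astype(np.float32)  # astype returns a new array
    h, w = patch.shape[:2]
    region = out[top:top + h, left:left + w]
    alpha = mask[..., None]  # broadcast the mask over colour channels
    out[top:top + h, left:left + w] = alpha * patch + (1.0 - alpha) * region
    return out.astype(background.dtype)


# Usage with placeholder data in place of real encodings and images.
z_start = np.zeros(4)
z_end = np.ones(4)
latents = interpolate_latents(z_start, z_end, steps=5)

background = np.zeros((8, 8, 3), dtype=np.uint8)
patch = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.ones((2, 2), dtype=np.float32)
blended = alpha_blend(background, patch, mask, top=1, left=1)
```

In a full pipeline each row of `latents` would be passed through the decoder of the trained autoencoder to obtain a pedestrian patch, and the paste position would come from the position calculation the abstract mentions. Both are outside the scope of this sketch.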

Paper Citation


in Harvard Style

Nikolov I. (2024). Variational Autoencoders for Pedestrian Synthetic Data Augmentation of Existing Datasets: A Preliminary Investigation. In Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP; ISBN 978-989-758-679-8, SciTePress, pages 829-836. DOI: 10.5220/0012570700003660


in Bibtex Style

@conference{visapp24,
author={Ivan Nikolov},
title={Variational Autoencoders for Pedestrian Synthetic Data Augmentation of Existing Datasets: A Preliminary Investigation},
booktitle={Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP},
year={2024},
pages={829-836},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012570700003660},
isbn={978-989-758-679-8},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP
TI - Variational Autoencoders for Pedestrian Synthetic Data Augmentation of Existing Datasets: A Preliminary Investigation
SN - 978-989-758-679-8
AU - Nikolov I.
PY - 2024
SP - 829
EP - 836
DO - 10.5220/0012570700003660
PB - SciTePress