Towards Automatic Block Size Tuning for Image Processing Algorithms on CUDA

Imene Guerfi, Lobna Kriaa, Leila Saidane

2022

Abstract

With the growing amount of data, computational power has become a requirement in every field, and GPUs appear to be the appropriate way to meet it. One of their major drawbacks, however, is that their architectures vary, which makes writing efficient parallel code very challenging because it requires mastering the GPU’s low-level design. CUDA gives the programmer more flexibility to exploit the GPU’s power with ease, but tuning the launch parameters of its kernels, such as the block size, remains a daunting task: tuning this parameter well requires a deep understanding of the architecture and the execution model. In the Viola-Jones algorithm in particular, the block size is an important factor in the execution time, yet this optimization aspect is not well explored. This paper offers the first steps toward automatically tuning the block size for any input without deep knowledge of the hardware architecture, which makes performance portable across different GPU architectures. The main idea is to define techniques for finding the block size that achieves the best performance. We point out the impact on overall performance of using a static block size for all input sizes. In light of these findings, we present two dynamic approaches that select the block size best suited to the input size. The first is based on an empirical search; it provides the optimal performance, but it is demanding for the programmer and time-consuming to deploy. To overcome this issue, we propose a second approach, a model that selects a block size automatically. Experimental results show that this model can improve the execution time by up to 2.5x over the static approach.
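
To make the empirical-search idea concrete, here is a minimal host-side sketch in C++. It is not the authors' implementation. The function names and the `measure_ms` callback are hypothetical; the callback stands in for code that launches the kernel with a given block size on a given input and returns the elapsed time. The 32-thread warp size and the 1024 threads-per-block limit are assumptions that hold on current CUDA GPUs but should be queried from the device in real code.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <vector>

// Candidate block sizes: multiples of the warp size up to the per-block
// thread limit (both values are assumptions, see the text above).
std::vector<int> candidate_block_sizes(int warp = 32, int max_threads = 1024) {
    std::vector<int> sizes;
    for (int bs = warp; bs <= max_threads; bs += warp) sizes.push_back(bs);
    return sizes;
}

// Number of blocks needed to cover n elements (ceiling division).
int grid_size_for(int n, int block_size) {
    return (n + block_size - 1) / block_size;
}

// Empirical search: measure every candidate and keep the fastest one.
// measure_ms is a hypothetical callback that runs the kernel with the given
// block size and returns the elapsed time in milliseconds.
// Returns -1 if there are no candidates.
int best_block_size(const std::vector<int>& candidates,
                    const std::function<double(int)>& measure_ms) {
    int best = -1;
    double best_ms = std::numeric_limits<double>::infinity();
    for (int bs : candidates) {
        double ms = measure_ms(bs);
        if (ms < best_ms) {
            best_ms = ms;
            best = bs;
        }
    }
    return best;
}
```

Because the search has to be repeated for every input size and every GPU, its cost grows with the number of candidates, which is the deployment burden that motivates replacing it with a model.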

Paper Citation


in Harvard Style

Guerfi I., Kriaa L. and Saidane L. (2022). Towards Automatic Block Size Tuning for Image Processing Algorithms on CUDA. In Proceedings of the 17th International Conference on Software Technologies - Volume 1: ICSOFT, ISBN 978-989-758-588-3, pages 591-601. DOI: 10.5220/0011314800003266


in Bibtex Style

@conference{icsoft22,
author={Imene Guerfi and Lobna Kriaa and Leila Saidane},
title={Towards Automatic Block Size Tuning for Image Processing Algorithms on CUDA},
booktitle={Proceedings of the 17th International Conference on Software Technologies - Volume 1: ICSOFT},
year={2022},
pages={591-601},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011314800003266},
isbn={978-989-758-588-3},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 17th International Conference on Software Technologies - Volume 1: ICSOFT
TI - Towards Automatic Block Size Tuning for Image Processing Algorithms on CUDA
SN - 978-989-758-588-3
AU - Guerfi I.
AU - Kriaa L.
AU - Saidane L.
PY - 2022
SP - 591
EP - 601
DO - 10.5220/0011314800003266