PRACTICAL EXAMPLES OF GPU COMPUTING OPTIMIZATION PRINCIPLES

Patrik Goorts, Sammy Rogmans, Steven Vanden Eynde, Philippe Bekaert

2010

Abstract

In this paper, we provide examples of how to optimize signal processing or visual computing algorithms written for SIMT-based GPU architectures. These implementations demonstrate the optimizations for CUDA or its successors, OpenCL and DirectCompute. We discuss the effect and optimization principles of memory coalescing, bandwidth reduction, processor occupancy, bank conflict reduction, local memory elimination and instruction optimization. The effect of the optimization steps is illustrated by state-of-the-art examples, and a comparison of optimized and unoptimized algorithms is provided. A first example discusses the construction of joint histograms using shared memory, where optimizations lead to a significant speedup compared to the original implementation. A second example presents convolution and the results obtained.
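The joint-histogram example can be illustrated with a small CPU-side sketch. It is not the authors' kernel: it only emulates, in plain Python, the general pattern of accumulating a private sub-histogram per thread block (the role shared memory plays on a GPU) and merging it into the global histogram afterwards. The function names, the flattened `bins * bins` layout and the `block_size` parameter are assumptions made for illustration.

```python
def joint_histogram_blocked(img_a, img_b, bins, block_size):
    """Joint histogram of two equally sized intensity sequences.

    Emulates on the CPU the pattern of one private sub-histogram per
    thread block (shared memory on a GPU) that is merged into the
    global result once the block is done. Hypothetical helper.
    """
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same number of pixels")
    # Global result, flattened as bins x bins (row = intensity in img_a).
    global_hist = [0] * (bins * bins)
    for start in range(0, len(img_a), block_size):
        # Private histogram of this "block"; stands in for shared memory.
        local_hist = [0] * (bins * bins)
        block_a = img_a[start:start + block_size]
        block_b = img_b[start:start + block_size]
        for a, b in zip(block_a, block_b):
            local_hist[a * bins + b] += 1
        # Merge step: one global update per bin instead of one per pixel.
        for i, count in enumerate(local_hist):
            global_hist[i] += count
    return global_hist


def joint_histogram_direct(img_a, img_b, bins):
    """Reference version: every pixel updates the global histogram."""
    hist = [0] * (bins * bins)
    for a, b in zip(img_a, img_b):
        hist[a * bins + b] += 1
    return hist


# Two tiny "images" with intensities in the range 0..3.
img_a = [0, 1, 1, 2, 3, 3, 0, 2]
img_b = [0, 1, 2, 2, 3, 3, 0, 1]
blocked = joint_histogram_blocked(img_a, img_b, bins=4, block_size=3)
direct = joint_histogram_direct(img_a, img_b, bins=4)
```

Both functions produce the same counts. On a GPU the blocked variant would typically be preferred because most updates stay in fast on-chip memory, although the speedup actually achieved depends on the hardware and on the bin count.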



Paper Citation


in Harvard Style

Goorts P., Rogmans S., Vanden Eynde S. and Bekaert P. (2010). PRACTICAL EXAMPLES OF GPU COMPUTING OPTIMIZATION PRINCIPLES. In Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2010) ISBN 978-989-8425-19-5, pages 46-49. DOI: 10.5220/0002990400460049


in Bibtex Style

@conference{sigmap10,
author={Patrik Goorts and Sammy Rogmans and Steven Vanden Eynde and Philippe Bekaert},
title={PRACTICAL EXAMPLES OF GPU COMPUTING OPTIMIZATION PRINCIPLES},
booktitle={Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2010)},
year={2010},
pages={46-49},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002990400460049},
isbn={978-989-8425-19-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2010)
TI - PRACTICAL EXAMPLES OF GPU COMPUTING OPTIMIZATION PRINCIPLES
SN - 978-989-8425-19-5
AU - Goorts P.
AU - Rogmans S.
AU - Vanden Eynde S.
AU - Bekaert P.
PY - 2010
SP - 46
EP - 49
DO - 10.5220/0002990400460049