Authors:
Gargi Alavani and Santonu Sarkar
Affiliation:
Dept. of CSIS, BITS Pilani K. K. Birla Goa Campus, India
Keyword(s):
Microbenchmarking, GPU Computing, CUDA, Performance.
Abstract:
While GPUs are popular for High-Performance Computing (HPC) applications, the available literature is inadequate for understanding the architectural characteristics and quantifying the performance parameters of NVIDIA GPUs. This paper proposes “Inspect-GPU”, a software tool that uses a set of novel, architecture-agnostic microbenchmarks and a set of architecture-specific regression models to quantify the instruction latency, peakwarp, and throughput of a CUDA kernel for a particular NVIDIA GPU architecture. Though memory access is critical for GPU performance, memory instruction execution details, such as runtime throughput, are not revealed. We have developed a memory throughput model that provides crucial, previously unpublished insights, and Inspect-GPU builds this model for a particular GPU architecture. Inspect-GPU has been tested on multiple GPU architectures: Kepler, Maxwell, Pascal, and Volta. We have demonstrated the efficacy of our approach by comparing it with two popular performance analysis models. Using the results from Inspect-GPU, developers can analyze their CUDA applications, apply optimizations, and model a GPU architecture and its performance.