TASKWORK: A Cloud-aware Runtime System for Elastic Task-parallel HPC Applications

Stefan Kehrer, Wolfgang Blochinger

2019

Abstract

With the capability of employing virtually unlimited compute resources, the cloud has evolved into an attractive execution environment for applications from the High Performance Computing (HPC) domain. By means of elastic scaling, compute resources can be provisioned and decommissioned at runtime. This gives rise to a new concept in HPC: elasticity of parallel computations. However, it is still an open research question to what extent HPC applications can benefit from elastic scaling and how to leverage elasticity of parallel computations. In this paper, we discuss how to address these challenges for HPC applications with dynamic task parallelism and present TASKWORK, a cloud-aware runtime system based on our findings. TASKWORK enables the implementation of elastic HPC applications by means of higher-level development frameworks and solves the corresponding coordination problems based on Apache ZooKeeper. For evaluation purposes, we discuss a development framework for parallel branch-and-bound based on TASKWORK, show how to implement an elastic HPC application, and report on measurements with respect to parallel efficiency and elastic scaling.

Paper Citation


in Harvard Style

Kehrer S. and Blochinger W. (2019). TASKWORK: A Cloud-aware Runtime System for Elastic Task-parallel HPC Applications. In Proceedings of the 9th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER, ISBN 978-989-758-365-0, pages 198-209. DOI: 10.5220/0007795501980209


in Bibtex Style

@conference{closer19,
author={Stefan Kehrer and Wolfgang Blochinger},
title={TASKWORK: A Cloud-aware Runtime System for Elastic Task-parallel HPC Applications},
booktitle={Proceedings of the 9th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER},
year={2019},
pages={198-209},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007795501980209},
isbn={978-989-758-365-0},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 9th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER
TI - TASKWORK: A Cloud-aware Runtime System for Elastic Task-parallel HPC Applications
SN - 978-989-758-365-0
AU - Kehrer S.
AU - Blochinger W.
PY - 2019
SP - 198
EP - 209
DO - 10.5220/0007795501980209