Authors: César Acevedo ; Porfidio Hernandez ; Antonio Espinosa and Víctor Méndez

Affiliation: Univ. Autonoma de Barcelona, Spain

ISBN: 978-989-758-181-6

ISSN: 2184-5034

Keyword(s): Multiple, Multiworkflows, Data-Aware, Cluster.

Related Ontology Subjects/Areas/Topics: Bioinformatics ; Biomedical Engineering ; Biomedical Signal Processing

Abstract: Previous scheduling research has largely been based on analysing the computation time of application workflows. Current cluster usage involves executing multiworkflows that may share applications and input files. To reduce the makespan of such multiworkflows, adequate data allocation policies should be applied to lower input data latency. We propose a scheduling strategy for multiworkflows that takes into account where shared input files reside in the cluster's storage system. We first merge all the workflows under study and evaluate the resulting global design pattern. We then apply a classic list scheduling heuristic that considers the location of the input files in the storage system in order to reduce the communication overhead of the applications. We evaluated our proposal on an initial set of experimental environments, with promising results of up to 20% makespan improvement.
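
The two steps described in the abstract (merging the workflows, then list scheduling with knowledge of input file locations) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say which list scheduling heuristic is used, so the sketch assumes a HEFT-style upward rank as the task priority. The task format, the function names (`merge_workflows`, `upward_rank`, `schedule`), the fixed per-file `transfer_cost`, and the example tasks, nodes and files are all hypothetical.

```python
from collections import defaultdict


def merge_workflows(workflows):
    """Merge workflows into one DAG; tasks with the same name are treated as shared."""
    merged = {}
    for wf in workflows:
        for name, task in wf.items():
            entry = merged.setdefault(
                name, {"deps": set(), "inputs": set(), "cost": task["cost"]}
            )
            entry["deps"] |= set(task["deps"])
            entry["inputs"] |= set(task["inputs"])
    return merged


def upward_rank(dag):
    """Priority of a task: its cost plus the costliest path to an exit task."""
    children = defaultdict(set)
    for name, task in dag.items():
        for dep in task["deps"]:
            children[dep].add(name)
    memo = {}

    def rank(name):
        if name not in memo:
            below = max((rank(c) for c in children[name]), default=0)
            memo[name] = dag[name]["cost"] + below
        return memo[name]

    return {name: rank(name) for name in dag}


def schedule(dag, nodes, file_location, transfer_cost):
    """Data-aware list scheduling (assumed cost model, strictly positive task costs).

    Tasks are visited by decreasing upward rank, which is a valid topological
    order when all costs are positive. Each task goes to the node with the
    earliest finish time, paying transfer_cost for every input file that is
    not stored on that node.
    """
    ranks = upward_rank(dag)
    node_free = {n: 0.0 for n in nodes}
    finish, placement = {}, {}
    for name in sorted(dag, key=lambda t: -ranks[t]):
        task = dag[name]
        ready = max((finish[d] for d in task["deps"]), default=0.0)
        best = None
        for node in nodes:
            remote = sum(1 for f in task["inputs"] if file_location.get(f) != node)
            end = max(ready, node_free[node]) + remote * transfer_cost + task["cost"]
            if best is None or end < best[0]:
                best = (end, node)
        finish[name], placement[name] = best
        node_free[best[1]] = best[0]
    return placement, max(finish.values())


# Hypothetical example: two workflows sharing the task "align" and the file "ref.fa".
wf1 = {"align": {"deps": [], "inputs": ["ref.fa"], "cost": 4},
       "call": {"deps": ["align"], "inputs": [], "cost": 2}}
wf2 = {"align": {"deps": [], "inputs": ["ref.fa"], "cost": 4},
       "stats": {"deps": ["align"], "inputs": ["ref.fa"], "cost": 1}}
merged = merge_workflows([wf1, wf2])
placement, makespan = schedule(merged, ["n1", "n2"], {"ref.fa": "n2"}, transfer_cost=3)
```

In this toy setup the shared task appears once in the merged graph, and tasks that read `ref.fa` are drawn to the node that already stores it, because the other node would pay the transfer penalty. A real cluster would need measured transfer and execution times in place of these constants.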


