A Simulation-based Scheduling Strategy for Scientific Workflows

Sergio Hernández, Javier Fabra, Pedro Álvarez, Joaquín Ezpeleta

2012

Abstract

Grid computing infrastructures have recently emerged as computing environments able to manage heterogeneous and geographically distributed resources, which makes them well suited to the deployment and execution of scientific workflows. An emerging topic in this discipline is the use of simulation environments to improve the scheduling process and the overall execution requirements. In this work, a simulation component based on realistic workload usage is presented and integrated into a framework for the flexible deployment of scientific workflows in Grid environments. This framework allows researchers to work simultaneously with different, heterogeneous Grid middlewares in a transparent way and provides a high level of abstraction for developing their workflows. The approach presented here makes it possible to model and simulate different computing infrastructures, which helps in the scheduling process and improves deployment and execution requirements in terms such as performance, resource usage, and cost. As a use case, the Inspiral analysis workflow is executed on two different computing infrastructures, reducing the overall execution cost.



Paper Citation


in Harvard Style

Hernández S., Fabra J., Álvarez P. and Ezpeleta J. (2012). A Simulation-based Scheduling Strategy for Scientific Workflows. In Proceedings of the 2nd International Conference on Simulation and Modeling Methodologies, Technologies and Applications - Volume 1: SIMULTECH, ISBN 978-989-8565-20-4, pages 61-70. DOI: 10.5220/0004061200610070


in Bibtex Style

@conference{simultech12,
author={Sergio Hernández and Javier Fabra and Pedro Álvarez and Joaquín Ezpeleta},
title={A Simulation-based Scheduling Strategy for Scientific Workflows},
booktitle={Proceedings of the 2nd International Conference on Simulation and Modeling Methodologies, Technologies and Applications - Volume 1: SIMULTECH},
year={2012},
pages={61-70},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004061200610070},
isbn={978-989-8565-20-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Conference on Simulation and Modeling Methodologies, Technologies and Applications - Volume 1: SIMULTECH
TI - A Simulation-based Scheduling Strategy for Scientific Workflows
SN - 978-989-8565-20-4
AU - Hernández S.
AU - Fabra J.
AU - Álvarez P.
AU - Ezpeleta J.
PY - 2012
SP - 61
EP - 70
DO - 10.5220/0004061200610070