Process Mining Monitoring for Map Reduce Applications in the Cloud

Federico Chesani, Anna Ciampolini, Daniela Loreti, Paola Mello

Abstract

The adoption of mobile devices and sensors, together with the Internet of Things trend, is making available a huge quantity of information that needs to be analyzed. Distributed architectures such as Map Reduce provide technical answers to the challenge of processing these big data. Due to the distributed nature of these solutions, it can be difficult to guarantee the Quality of Service: for example, it might not be possible to ensure that processing tasks are completed within a temporal deadline, because of specificities of the infrastructure or of the processed data itself. However, by relying on cloud infrastructures, distributed applications for data processing can easily be provided with additional resources, for example through the dynamic provisioning of computational nodes. In this paper, we focus on the step of monitoring Map Reduce applications, to detect situations where additional resources are needed to meet the deadlines. To this end, we exploit techniques and tools developed in the research field of Business Process Management: in particular, we focus on declarative languages and tools for monitoring the execution of business processes. We introduce a distributed architecture in which a logic-based monitor is able to detect possible delays and trigger recovery actions such as the dynamic provisioning of further resources.
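The monitoring idea summarized above (observe task events, detect a possible delay against a deadline, trigger a recovery action) can be illustrated with a minimal sketch. This is not the logic-based monitor the paper describes: it is a plain-Python stand-in that uses a simple linear extrapolation, and the names TaskEvent, DeadlineMonitor, and on_delay are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TaskEvent:
    """Completion of one map or reduce task (hypothetical event shape)."""
    task_id: str
    finished_at: float  # seconds elapsed since the job started


class DeadlineMonitor:
    """Projects the job completion time from observed task completions.

    Assumption: remaining tasks complete at the average rate seen so far.
    A real monitor would use a richer model of the job.
    """

    def __init__(self, total_tasks: int, deadline: float,
                 on_delay: Callable[[float], None]) -> None:
        self.total_tasks = total_tasks
        self.deadline = deadline
        self.on_delay = on_delay  # recovery action, e.g. request more nodes
        self.completed = 0

    def observe(self, event: TaskEvent) -> None:
        self.completed += 1
        # Linear extrapolation of the finish time over all tasks.
        projected = event.finished_at * self.total_tasks / self.completed
        if projected > self.deadline:
            self.on_delay(projected)


# Usage: record each projected finish time that exceeds the deadline.
requests: list = []
monitor = DeadlineMonitor(total_tasks=10, deadline=100.0,
                          on_delay=requests.append)
monitor.observe(TaskEvent("map-0", 8.0))   # projection within the deadline
monitor.observe(TaskEvent("map-1", 30.0))  # projection exceeds the deadline
```

In this sketch the recovery action is an arbitrary callback, so the detection step stays independent of how extra resources are actually provisioned.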


Paper Citation


in Harvard Style

Chesani F., Ciampolini A., Loreti D. and Mello P. (2016). Process Mining Monitoring for Map Reduce Applications in the Cloud. In Proceedings of the 6th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER, ISBN 978-989-758-182-3, pages 95-105. DOI: 10.5220/0005864000950105


in Bibtex Style

@conference{closer16,
author={Federico Chesani and Anna Ciampolini and Daniela Loreti and Paola Mello},
title={Process Mining Monitoring for Map Reduce Applications in the Cloud},
booktitle={Proceedings of the 6th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER},
year={2016},
pages={95-105},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005864000950105},
isbn={978-989-758-182-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 6th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER
TI - Process Mining Monitoring for Map Reduce Applications in the Cloud
SN - 978-989-758-182-3
AU - Chesani F.
AU - Ciampolini A.
AU - Loreti D.
AU - Mello P.
PY - 2016
SP - 95
EP - 105
DO - 10.5220/0005864000950105