An Automated Approach for Creating Workload Models from Server Log Data

Fredrik Abbors, Dragos Truscan, Tanwir Ahmad

2014

Abstract

We present a tool-supported approach for creating workload models from historical web access log data. The resulting workload models are stochastic, represented as Probabilistic Timed Automata (PTA), and describe how users interact with the system. Such models allow one to analyze different user profiles and to mimic real user behavior as closely as possible when generating workload. We provide an experiment to validate the approach.
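
As a rough illustration of the idea in the abstract, the following minimal sketch derives transition probabilities and mean think times between user actions from access log lines. It is not the authors' tool or algorithm: the Common Log Format input, the one-session-per-host grouping, the names `LOG_RE`, `parse`, `build_model`, and the made-up `SAMPLE` lines are all assumptions for illustration, and the result is a simple probabilistic transition table rather than a full Probabilistic Timed Automaton.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed input: Common Log Format
#   host ident user [timestamp] "METHOD path PROTO" status size
LOG_RE = re.compile(r'(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" \d{3} \S+')


def parse(lines):
    """Yield (host, timestamp, action) for every line that matches LOG_RE."""
    for line in lines:
        m = LOG_RE.match(line)
        if m:
            host, ts, method, path = m.groups()
            when = datetime.strptime(ts, "%d/%b/%Y:%H:%M:%S %z")
            yield host, when, f"{method} {path}"


def build_model(lines):
    """Return {source: {target: (probability, mean_think_time_seconds)}}.

    Simplification (assumption): all requests from one host form one session.
    """
    per_host = defaultdict(list)
    for host, when, action in parse(lines):
        per_host[host].append((when, action))

    counts = defaultdict(lambda: defaultdict(int))
    delays = defaultdict(lambda: defaultdict(float))
    for events in per_host.values():
        events.sort()  # order each host's requests by time
        for (t0, a0), (t1, a1) in zip(events, events[1:]):
            counts[a0][a1] += 1
            delays[a0][a1] += (t1 - t0).total_seconds()

    model = {}
    for src, targets in counts.items():
        total = sum(targets.values())
        model[src] = {
            dst: (n / total, delays[src][dst] / n) for dst, n in targets.items()
        }
    return model


# Made-up log lines, for illustration only.
SAMPLE = [
    '10.0.0.1 - - [10/Oct/2014:13:55:00 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.1 - - [10/Oct/2014:13:55:04 +0000] "GET /browse HTTP/1.1" 200 2048',
    '10.0.0.2 - - [10/Oct/2014:13:55:10 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.2 - - [10/Oct/2014:13:55:12 +0000] "GET /cart HTTP/1.1" 200 1024',
]

if __name__ == "__main__":
    for src, targets in build_model(SAMPLE).items():
        for dst, (prob, think) in targets.items():
            print(f"{src} -> {dst}: p={prob:.2f}, mean think time={think:.1f}s")
```

A load generator could then walk such a table, picking the next action according to the probabilities and waiting for the associated think time before sending it. Clustering sessions into distinct user profiles, which the abstract mentions, is left out of this sketch.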



Paper Citation


in Harvard Style

Abbors F., Truscan D. and Ahmad T. (2014). An Automated Approach for Creating Workload Models from Server Log Data. In Proceedings of the 9th International Conference on Software Engineering and Applications - Volume 1: ICSOFT-EA, (ICSOFT 2014) ISBN 978-989-758-036-9, pages 14-25. DOI: 10.5220/0005002200140025


in Bibtex Style

@conference{icsoft-ea14,
author={Fredrik Abbors and Dragos Truscan and Tanwir Ahmad},
title={An Automated Approach for Creating Workload Models from Server Log Data},
booktitle={Proceedings of the 9th International Conference on Software Engineering and Applications - Volume 1: ICSOFT-EA, (ICSOFT 2014)},
year={2014},
pages={14-25},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005002200140025},
isbn={978-989-758-036-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Conference on Software Engineering and Applications - Volume 1: ICSOFT-EA, (ICSOFT 2014)
TI - An Automated Approach for Creating Workload Models from Server Log Data
SN - 978-989-758-036-9
AU - Abbors F.
AU - Truscan D.
AU - Ahmad T.
PY - 2014
SP - 14
EP - 25
DO - 10.5220/0005002200140025