ARCHCOLLECT FRONT-END - A Web usage data mining knowledge acquisition mechanism focused on static or dynamic contenting applications

Joubert de Castro Lima, Ahmed Ali Abdalla Esmin, Juvêncio Geraldo de Moura, Bruno Ferreira, Tiago Garcia de Senna Carneiro

2004

Abstract

A knowledge acquisition mechanism is essential to every Web usage mining project, and it can be implemented on the user side or in any server configuration. This paper presents a loosely coupled acquisition mechanism that focuses on users’ interactions, is associated with semantic data, can be bound to almost all markup languages, and is independent of the monitored application’s layout. The mechanism acquires knowledge only from the Web browser. It separates the requests, one for the monitored application and the other for a server called ArchCollect, and it has a parser that automatically inserts the knowledge acquisition mechanism into the user’s static or dynamic page. Like other acquisition mechanisms, the ArchCollect front-end is scalable, since it can deal with massive network traffic by adopting scalable ArchCollect servers or scalable internal components. It is efficient, since it drastically reduces preprocessing by sharing this hard activity among all users, and since it performs no interpretation or completion of log files. It is reliable, since it eliminates browser and server cache problems. The project can collect layout, usage, and performance data, providing a general application focus, as Srivastava et al. proposed.
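
The parser step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the collector URL, the function names, and the choice of reporting clicks through an image request are all assumptions made only to show how a snippet could be inserted into a page so that the browser sends a second, separate request to a collection server.

```python
import re

# Hypothetical collector endpoint; the abstract does not specify any URL.
COLLECTOR_URL = "https://archcollect.example/collect"

# Matches a closing body tag in any letter case.
_BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def build_snippet(collector_url: str) -> str:
    """Return a script that reports clicks to the collector.

    The report travels in its own request, separate from the requests
    the browser sends to the monitored application.
    """
    return (
        "<script>"
        "document.addEventListener('click', function (e) {"
        "var img = new Image();"
        f"img.src = '{collector_url}?page=' + encodeURIComponent(location.pathname)"
        " + '&target=' + encodeURIComponent(e.target.tagName);"
        "});"
        "</script>"
    )


def instrument(html: str, collector_url: str = COLLECTOR_URL) -> str:
    """Insert the acquisition snippet just before the last closing body tag."""
    snippet = build_snippet(collector_url)
    if snippet in html:
        return html  # already instrumented, leave the page unchanged
    last = None
    for last in _BODY_CLOSE.finditer(html):
        pass  # keep only the final match
    if last is None:
        return html + snippet  # page fragment without a body tag
    return html[: last.start()] + snippet + html[last.start():]
```

Because the function only rewrites markup text, the same call could in principle be applied to a static file on disk or to the output of a dynamic page before it is sent to the browser.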

References

  1. Etzioni, O., 1999. The world wide web: Quagmire or gold mine. Communications of the ACM, 39(11):65-68.
  2. Shahabi, C., Banaei-Kashani, F., Faruque, J., 2001. A reliable, efficient, and scalable system for web usage data acquisition. WebKDD'01 Workshop, ACM SIGKDD 2001, San Francisco, CA.
  3. Spiliopoulou, M., 2000. Web usage mining for site evolution: Making a site better fit its users. Special Section of the Communications of ACM on “Personalization Technologies with Data Mining”, 43(8):127-134, August, 2000.
  4. Perkowitz, M., and Etzioni, O., 2000. Toward adaptive Web sites: conceptual framework and case study. Artificial Intelligence 118, pp. 245-275, 2000.
  5. Buchner, A.G., and Mulvenna, M.D., 1998. Discovering Internet Marketing Intelligence through Online Analytical Web Usage Mining. ACM SIGMOD Record, ISSN 0163-5808, Vol. 27, No. 4, pp. 54-61, 1998.
  6. Sarwar, B.M., Karypis, G., Konstan, J.A., and Riedl, J., 2000. Analysis of Recommender Algorithms for E-Commerce. ACM E-Commerce'00 Conference. October, 2000.
  7. Srivastava, J., Cooley, R., Deshpande, M., Tan, P., 2000. Web usage mining: Discovery and applications of usage patterns from web data, SIGKDD, January, 2000.
  8. Cooley, R., Tan, P., Srivastava, J., 2000. Discovery of Interesting Usage Patterns from Web Data. Advances in Web Usage Analysis and User Profiling, Lecture Notes in Computer Science, Vol. 1836, Springer-Verlag, 2000.
  9. Gomory, S., Hoch, R., Lee, J., Podlaseck, M., Schonberg, E., 1999. Ecommerce Intelligence: Measuring, Analyzing, and Reporting on Merchandising Effectiveness of Online Stores, IBM Watson Research Center.
  10. Wu, K., Yu, P.S., and Ballman, A., 1999. SpeedTracer: A web usage mining and analysis tool. IBM Systems Journal, 37(1), 1999.
  11. Zaiane, O.R., Xin, M., Han, J., 1999. Discovering Web Access Patterns and Trends by Applying OLAP and Data Mining Technology on Web Logs, Proc. of Advances in Digital Libraries Conference, 1999.
  12. Ackerman, M.D., et al., 1997. “Learning Probabilistic user profiles: Applications to finding interesting Web sites, notifying users of relevant changes to the Web pages, and locating grant opportunities”. AI Magazine 18(2):47-56, 1997.
  13. Lieberman, H., 1995. “Letizia: An agent that assists Web browsing”. Proceedings of the International Joint Conference on Artificial Intelligence, Montreal, August 1995.
  14. Lee, J., Lee, H.S., Wang, P., 2001. Design and Implementation of a Visual Online Product Catalog Interface. ICEIS (2) 2001: 1010-1017.
  15. Ghani, R., Fano, A., 2002. Towards Semantic Data Mining: Creating and Using a Knowledge Base of Product Semantics, KDD 2002, Edmonton, Canada.
  16. Menascé, D.A., and Almeida, V.A.F., 1998. Capacity Planning for Web Performance: Metrics, Models & Methods. Prentice Hall PTR, 1998.
  17. Lima, J.C., Carneiro, T.G.S., Pagliares, R.M., et al., 2003. ArchCollect: A set of Components directed towards web users' interaction, ICEIS 2003 Conference, Angers, France.
  18. Lima, J.C., Carneiro, T.G.S., Esmin, A.A.A., 2003. ArchCollect: concepts of an architecture that offers services for analyzing and understanding Web users' interactions. Conexão Ciência Magazine, Year I, n° 01, pp. 39-46, Formiga, Brazil, 2003.
  19. Kimball, R., 2002. Data Webhouse Toolkit. Editora Campus.


Paper Citation


in Harvard Style

de Castro Lima J., Ali Abdalla Esmin A., Geraldo de Moura J., Ferreira B. and Garcia de Senna Carneiro T. (2004). ARCHCOLLECT FRONT-END - A Web usage data mining knowledge acquisition mechanism focused on static or dynamic contenting applications. In Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 4: ICEIS, ISBN 972-8865-00-7, pages 258-262. DOI: 10.5220/0002638402580262


in Bibtex Style

@conference{iceis04,
author={Joubert de Castro Lima and Ahmed Ali Abdalla Esmin and Juvêncio Geraldo de Moura and Bruno Ferreira and Tiago Garcia de Senna Carneiro},
title={ARCHCOLLECT FRONT-END - A Web usage data mining knowledge acquisition mechanism focused on static or dynamic contenting applications},
booktitle={Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 4: ICEIS},
year={2004},
pages={258-262},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002638402580262},
isbn={972-8865-00-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 4: ICEIS
TI - ARCHCOLLECT FRONT-END - A Web usage data mining knowledge acquisition mechanism focused on static or dynamic contenting applications
SN - 972-8865-00-7
AU - de Castro Lima J.
AU - Ali Abdalla Esmin A.
AU - Geraldo de Moura J.
AU - Ferreira B.
AU - Garcia de Senna Carneiro T.
PY - 2004
SP - 258
EP - 262
DO - 10.5220/0002638402580262