Comparing Summarisation Techniques for Informal Online Reviews

Mhairi McNeill, Robert Raeside, Martin Graham, Isaac Roseboom

2015

Abstract

In this paper we evaluate three methods for summarising game reviews written in a casual style. The aim was to build a review summarisation system for use by clients of deltaDNA. We look at one well-known method based on natural language processing, and describe two statistical methods that could be used for summarisation: one based on TF-IDF scores and another using supervised latent Dirichlet allocation. We find that, because of the informality of these online reviews, natural-language-based techniques work less well than they do on other types of reviews, and we recommend using techniques based on the statistical properties of word frequencies. In particular, we chose a TF-IDF-based approach for the final system.
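
As a rough illustration of what a TF-IDF score measures, the sketch below ranks the words of each text by term frequency times inverse document frequency. It is not the system described in the paper: the function names `tfidf_scores` and `top_terms`, the whitespace tokenisation, the particular weighting (relative term frequency times log(N / document frequency)), and the sample reviews are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_scores(documents):
    """Return one {term: score} dict per document.

    Assumed weighting (not taken from the paper):
    tf = count / document length, idf = log(N / document frequency).
    """
    # Naive tokenisation: lowercase and split on whitespace.
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenised:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenised:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(documents, k=3):
    """Return the k highest-scoring terms per document (ties broken alphabetically)."""
    return [
        [term for term, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for s in tfidf_scores(documents)
    ]


# Hypothetical reviews, invented for illustration only.
reviews = [
    "great game great graphics",
    "great game too many ads",
    "boring game",
]
summary_terms = top_terms(reviews, k=1)
```

Under this weighting, a word that appears in every text (here "game") scores zero, while words that are distinctive to one text rise to the top, which is the property that makes such scores usable for picking summary terms.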


Paper Citation


in Harvard Style

McNeill M., Raeside R., Graham M. and Roseboom I. (2015). Comparing Summarisation Techniques for Informal Online Reviews. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015) ISBN 978-989-758-158-8, pages 322-329. DOI: 10.5220/0005612203220329


in BibTeX Style

@conference{kdir15,
author={Mhairi McNeill and Robert Raeside and Martin Graham and Isaac Roseboom},
title={Comparing Summarisation Techniques for Informal Online Reviews},
booktitle={Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015)},
year={2015},
pages={322-329},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005612203220329},
isbn={978-989-758-158-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015)
TI - Comparing Summarisation Techniques for Informal Online Reviews
SN - 978-989-758-158-8
AU - McNeill M.
AU - Raeside R.
AU - Graham M.
AU - Roseboom I.
PY - 2015
SP - 322
EP - 329
DO - 10.5220/0005612203220329