Wiki-LDA: A Mixed-Method Approach for Effective Interest Mining on Twitter Data

Xiao Pu, Mohamed Amine Chatti, Hendrik Thüs, Ulrik Schroeder

Abstract

Learning analytics (LA) and educational data mining (EDM) have emerged in recent years as promising research areas in technology-enhanced learning (TEL). Both areas develop methods that harness educational data sets to support the learning process. A key application area for LA and EDM is learner modelling, which enables adaptive and personalized learning environments that take the heterogeneous needs of learners into account and provide each learner with a tailored learning experience. As learning increasingly happens in open and distributed environments beyond the classroom, and access to information in these environments is mostly interest-driven, learner interests are an important feature to be modelled. In this paper, we focus on the interest dimension of a learner model and present Wiki-LDA, a novel method to effectively mine users' interests on Twitter. We apply a mixed-method approach that combines Latent Dirichlet Allocation (LDA), text mining APIs, and Wikipedia categories. Wiki-LDA has proven effective at the task of interest mining and classification on Twitter data, outperforming standard LDA.


Paper Citation


in Harvard Style

Pu X., Chatti M., Thüs H. and Schroeder U. (2016). Wiki-LDA: A Mixed-Method Approach for Effective Interest Mining on Twitter Data. In Proceedings of the 8th International Conference on Computer Supported Education - Volume 1: CSEDU, ISBN 978-989-758-179-3, pages 426-433. DOI: 10.5220/0005861504260433


in Bibtex Style

@conference{csedu16,
author={Xiao Pu and Mohamed Amine Chatti and Hendrik Thüs and Ulrik Schroeder},
title={Wiki-LDA: A Mixed-Method Approach for Effective Interest Mining on Twitter Data},
booktitle={Proceedings of the 8th International Conference on Computer Supported Education - Volume 1: CSEDU},
year={2016},
pages={426-433},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005861504260433},
isbn={978-989-758-179-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Conference on Computer Supported Education - Volume 1: CSEDU
TI - Wiki-LDA: A Mixed-Method Approach for Effective Interest Mining on Twitter Data
SN - 978-989-758-179-3
AU - Pu X.
AU - Chatti M.
AU - Thüs H.
AU - Schroeder U.
PY - 2016
SP - 426
EP - 433
DO - 10.5220/0005861504260433