The Stor-e-Motion Visualization for Topic Evolution Tracking in Text Data Streams

Andreas Weiler, Michael Grossniklaus, Marc H. Scholl

2015

Abstract

Nowadays, many sources continuously generate massive text data streams. For example, the increasing popularity and active use of social networks result in voluminous and fast-flowing text data streams containing a large amount of user-generated data about almost any topic around the world. However, observing and tracking the ongoing evolution of topics in these unevenly distributed text data streams is a challenging task for analysts, news reporters, and other users. This paper presents “Stor-e-Motion”, a shape-based visualization to track the ongoing evolution of a topic’s frequency (i.e., importance), sentiment (i.e., emotion), and context (i.e., story) in user-defined topic channels over continuously flowing text data streams. The visualization helps the user keep an overview of vast amounts of streaming data and guides the user’s attention to unexpected and interesting points or periods in the text data stream. In this work, we mainly focus on the visualization of text streams from the social microblogging service Twitter, for which we present a series of case studies (e.g., the observation of cities, movies, or natural disasters) applied to real-world data streams collected from the public timeline. However, to further evaluate our visualization, we also present a baseline case study applied to the text stream of a fantasy book series.

Paper Citation


in Harvard Style

Weiler A., Grossniklaus M. and Scholl M. (2015). The Stor-e-Motion Visualization for Topic Evolution Tracking in Text Data Streams. In Proceedings of the 6th International Conference on Information Visualization Theory and Applications - Volume 1: IVAPP, (VISIGRAPP 2015) ISBN 978-989-758-088-8, pages 29-39. DOI: 10.5220/0005292900290039


in Bibtex Style

@conference{ivapp15,
author={Andreas Weiler and Michael Grossniklaus and Marc H. Scholl},
title={The Stor-e-Motion Visualization for Topic Evolution Tracking in Text Data Streams},
booktitle={Proceedings of the 6th International Conference on Information Visualization Theory and Applications - Volume 1: IVAPP, (VISIGRAPP 2015)},
year={2015},
pages={29-39},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005292900290039},
isbn={978-989-758-088-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 6th International Conference on Information Visualization Theory and Applications - Volume 1: IVAPP, (VISIGRAPP 2015)
TI - The Stor-e-Motion Visualization for Topic Evolution Tracking in Text Data Streams
SN - 978-989-758-088-8
AU - Weiler A.
AU - Grossniklaus M.
AU - Scholl M.
PY - 2015
SP - 29
EP - 39
DO - 10.5220/0005292900290039