Interpretation of Dimensionally-reduced Crime Data: A Study with Untrained Domain Experts

Dominik Jäckle, Florian Stoffel, Sebastian Mittelstädt, Daniel A. Keim, Harald Reiterer

Abstract

Dimensionality reduction (DR) techniques aim to reduce the number of dimensions under consideration while preserving as much information as possible. According to many visualization researchers, DR results lack interpretability, in particular for domain experts not familiar with machine learning or advanced statistics. Thus, interactive visual methods have been extensively researched for their ability to improve transparency and ease the interpretation of results. However, these methods have primarily been evaluated using case studies and interviews with experts trained in DR. In this paper, we describe a phenomenological analysis investigating whether researchers with no or only limited training in machine learning or advanced statistics can interpret the depiction of a data projection and what their incentives are during interaction. We therefore developed an interactive system for DR that unifies mixed data types as they appear in real-world data. Based on this system, we provided data analysts of a Law Enforcement Agency (LEA) with dimensionally-reduced crime data and let them explore it and work on domain-relevant analysis tasks without providing further conceptual information. The results of our study reveal that these untrained experts encounter few difficulties in interpreting the results and drawing conclusions, given a domain-relevant use case and their experience. We further discuss the results based on collected informal feedback and observations.


Paper Citation


in Harvard Style

Jäckle D., Stoffel F., Mittelstädt S., Keim D. and Reiterer H. (2017). Interpretation of Dimensionally-reduced Crime Data: A Study with Untrained Domain Experts. In Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: IVAPP, (VISIGRAPP 2017) ISBN 978-989-758-228-8, pages 164-175. DOI: 10.5220/0006265101640175


in Bibtex Style

@conference{ivapp17,
author={Dominik Jäckle and Florian Stoffel and Sebastian Mittelstädt and Daniel A. Keim and Harald Reiterer},
title={Interpretation of Dimensionally-reduced Crime Data: A Study with Untrained Domain Experts},
booktitle={Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: IVAPP, (VISIGRAPP 2017)},
year={2017},
pages={164-175},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006265101640175},
isbn={978-989-758-228-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: IVAPP, (VISIGRAPP 2017)
TI - Interpretation of Dimensionally-reduced Crime Data: A Study with Untrained Domain Experts
SN - 978-989-758-228-8
AU - Jäckle D.
AU - Stoffel F.
AU - Mittelstädt S.
AU - Keim D.
AU - Reiterer H.
PY - 2017
SP - 164
EP - 175
DO - 10.5220/0006265101640175