Improved Identification of Data Correlations through Correlation Coordinate Plots

Hoa Nguyen, Paul Rosen

Abstract

Correlation is a powerful relationship measure used in science, engineering, and business to estimate trends and make forecasts. Visualization methods such as scatterplots and parallel coordinates are designed to be general, supporting many visualization tasks, including identifying correlation. Because of this generality, however, they do not provide the most efficient interface in terms of speed and accuracy, which can be problematic when a task must be repeated frequently. To address this shortcoming, we propose a new correlation task-specific visualization method called Correlation Coordinate Plots (CCPs). CCPs transform data into a powerful coordinate system for estimating the direction and strength of correlation. To support multiple attributes, we propose two additional interfaces. The first is the Snowflake Visualization, a focus+context layout for exploring all pairwise correlations. The second enhances the basic CCP by using principal component analysis to project multiple attributes. We validate CCP performance on correlation-specific tasks through an extensive user study that shows improvement in both accuracy and speed.


Paper Citation


in Harvard Style

Nguyen H. and Rosen P. (2016). Improved Identification of Data Correlations through Correlation Coordinate Plots. In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: IVAPP, (VISIGRAPP 2016), ISBN 978-989-758-175-5, pages 60-71. DOI: 10.5220/0005717500600071


in Bibtex Style

@conference{ivapp16,
author={Hoa Nguyen and Paul Rosen},
title={Improved Identification of Data Correlations through Correlation Coordinate Plots},
booktitle={Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: IVAPP, (VISIGRAPP 2016)},
year={2016},
pages={60-71},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005717500600071},
isbn={978-989-758-175-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: IVAPP, (VISIGRAPP 2016)
TI - Improved Identification of Data Correlations through Correlation Coordinate Plots
SN - 978-989-758-175-5
AU - Nguyen H.
AU - Rosen P.
PY - 2016
SP - 60
EP - 71
DO - 10.5220/0005717500600071