Data Aggregation and Distance Encoding for Interactive Large Multidimensional Data Visualization

Desislava Decheva, Lars Linsen

Abstract

Visualization of unlabeled multidimensional data is commonly performed using projections to a 2D visual space, which supports an investigative interactive analysis. However, static views obtained by a projection method like Principal Component Analysis (PCA) may not capture all data features well. Moreover, in the case of large data with many samples, the scatterplots suffer from overplotting, which hinders analysis. Clustering tools allow for aggregation of data to meaningful structures. Clustering methods like K-means, however, also suffer from drawbacks. We present a novel approach to visually encode aggregated data in projected views and to interactively explore the data. We make use of the benefits of PCA and K-means clustering, but overcome their main drawbacks. The sensitivity of K-means to outlier points is ameliorated, while the sensitivity of PCA to axis scaling is converted into a powerful flexibility, allowing the user to change the observation perspective by rescaling the original axes. Analysis of both clusters and outliers is facilitated. Properties of clusters are visually encoded in aggregated form using color and size or examined in detail via local scatterplots or local circular parallel coordinate plots. The granularity of the data aggregation process can be adjusted interactively. A star coordinate interaction widget allows for modifying the projection matrix. To convey how well the projection maintains neighborhoods, we use a distance encoding. We evaluate our tool using synthetic and real-world data sets and perform a user study to evaluate its effectiveness.
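
The interplay between axis rescaling, PCA, and K-means aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`pca_project`, `kmeans`), the `axis_scale` parameter, the plain Lloyd-style K-means, and the synthetic data are all assumptions made for illustration, using only NumPy. It shows that multiplying one original axis by a weight changes the PCA view, and that cluster sizes are the kind of aggregate that could be mapped to glyph size.

```python
import numpy as np

def pca_project(X, axis_scale=None, n_components=2):
    """Project X with PCA after an optional per-axis rescaling (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    if axis_scale is not None:
        # Assumed user-chosen weights, one per original axis.
        X = X * np.asarray(axis_scale, dtype=float)
    Xc = X - X.mean(axis=0)  # center the data
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def assign(X, centers):
    """Index of the nearest center for every sample."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd iteration; a stand-in, not the outlier-robust variant of the paper."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers

# Synthetic example: two Gaussian blobs in 4D (made-up data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 4)), rng.normal(5, 1, (200, 4))])

labels, centers = kmeans(X, k=2)
sizes = np.bincount(labels, minlength=2)  # aggregate that could drive glyph size

P_default = pca_project(X)                            # unweighted view
P_rescaled = pca_project(X, axis_scale=[10, 1, 1, 1])  # first axis emphasized
```

Comparing `P_default` with `P_rescaled` shows how emphasizing a single axis changes the observation perspective; in an interactive tool the weights would come from a widget rather than a hard-coded list.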

Paper Citation


in Harvard Style

Decheva D. and Linsen L. (2018). Data Aggregation and Distance Encoding for Interactive Large Multidimensional Data Visualization. In Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: IVAPP, ISBN 978-989-758-289-9, pages 225-235. DOI: 10.5220/0006602502250235


in Bibtex Style

@conference{ivapp18,
author={Desislava Decheva and Lars Linsen},
title={Data Aggregation and Distance Encoding for Interactive Large Multidimensional Data Visualization},
booktitle={Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: IVAPP},
year={2018},
pages={225-235},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006602502250235},
isbn={978-989-758-289-9},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 3: IVAPP
TI - Data Aggregation and Distance Encoding for Interactive Large Multidimensional Data Visualization
SN - 978-989-758-289-9
AU - Decheva D.
AU - Linsen L.
PY - 2018
SP - 225
EP - 235
DO - 10.5220/0006602502250235