Overcoming the Curse of Dimensionality When Clustering Multivariate Volume Data

Vladimir Molchanov, Lars Linsen

2018

Abstract

Visual analytics of multidimensional data suffers from the curse of dimensionality, i.e., even large numbers of data points are sparsely scattered in a high-dimensional space. The curse of dimensionality prevents the proper use of clustering algorithms in the high-dimensional space. Projecting the data to a lower-dimensional space before clustering causes a loss of information and may mix separated clusters. We present an approach that overcomes the curse of dimensionality for a particular type of multidimensional data, namely attribute spaces of multivariate volume data. For multivariate volume data, it is possible to interpolate between the data points in the high-dimensional attribute space based on their spatial relationship in the volumetric domain (or physical space). We apply this idea to a histogram-based clustering algorithm. We create a uniform partition of the attribute space into multidimensional bins and compute a histogram indicating the number of data samples belonging to each bin. Only non-empty bins are stored for efficiency. Without interpolation, the analysis is highly sensitive to the bin size, yielding inaccurate clustering for improper choices: large bins result in no cluster separation, while clusters fall apart for small bins. Using tri-linear interpolation in physical space, we can refine the data by generating additional samples. The refinement scheme can adapt to the data point distribution in attribute space and to the histogram's bin size. As a consequence, we can compute a density in which clusters stay connected even when using very small bin sizes. We exploit this result to create a robust hierarchical cluster tree. It can be visually explored using coordinated views of physical-space visualizations and parallel coordinates plots. We apply our technique to several datasets and compare the results against results obtained without interpolation.
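The core idea of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `sparse_histogram` and `trilinear_refine`, the NumPy-based layout of the volume as an array of shape (nx, ny, nz, d), and the uniform refinement factor are all assumptions made for illustration. In particular, the paper describes a refinement that adapts to the data distribution and the bin size, whereas this sketch simply subdivides every cell by the same factor.

```python
from collections import Counter

import numpy as np


def sparse_histogram(attrs, bin_size):
    """Count samples per multidimensional bin, storing only non-empty bins.

    attrs: array of shape (n_samples, d) with attribute-space coordinates.
    Returns a Counter mapping integer bin indices (tuples) to sample counts.
    """
    keys = np.floor(np.asarray(attrs, dtype=float) / bin_size).astype(np.int64)
    return Counter(map(tuple, keys))


def trilinear_refine(volume, factor):
    """Resample a (nx, ny, nz, d) attribute volume by tri-linear interpolation.

    Each cell edge is split into `factor` sub-steps (uniform refinement, an
    assumption of this sketch). Tri-linear interpolation is separable, so it
    is applied as one linear interpolation per spatial axis.
    Every spatial axis must contain at least two samples.
    """
    out = np.asarray(volume, dtype=float)
    for axis in range(3):
        n = out.shape[axis]
        pos = np.linspace(0, n - 1, (n - 1) * factor + 1)  # refined grid positions
        lo = np.minimum(np.floor(pos).astype(int), n - 2)  # lower neighbor index
        shape = [1] * out.ndim
        shape[axis] = -1
        t = (pos - lo).reshape(shape)                      # interpolation weight
        a = np.take(out, lo, axis=axis)
        b = np.take(out, lo + 1, axis=axis)
        out = (1 - t) * a + t * b
    return out


# Toy volume: 2 x 2 x 2 grid points with 2 attributes that jump from
# (0, 0) at x = 0 to (1, 1) at x = 1.
volume = np.zeros((2, 2, 2, 2))
volume[1] = 1.0

bin_size = 0.25

# Without interpolation the samples occupy two isolated bins.
coarse = sparse_histogram(volume.reshape(-1, 2), bin_size)

# With refinement the interpolated samples fill the bins in between.
refined = trilinear_refine(volume, factor=4)
fine = sparse_histogram(refined.reshape(-1, 2), bin_size)

print(sorted(coarse))  # occupied bins before refinement
print(sorted(fine))    # occupied bins after refinement
```

In this toy case the original samples fall into two bins that are far apart, so a histogram-based clustering with small bins would see two disconnected pieces. After refinement, the interpolated samples occupy a chain of neighboring bins between them, which is the effect the abstract describes as clusters staying connected for small bin sizes.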

Paper Citation


in Harvard Style

Molchanov V. and Linsen L. (2018). Overcoming the Curse of Dimensionality When Clustering Multivariate Volume Data. In Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2018) - Volume 2: IVAPP; ISBN 978-989-758-289-9, SciTePress, pages 29-39. DOI: 10.5220/0006541900290039


in Bibtex Style

@conference{ivapp18,
author={Vladimir Molchanov and Lars Linsen},
title={Overcoming the Curse of Dimensionality When Clustering Multivariate Volume Data},
booktitle={Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2018) - Volume 2: IVAPP},
year={2018},
pages={29-39},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006541900290039},
isbn={978-989-758-289-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2018) - Volume 2: IVAPP
TI - Overcoming the Curse of Dimensionality When Clustering Multivariate Volume Data
SN - 978-989-758-289-9
AU - Molchanov V.
AU - Linsen L.
PY - 2018
SP - 29
EP - 39
DO - 10.5220/0006541900290039
PB - SciTePress