Authors:
Jens Emil Gydesen; Henrik Haxholm; Niels Sonnich Poulsen; Sebastian Wahl and Bo Thiesson
Affiliation:
Aalborg University, Denmark
Keyword(s):
Data Mining, Indexing, Approximate Search, Multidimensional Data, Images, Data Representation.
Related Ontology Subjects/Areas/Topics:
Applications; Clustering; Data Engineering; Information Retrieval; Ontologies and the Semantic Web; Pattern Recognition; Software Engineering; Theory and Methods
Abstract:
The increasing amount and size of data make indexing and searching more difficult. This is especially challenging
for multidimensional data such as images and videos. In this paper we introduce a new indexable
symbolic data representation that allows us to efficiently index and retrieve from a large amount of data that
may appear in multiple dimensions. We use an approximate lower bounding distance measure to compute the
distance between multidimensional arrays, which allows us to perform fast similarity searches. We present
two search methods, exact and approximate, which can quickly retrieve data using our representation. Our approach
is very general and works for many types of multidimensional data, including different types of image
representations. Even for millions of multidimensional arrays, the approximate search finds a result in a
few milliseconds and, in many cases, returns a result similar to the best match.
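
The abstract does not spell out the representation, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a SAX-style scheme: a 2D array is averaged over non-overlapping tiles, each tile mean is mapped to one of four symbols, and a distance computed on the symbols never exceeds the true Euclidean distance. The breakpoints, the tile size, and all function names (`to_symbols`, `lower_bound`, `exact_search`) are illustrative assumptions. The sketch uses a strict lower bound, whereas the abstract describes an approximate one.

```python
import math
from bisect import bisect_right

# Assumed breakpoints: quartiles of a standard normal, giving a 4-symbol alphabet.
# This presumes the data has been normalized to roughly zero mean, unit variance.
BREAKPOINTS = [-0.6745, 0.0, 0.6745]


def block_means(array, block):
    """Average non-overlapping block x block tiles of a 2D list.

    Assumes both dimensions are divisible by `block`.
    """
    rows, cols = len(array), len(array[0])
    means = []
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            tile = [array[i][j]
                    for i in range(r, r + block)
                    for j in range(c, c + block)]
            means.append(sum(tile) / len(tile))
    return means


def to_symbols(array, block):
    """Map each tile mean to a symbol index in 0..len(BREAKPOINTS)."""
    return [bisect_right(BREAKPOINTS, m) for m in block_means(array, block)]


def symbol_gap(a, b):
    """Smallest possible difference between values that map to symbols a and b."""
    if abs(a - b) <= 1:
        return 0.0  # equal or adjacent symbols can be arbitrarily close
    lo, hi = min(a, b), max(a, b)
    return BREAKPOINTS[hi - 1] - BREAKPOINTS[lo]


def lower_bound(word_a, word_b, block):
    """Symbol-level distance that never exceeds the Euclidean distance."""
    total = sum(symbol_gap(a, b) ** 2 for a, b in zip(word_a, word_b))
    return block * math.sqrt(total)  # each tile covers block * block cells


def euclidean(x, y):
    """True Euclidean distance between two equally sized 2D lists."""
    return math.sqrt(sum((p - q) ** 2
                         for row_x, row_y in zip(x, y)
                         for p, q in zip(row_x, row_y)))


def exact_search(query, database, block):
    """Linear scan that skips candidates whose lower bound cannot beat the best."""
    q_word = to_symbols(query, block)
    best_id, best_dist = None, math.inf
    for idx, candidate in enumerate(database):
        # A real index would store the symbol words instead of recomputing them.
        if lower_bound(q_word, to_symbols(candidate, block), block) >= best_dist:
            continue
        d = euclidean(query, candidate)
        if d < best_dist:
            best_id, best_dist = idx, d
    return best_id, best_dist
```

Because the symbol-level distance is a lower bound, skipping a candidate whose bound already exceeds the best distance found so far cannot discard the true nearest neighbour. An approximate search in this spirit would compare only candidates whose symbol words match or nearly match the query, trading that guarantee for speed.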