Paper

Authors: Alexis Gabadinho ; Gilbert Ritschard ; Matthias Studer and Nicolas S. Müller

Affiliation: University of Geneva, Switzerland

ISBN: 978-989-674-011-5

Keyword(s): Categorical sequence data, Representativeness, Dissimilarity, Discrepancy of sequences, Summarizing sets of sequences, Visualization.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; BioInformatics & Pattern Discovery ; Clustering and Classification Methods ; Data Reduction and Quality Assessment ; Information Extraction ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Mining High-Dimensional Data ; Symbolic Systems ; Visual Data Mining and Data Visualization

Abstract: This paper is concerned with the summarization of a set of categorical sequence data. More specifically, the problem studied is the determination of the smallest possible number of representative sequences that ensure a given coverage of the whole set, i.e. that together have a given percentage of sequences in their neighborhood. The goal is to yield a representative set that exhibits the key features of the whole sequence data set and permits easy, sound interpretation. We propose a heuristic for determining the representative set that first builds a list of candidates using a representativeness score and then eliminates redundancy. We also propose a visualization tool for rendering the results and quality measures for evaluating them. The proposed tools have been implemented in TraMineR, our R package for mining and visualizing sequence data, and we demonstrate their efficiency on a real-world example from the social sciences. The methods are nonetheless in no way limited to social science data and should prove useful in many other domains.
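
The two-step heuristic described in the abstract (rank candidates by a representativeness score, then eliminate redundancy until the requested coverage is reached) can be illustrated with a minimal Python sketch. This is not the TraMineR implementation: the Hamming distance, the neighborhood-density score, and the names `hamming`, `select_representatives`, `radius` and `coverage` are assumptions chosen for illustration only.

```python
from typing import Callable, List, Sequence, Tuple


def hamming(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def select_representatives(
    seqs: List[Sequence[str]],
    radius: float,
    coverage: float = 0.25,
    dist: Callable[[Sequence[str], Sequence[str]], float] = hamming,
) -> Tuple[List[int], float]:
    """Greedy sketch: return indices of representatives and achieved coverage.

    Assumed score: neighborhood density, i.e. the number of sequences
    lying within `radius` of a candidate (the candidate itself included).
    """
    n = len(seqs)
    # Pairwise dissimilarities between all sequences.
    d = [[dist(a, b) for b in seqs] for a in seqs]

    # Step 1: score every sequence and sort candidates by decreasing score
    # (ties broken by original position).
    density = [sum(1 for j in range(n) if d[i][j] <= radius) for i in range(n)]
    candidates = sorted(range(n), key=lambda i: (-density[i], i))

    # Step 2: walk down the candidate list, skipping redundant candidates,
    # until the requested share of sequences is covered.
    chosen: List[int] = []
    covered = set()
    for i in candidates:
        if len(covered) >= coverage * n:
            break
        # A candidate is redundant if it lies in the neighborhood
        # of a representative that was already retained.
        if any(d[i][r] <= radius for r in chosen):
            continue
        chosen.append(i)
        covered.update(j for j in range(n) if d[i][j] <= radius)

    return chosen, (len(covered) / n if n else 0.0)


if __name__ == "__main__":
    toy = ["AAAA", "AAAB", "AABA", "BBBB", "BBBA", "CCCC"]
    print(select_representatives(toy, radius=1, coverage=0.8))
```

On the toy data, with a radius of 1 and a target coverage of 80%, the sketch retains the sequences at positions 0 and 3 ("AAAA" and "BBBB"), which together have five of the six sequences in their neighborhoods. Any other dissimilarity measure can be passed through the `dist` argument.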

Paper citation in several formats:
Gabadinho A., Ritschard G., Studer M. and Müller N. (2009). SUMMARIZING SETS OF CATEGORICAL SEQUENCES - Selecting and Visualizing Representative Sequences. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009), ISBN 978-989-674-011-5, pages 62-69. DOI: 10.5220/0002300400620069

@conference{kdir09,
author={Alexis Gabadinho and Gilbert Ritschard and Matthias Studer and Nicolas S. Müller},
title={SUMMARIZING SETS OF CATEGORICAL SEQUENCES - Selecting and Visualizing Representative Sequences},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)},
year={2009},
pages={62-69},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002300400620069},
isbn={978-989-674-011-5},
}

TY - CONF

JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)
TI - SUMMARIZING SETS OF CATEGORICAL SEQUENCES - Selecting and Visualizing Representative Sequences
SN - 978-989-674-011-5
AU - Gabadinho A.
AU - Ritschard G.
AU - Studer M.
AU - Müller N.
PY - 2009
SP - 62
EP - 69
DO - 10.5220/0002300400620069
