Authors: Salvatore Romeo 1 ; Andrea Tagarelli 1 ; Francesco Gullo 2 and Sergio Greco 1

Affiliations: 1 University of Calabria, Italy ; 2 Yahoo! Research, Spain

ISBN: 978-989-8565-41-9

Keyword(s): Document Clustering, Itemset Mining, Tensor Modeling and Decomposition.

Related Ontology Subjects/Areas/Topics: Applications ; Clustering ; Data Engineering ; Information Retrieval ; Ontologies and the Semantic Web ; Pattern Recognition ; Software Engineering ; Theory and Methods

Abstract: We propose a novel approach to the problem of document clustering when multiple organizations of the input documents are available. Besides considering the text-based content of the documents, our approach exploits frequent associations of the documents in the groups across the existing classifications, in order to capture how documents tend to be grouped together orthogonally to different views. A third-order tensor for the document collection is built over both the space of terms and the space of the discovered frequent document-associations, and is then decomposed to establish a unique encompassing clustering of documents. Preliminary experiments conducted on a document clustering benchmark have shown the potential of the approach to capture the multi-view structure of existing organizations for a given collection of documents.
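
The pipeline outlined in the abstract can be illustrated with a small sketch. The abstract does not give the tensor definition, the itemset-mining algorithm, or the decomposition method, so everything below is an assumption made for illustration: the function names are hypothetical, frequent document-associations are restricted to fixed-size document sets counted by brute force, each tensor slice copies the term vector of the documents in one frequent set, and the decomposition is replaced by a plain SVD of the document-mode unfolding.

```python
from itertools import combinations

import numpy as np


def frequent_doc_sets(clusterings, min_support, size=2):
    """Treat each cluster of each clustering as a transaction of document ids
    and return the document sets of the given size that occur in at least
    min_support clusters (brute-force counting, for illustration only)."""
    counts = {}
    for clustering in clusterings:
        for cluster in clustering:
            for combo in combinations(sorted(cluster), size):
                counts[combo] = counts.get(combo, 0) + 1
    return sorted(s for s, c in counts.items() if c >= min_support)


def build_tensor(doc_term, doc_sets):
    """Assumed layout: X[d, t, s] = doc_term[d, t] if document d belongs to
    frequent document set s, and 0 otherwise."""
    n_docs, n_terms = doc_term.shape
    X = np.zeros((n_docs, n_terms, len(doc_sets)))
    for s, members in enumerate(doc_sets):
        for d in members:
            X[d, :, s] = doc_term[d]
    return X


def document_factors(X, rank):
    """Stand-in for the decomposition step: leading left singular vectors of
    the document-mode unfolding (documents x (terms * doc sets))."""
    unfolded = X.reshape(X.shape[0], -1)
    U, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    return U[:, :rank]


# Toy input: 4 documents, 3 terms, two existing organizations of the documents.
doc_term = np.array([[1.0, 0.0, 2.0],
                     [0.0, 1.0, 1.0],
                     [3.0, 0.0, 0.0],
                     [0.0, 2.0, 1.0]])
clusterings = [[{0, 1}, {2, 3}], [{0, 1, 2}, {3}]]

doc_sets = frequent_doc_sets(clusterings, min_support=2)
X = build_tensor(doc_term, doc_sets)
factors = document_factors(X, rank=2)
```

The rows of `factors` give one low-dimensional vector per document, which could then be passed to any standard clustering routine to obtain a single clustering; that final step is omitted here because the abstract does not say how it is done.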

CC BY-NC-ND 4.0

Paper citation in several formats:
Romeo, S.; Tagarelli, A.; Gullo, F. and Greco, S. (2013). A Tensor-based Clustering Approach for Multiple Document Classifications. In Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-8565-41-9, pages 200-205. DOI: 10.5220/0004269102000205

@conference{icpram13,
author={Salvatore Romeo and Andrea Tagarelli and Francesco Gullo and Sergio Greco},
title={A Tensor-based Clustering Approach for Multiple Document Classifications},
booktitle={Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2013},
pages={200-205},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004269102000205},
isbn={978-989-8565-41-9},
}

TY - CONF

JO - Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - A Tensor-based Clustering Approach for Multiple Document Classifications
SN - 978-989-8565-41-9
AU - Romeo, S.
AU - Tagarelli, A.
AU - Gullo, F.
AU - Greco, S.
PY - 2013
SP - 200
EP - 205
DO - 10.5220/0004269102000205
