Authors: Alexandra Cernian; Liliana Dobrica; Dorin Carstoiu; Valentin Sgarciu

Affiliation: University Politehnica of Bucharest, Romania

ISBN: 978-989-8425-22-5

Keyword(s): Organizing Web Search Results, Clustering by Compression, Normalized Compression Distance (NCD), Clustering Methods.

Related Ontology Subjects/Areas/Topics: Business Analytics; Data and Information Retrieval; Data Engineering; Data Management and Quality; Data Semantics; Information Quality; Media Search and Retrieval

Abstract: Current Web search engines return long lists of ranked documents that users are forced to sift through to find relevant documents. This paper introduces a new approach for clustering Web search results, based on the notion of clustering by compression. Compression algorithms allow defining a similarity measure based on the degree of common information. Classification methods allow clustering similar data without any previous knowledge. The clustering by compression procedure is based on a parameter-free, universal, similarity distance, the normalized compression distance or NCD, computed from the lengths of compressed data files. Our goal is to apply the clustering by compression algorithm in order to cluster the documents returned by a Web search engine in response to a user query.
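As a rough illustration of how a distance can be "computed from the lengths of compressed data files", the sketch below uses the standard NCD definition, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed length. The choice of zlib as the compressor and the helper names `compressed_len`, `ncd`, and `ncd_matrix` are assumptions made for this example; the abstract does not say which compressor or implementation the authors used.

```python
import zlib


def compressed_len(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compressor is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx = compressed_len(x)
    cy = compressed_len(y)
    cxy = compressed_len(x + y)  # compressed length of the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


def ncd_matrix(docs: list[bytes]) -> list[list[float]]:
    """Pairwise NCD values; the diagonal is set to 0.0 by convention."""
    n = len(docs)
    return [
        [0.0 if i == j else ncd(docs[i], docs[j]) for j in range(n)]
        for i in range(n)
    ]
```

A matrix of this kind could then be passed to any distance-based clustering method (for example, hierarchical clustering) to group the documents returned for a query. Values near 0 suggest that two documents share most of their information, and values near 1 suggest they share little.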

CC BY-NC-ND 4.0

Paper citation in several formats:
Cernian A.; Dobrica L.; Carstoiu D. and Sgarciu V. (2010). ON USING THE NORMALIZED COMPRESSION DISTANCE TO CLUSTER WEB SEARCH RESULTS. In Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: ICSOFT, ISBN 978-989-8425-22-5, pages 293-298. DOI: 10.5220/0002926102930298

@conference{icsoft10,
author={Alexandra Cernian and Liliana Dobrica and Dorin Carstoiu and Valentin Sgarciu},
title={ON USING THE NORMALIZED COMPRESSION DISTANCE TO CLUSTER WEB SEARCH RESULTS},
booktitle={Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: ICSOFT},
year={2010},
pages={293-298},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002926102930298},
isbn={978-989-8425-22-5},
}

TY - CONF
JO - Proceedings of the 5th International Conference on Software and Data Technologies - Volume 1: ICSOFT
TI - ON USING THE NORMALIZED COMPRESSION DISTANCE TO CLUSTER WEB SEARCH RESULTS
SN - 978-989-8425-22-5
AU - Cernian, A.
AU - Dobrica, L.
AU - Carstoiu, D.
AU - Sgarciu, V.
PY - 2010
SP - 293
EP - 298
DO - 10.5220/0002926102930298
