Authors: Klev Diamanti 1 ; Andreas Kanavos 2 ; Christos Makris 2 and Thodoris Tokis 2

Affiliations: 1 Uppsala University, Sweden ; 2 University of Patras, Greece

ISBN: 978-989-758-024-6

Keyword(s): Searching and Browsing, Web Information Filtering and Retrieval, Text Mining, Indexing Structures, Inverted Files, n-gram Indexing, Sequence Analysis and Assembly, Weighted Sequences, Weighted Suffix Trees.

Related Ontology Subjects/Areas/Topics: Searching and Browsing ; Web Information Systems and Technologies ; Web Interfaces and Applications

Abstract: In this paper, we address the problem of handling weighted sequences by taking advantage of the inverted files machinery. We target text processing applications in which the documents cannot be separated into words (such as texts representing biological sequences) or in which word separation is difficult and requires extra linguistic knowledge (such as texts in Asian languages). Besides handling weighted sequences using n-grams, we also study the construction of space-efficient n-gram inverted indexes. The proposed techniques combine classic straightforward n-gram indexing with the recently proposed two-level n-gram inverted file technique. The outcome is a set of new data structures for n-gram indexing that consume less space than the existing ones. Our experimental results are encouraging and indicate that these techniques can handle n-gram indexes more space-efficiently than existing methods.
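
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of a classic one-level n-gram inverted index over unsegmented text. It is not the authors' data structure: it does not model weighted sequences or the two-level technique, and the function names `build_ngram_index` and `find` as well as the sample DNA-like strings are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def build_ngram_index(docs, n):
    """Map each n-gram to a posting list of (doc_id, offset) pairs.

    Illustrative baseline only; the paper's space-efficient structures
    are not reproduced here.
    """
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        # Slide a window of length n over the text, one character at a time.
        for offset in range(len(text) - n + 1):
            index[text[offset:offset + n]].append((doc_id, offset))
    return dict(index)


def find(index, docs, pattern, n):
    """Return the (doc_id, offset) occurrences of a pattern of length >= n.

    The first n-gram of the pattern selects candidate positions from the
    index; each candidate is then verified against the stored text.
    """
    if len(pattern) < n:
        raise ValueError("pattern must be at least n characters long")
    candidates = index.get(pattern[:n], [])
    return [(d, o) for d, o in candidates if docs[d].startswith(pattern, o)]


# Hypothetical sample data: two short unsegmented sequences.
docs = ["ACGTACGT", "TTACGA"]
index = build_ngram_index(docs, 3)
print(find(index, docs, "ACGT", 3))
```

Because every position of every document contributes one posting, the index grows with the total text length, which is the space cost that motivates more compact n-gram index designs.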


Paper citation in several formats:
Diamanti K., Kanavos A., Makris C. and Tokis T. (2014). Handling Weighted Sequences Employing Inverted Files and Suffix Trees. In Proceedings of the 10th International Conference on Web Information Systems and Technologies, ISBN 978-989-758-024-6, pages 231-238. DOI: 10.5220/0004788502310238

@conference{webist14,
author={Klev Diamanti and Andreas Kanavos and Christos Makris and Thodoris Tokis},
title={Handling Weighted Sequences Employing Inverted Files and Suffix Trees},
booktitle={Proceedings of the 10th International Conference on Web Information Systems and Technologies},
year={2014},
pages={231-238},
doi={10.5220/0004788502310238},
isbn={978-989-758-024-6},
}

TY - CONF

JO - Proceedings of the 10th International Conference on Web Information Systems and Technologies
TI - Handling Weighted Sequences Employing Inverted Files and Suffix Trees
SN - 978-989-758-024-6
AU - Diamanti K.
AU - Kanavos A.
AU - Makris C.
AU - Tokis T.
PY - 2014
SP - 231
EP - 238
DO - 10.5220/0004788502310238
