Authors: Nourelhouda Yahi 1 ; Hacene Belhadef 1 ; Mathieu Roche 2 and Amer Draa 1

Affiliations: 1 MISC Laboratory, Algeria ; 2 Cirad, France

ISBN: 978-989-758-246-2

Keyword(s): Text Mining, Feature Selection, Semantic Similarity, Quantum Inspired Genetic Algorithm.

Abstract: Matching heterogeneous text documents from different sources means matching data extracted from those documents, generally structured as vectors. Matching accuracy depends directly on what those vectors contain, which is why the best features must be selected. In this paper, we present a new approach for selecting the minimum set of features that represents the semantics of a set of text documents, using a quantum-inspired genetic algorithm. Among the different Vs that characterize big data, we focus on the ‘Variety’ criterion: we used three semantically similar sets drawn from different sources and retrieved the best features describing the semantics of the corpus. In the matching phase, our approach shows significant improvement over the classic ‘bag-of-words’ approach.
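To make the idea of vector-based matching concrete, here is a minimal sketch, not the authors' implementation. It assumes a naive whitespace tokenizer, raw term counts, and cosine similarity as the matching score; the abstract does not specify any of these. The function names `bag_of_words` and `cosine`, the two sample sentences, and the `selected` feature set are all hypothetical, and the feature set is hand-picked here rather than produced by a quantum-inspired genetic algorithm.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Term counts from a deliberately naive lowercase, whitespace tokenizer."""
    return Counter(text.lower().split())


def cosine(doc_a, doc_b, features=None):
    """Cosine similarity of two term-count vectors.

    If `features` is given, only those terms are kept as vector dimensions,
    which mimics matching on a selected feature subset.
    """
    if features is not None:
        doc_a = {t: c for t, c in doc_a.items() if t in features}
        doc_b = {t: c for t, c in doc_b.items() if t in features}
    dot = sum(c * doc_b.get(t, 0) for t, c in doc_a.items())
    norm_a = math.sqrt(sum(c * c for c in doc_a.values()))
    norm_b = math.sqrt(sum(c * c for c in doc_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_a * norm_b)


# Two hypothetical documents from different sources on the same topic.
a = bag_of_words("the farmers report a locust outbreak in the north")
b = bag_of_words("a locust outbreak was reported by the ministry")

# Hypothetical output of a feature selector (hand-picked for illustration).
selected = {"locust", "outbreak"}

full_score = cosine(a, b)                # all words count, including "the" and "a"
selected_score = cosine(a, b, selected)  # only the selected features count
```

In this toy case the score over the selected features is higher than the score over the full vocabulary, because source-specific wording no longer dilutes the shared terms. How the feature subset is searched for and scored is the part the paper addresses and is not shown here.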

CC BY-NC-ND 4.0

Paper citation in several formats:
Yahi, N.; Belhadef, H.; Roche, M. and Draa, A. (2017). Towards a Bio-inspired Approach to Match Heterogeneous Documents. In Proceedings of the 13th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-758-246-2, pages 276-283. DOI: 10.5220/0006294002760283

@conference{webist17,
author={Nourelhouda Yahi and Hacene Belhadef and Mathieu Roche and Amer Draa},
title={Towards a Bio-inspired Approach to Match Heterogeneous Documents},
booktitle={Proceedings of the 13th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2017},
pages={276-283},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006294002760283},
isbn={978-989-758-246-2},
}

TY - CONF
JO - Proceedings of the 13th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - Towards a Bio-inspired Approach to Match Heterogeneous Documents
SN - 978-989-758-246-2
AU - Yahi, N.
AU - Belhadef, H.
AU - Roche, M.
AU - Draa, A.
PY - 2017
SP - 276
EP - 283
DO - 10.5220/0006294002760283
