Controlling the Drift of Semantic Indexing Systems

Ivan Garrido Marquez, Jorge Garcia Flores, François Lévy, Adeline Nazarenko

2018

Abstract

Document classification is often meant to serve as semantic indexing that helps readers find documents related to a given topic. However, the quality of indexing often deteriorates over time: some categories are misused or forgotten by indexers, while others become obsolete or too general to be useful. This paper proposes measures to assess the quality of an indexing system and an algorithm that guides indexers in restructuring their indexes. The focus is on the reader’s rather than the annotator’s point of view ("Does the classification really help in accessing information?" rather than "Is a category adequate for the content of the document?"). The whole approach is illustrated on a corpus of 20 blogs whose posts are associated with categories. We show that indexers have difficulty adapting the blogs’ indexing systems as the number of posts increases, and we show, by simulating blog restructuring, that our approach can significantly improve the quality of these indexing systems.


Paper Citation


in Harvard Style

Marquez I., Flores J., Lévy F. and Nazarenko A. (2018). Controlling the Drift of Semantic Indexing Systems. In Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2018) - Volume 2: KEOD; ISBN 978-989-758-330-8, SciTePress, pages 199-206. DOI: 10.5220/0006926501990206


in Bibtex Style

@conference{keod18,
author={Ivan Garrido Marquez and Jorge Garcia Flores and François Lévy and Adeline Nazarenko},
title={Controlling the Drift of Semantic Indexing Systems},
booktitle={Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2018) - Volume 2: KEOD},
year={2018},
pages={199-206},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006926501990206},
isbn={978-989-758-330-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2018) - Volume 2: KEOD
TI - Controlling the Drift of Semantic Indexing Systems
SN - 978-989-758-330-8
AU - Marquez I.
AU - Flores J.
AU - Lévy F.
AU - Nazarenko A.
PY - 2018
SP - 199
EP - 206
DO - 10.5220/0006926501990206
PB - SciTePress