COMBINING INDEXING METHODS AND QUERY SIZES IN INFORMATION RETRIEVAL IN FRENCH

Désiré Kompaoré, Josiane Mothe, Ludovic Tanguy

2008

Abstract

This paper analyses three indexing methods applied to French test collections (CLEF, 2000 to 2005): lemmas, truncated terms and single words. The same search engine and the same configuration are used regardless of the indexing method, to avoid introducing variability into the analysis. On the French CLEF collections, indexing by lemmas performs best, ahead of single words and truncated terms. We also analyse the impact of combining indexing methods using the CombMNZ function. As CLEF topics are composed of different parts, we also examine the influence of these parts by comparing results when topic parts are considered individually and when they are combined. Finally, we combine both indexing methods and query parts. We show that mean average precision (MAP) can be improved by up to 8% compared to the best individual methods.
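The abstract names CombMNZ but does not spell it out. As it is usually defined, CombMNZ sums a document's normalized scores across the input runs and multiplies that sum by the number of runs that retrieved the document. The sketch below illustrates this under stated assumptions: the function names are hypothetical, min-max normalization is one common choice and not necessarily the one used in the paper, and a document counts as retrieved by a run whenever it appears in that run.

```python
from collections import defaultdict


def min_max_normalize(run):
    """Scale one run's scores to [0, 1]; a run with identical scores maps to 1.0."""
    if not run:
        return {}
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        return {doc: 1.0 for doc in run}
    return {doc: (score - lo) / (hi - lo) for doc, score in run.items()}


def comb_mnz(runs):
    """Fuse several runs (dicts mapping doc id -> score) with CombMNZ.

    Fused score = (sum of normalized scores) * (number of runs containing the doc).
    Assumption: a document counts as retrieved by a run if it appears in it,
    even when its normalized score in that run is 0.
    """
    score_sum = defaultdict(float)
    hit_count = defaultdict(int)
    for run in runs:
        for doc, score in min_max_normalize(run).items():
            score_sum[doc] += score
            hit_count[doc] += 1
    fused = {doc: score_sum[doc] * hit_count[doc] for doc in score_sum}
    # Sort by descending fused score, breaking ties by doc id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Made-up scores for one query under three hypothetical indexing methods.
lemma_run = {"d1": 12.0, "d2": 9.0, "d3": 6.0}
truncated_run = {"d2": 4.0, "d1": 3.0, "d4": 2.0}
word_run = {"d2": 8.0, "d5": 4.0}

for doc, score in comb_mnz([lemma_run, truncated_run, word_run]):
    print(doc, score)
```

In this toy example the document retrieved by all three runs is promoted over the one that tops a single run, which is the behaviour that motivates multiplying by the hit count.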



Paper Citation


in Harvard Style

Kompaoré D., Mothe J. and Tanguy L. (2008). COMBINING INDEXING METHODS AND QUERY SIZES IN INFORMATION RETRIEVAL IN FRENCH. In Proceedings of the Tenth International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-8111-37-1, pages 149-154. DOI: 10.5220/0001674401490154


in Bibtex Style

@conference{iceis08,
author={Désiré Kompaoré and Josiane Mothe and Ludovic Tanguy},
title={COMBINING INDEXING METHODS AND QUERY SIZES IN INFORMATION RETRIEVAL IN FRENCH},
booktitle={Proceedings of the Tenth International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2008},
pages={149-154},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001674401490154},
isbn={978-989-8111-37-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Tenth International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - COMBINING INDEXING METHODS AND QUERY SIZES IN INFORMATION RETRIEVAL IN FRENCH
SN - 978-989-8111-37-1
AU - Kompaoré D.
AU - Mothe J.
AU - Tanguy L.
PY - 2008
SP - 149
EP - 154
DO - 10.5220/0001674401490154