Authors: François Role and Mohamed Nadif

Affiliation: Université Paris Descartes, France

Keyword(s): Text mining, Word similarity measures, Pointwise mutual information.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Concept Mining ; Information Extraction ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Mining Text and Semi-Structured Data ; Symbolic Systems

Abstract: Statistical measures of word similarity are widely used in many areas of information retrieval and text mining. One popular measure based on word co-occurrence is Pointwise Mutual Information (PMI). Although widely used, PMI has a well-known tendency to give excessive relatedness scores to word pairs that involve low-frequency words. Many variants of it have therefore been proposed, which correct this bias empirically. In contrast to this empirical approach, we propose formulae and indicators that describe the behavior of these variants in a precise way so that researchers and practitioners can make a more informed decision as to which measure to use in different scenarios.
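
The low-frequency bias mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definition of PMI, PMI(x, y) = log2(p(x, y) / (p(x) p(y))), estimated from raw counts; it is not the formulae or indicators proposed in the paper. The function name `pmi` and all counts below are hypothetical values chosen for illustration.

```python
import math

def pmi(pair_count, x_count, y_count, total):
    """Standard PMI in bits, estimated from raw counts (illustrative helper)."""
    if pair_count == 0:
        return float("-inf")  # the pair was never observed together
    # p(x,y) / (p(x) * p(y)) simplifies to pair_count * total / (x_count * y_count)
    return math.log2(pair_count * total / (x_count * y_count))

N = 1_000_000  # hypothetical total number of observation windows

# Two words seen once each, and only together: a single observation.
rare = pmi(pair_count=1, x_count=1, y_count=1, total=N)

# Two words seen 1000 times each, co-occurring in half of those cases.
frequent = pmi(pair_count=500, x_count=1000, y_count=1000, total=N)

print(f"rare pair:     {rare:.2f} bits")
print(f"frequent pair: {frequent:.2f} bits")
```

Under these assumed counts the pair observed a single time scores higher than the pair supported by 500 co-occurrences, which is the bias that the PMI variants discussed in the paper try to correct.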

CC BY-NC-ND 4.0

Paper citation in several formats:
Role, F. and Nadif, M. (2011). HANDLING THE IMPACT OF LOW FREQUENCY EVENTS ON CO-OCCURRENCE BASED MEASURES OF WORD SIMILARITY - A Case Study of Pointwise Mutual Information. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2011) - KDIR; ISBN 978-989-8425-79-9; ISSN 2184-3228, SciTePress, pages 218-223. DOI: 10.5220/0003655102260231

@conference{kdir11,
author={Fran\c{c}ois Role and Mohamed Nadif},
title={HANDLING THE IMPACT OF LOW FREQUENCY EVENTS ON CO-OCCURRENCE BASED MEASURES OF WORD SIMILARITY - A Case Study of Pointwise Mutual Information},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2011) - KDIR},
year={2011},
pages={218-223},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003655102260231},
isbn={978-989-8425-79-9},
issn={2184-3228},
}

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2011) - KDIR
TI - HANDLING THE IMPACT OF LOW FREQUENCY EVENTS ON CO-OCCURRENCE BASED MEASURES OF WORD SIMILARITY - A Case Study of Pointwise Mutual Information
SN - 978-989-8425-79-9
IS - 2184-3228
AU - Role, F.
AU - Nadif, M.
PY - 2011
SP - 218
EP - 223
DO - 10.5220/0003655102260231
PB - SciTePress