Unsupervised Word Sense Disambiguation based on Word Embedding and Collocation

Shangzhuang Han, Kiyoaki Shirai

Abstract

This paper proposes a novel unsupervised word sense disambiguation (WSD) method that exploits two features. The first is the contextual information of a target word. The similarity between the words in a context and each sense of the target word is measured using pre-trained word embeddings, and the sense most similar to the context is chosen. In addition, we introduce a procedure that excludes irrelevant context words from the similarity calculation. The second is a collocation, which is an idiomatic phrase including a target word. High-precision rules that determine a sense from a collocation are automatically acquired from a raw corpus. Finally, the two methods are integrated into our final WSD system. Experiments on the Senseval-3 English lexical sample task showed that the proposed method improved precision by 4.7 points over the baseline.
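The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy embeddings, the sense inventory, the single collocation rule, the `min_sim` threshold used to discard irrelevant context words, and the choice to apply collocation rules before the embedding-based step are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np

# Toy 3-dimensional "pre-trained" embeddings (invented values, for illustration only).
EMB = {
    "money":     np.array([0.90, 0.10, 0.00]),
    "deposit":   np.array([0.80, 0.20, 0.10]),
    "loan":      np.array([0.85, 0.15, 0.00]),
    "river":     np.array([0.00, 0.90, 0.20]),
    "water":     np.array([0.10, 0.80, 0.30]),
    "shore":     np.array([0.05, 0.85, 0.25]),
    "yesterday": np.array([0.10, 0.10, 0.90]),
}

# Hypothetical sense inventory: each sense is described by a few related words.
SENSES = {
    "bank%finance": ["money", "loan"],
    "bank%river":   ["river", "shore"],
}

# Hypothetical collocation rule: (previous word, target word) -> sense.
COLLOCATION_RULES = {("river", "bank"): "bank%river"}


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def mean_vector(words):
    """Average the embeddings of the given words."""
    return np.mean([EMB[w] for w in words], axis=0)


def disambiguate(tokens, target_index, min_sim=0.5):
    """Return a sense label for tokens[target_index], or None to abstain."""
    # Step 1 (assumed order): a matching collocation rule decides the sense.
    if target_index > 0:
        bigram = (tokens[target_index - 1], tokens[target_index])
        if bigram in COLLOCATION_RULES:
            return COLLOCATION_RULES[bigram]

    # Step 2: embedding-based similarity between the context and each sense.
    sense_vecs = {s: mean_vector(ws) for s, ws in SENSES.items()}
    context = [w for i, w in enumerate(tokens) if i != target_index and w in EMB]

    # Discard context words that are not similar to any sense (assumed filter).
    relevant = [
        w for w in context
        if max(cosine(EMB[w], v) for v in sense_vecs.values()) >= min_sim
    ]
    if not relevant:
        return None  # no usable evidence, so abstain

    ctx_vec = mean_vector(relevant)
    return max(sense_vecs, key=lambda s: cosine(ctx_vec, sense_vecs[s]))


print(disambiguate(["deposit", "money", "bank", "yesterday"], 2))
print(disambiguate(["river", "bank"], 1))
```

In this sketch the word "yesterday" falls below the threshold for both senses and is dropped before the context vector is averaged, which mirrors the idea of ignoring irrelevant context words. Returning `None` when no evidence remains is one way to favor precision over coverage.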

Paper Citation


in Harvard Style

Han S. and Shirai K. (2021). Unsupervised Word Sense Disambiguation based on Word Embedding and Collocation. In Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-484-8, pages 1218-1225. DOI: 10.5220/0010380112181225


in Bibtex Style

@conference{icaart21,
author={Shangzhuang Han and Kiyoaki Shirai},
title={Unsupervised Word Sense Disambiguation based on Word Embedding and Collocation},
booktitle={Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2021},
pages={1218-1225},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010380112181225},
isbn={978-989-758-484-8},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Unsupervised Word Sense Disambiguation based on Word Embedding and Collocation
SN - 978-989-758-484-8
AU - Han S.
AU - Shirai K.
PY - 2021
SP - 1218
EP - 1225
DO - 10.5220/0010380112181225