Automatic Word Sense Mapping from Princeton WordNet to Latvian WordNet

Laine Strankale, Madara Stāde

2022

Abstract

Latvian WordNet is a resource in which word senses are connected by their semantic relationships. The manual construction of a high-quality core Latvian WordNet is currently underway. However, text processing tasks require broad coverage; this work therefore aims to extend the wordnet by automatically linking additional word senses in the Latvian online dictionary Tēzaurs.lv and aligning them to the English-language Princeton WordNet (PWN). Our method needs only translation data, sense definitions and usage examples, which it compares to PWN using pretrained word embeddings and sBERT. As a result, 57 927 interlanguage links were found that can potentially be added to Latvian WordNet, with an accuracy of 80% for nouns, 56% for verbs, 67% for adjectives and 66% for adverbs.
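
The abstract does not spell out how embedding-based comparison selects a link, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each Latvian sense and each candidate PWN synset has already been turned into a vector (for example from its definition and usage examples), and it picks the candidate with the highest cosine similarity. The function names, the `threshold` value, the candidate identifiers and the toy 3-dimensional vectors are all hypothetical and chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_link(sense_vec, candidates, threshold=0.5):
    """Return (candidate_id, score) for the most similar candidate.

    `candidates` maps a hypothetical synset identifier to its vector.
    Returns None when there are no candidates or the best score is
    below `threshold` (an assumed cut-off, not a value from the paper).
    """
    scored = [(cid, cosine(sense_vec, vec)) for cid, vec in candidates.items()]
    if not scored:
        return None
    cid, score = max(scored, key=lambda pair: pair[1])
    return (cid, score) if score >= threshold else None


# Toy vectors standing in for real embeddings of definitions and examples.
latvian_sense = [0.9, 0.1, 0.0]
pwn_candidates = {
    "candidate_a": [0.8, 0.2, 0.1],
    "candidate_b": [0.0, 0.1, 0.9],
}

result = best_link(latvian_sense, pwn_candidates)
print(result)
```

In a real pipeline the vectors would come from a pretrained embedding model, and the candidate set would plausibly be restricted by the translation data mentioned in the abstract; both steps are outside this sketch.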


Paper Citation


in Harvard Style

Strankale L. and Stāde M. (2022). Automatic Word Sense Mapping from Princeton WordNet to Latvian WordNet. In Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI, ISBN 978-989-758-547-0, pages 478-485. DOI: 10.5220/0011006000003116


in Bibtex Style

@conference{nlpinai22,
author={Laine Strankale and Madara Stāde},
title={Automatic Word Sense Mapping from Princeton WordNet to Latvian WordNet},
booktitle={Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI},
year={2022},
pages={478--485},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011006000003116},
isbn={978-989-758-547-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 14th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI
TI - Automatic Word Sense Mapping from Princeton WordNet to Latvian WordNet
SN - 978-989-758-547-0
AU - Strankale L.
AU - Stāde M.
PY - 2022
SP - 478
EP - 485
DO - 10.5220/0011006000003116