Lexical and Morpho-syntactic Features in Word Embeddings - A Case Study of Nouns in Swedish

Ali Basirat, Marc Tang

Abstract

We apply real-valued word vectors combined with two different types of classifiers (linear discriminant analysis and a feed-forward neural network) to examine whether basic nominal categories can be captured by simple word embedding models. We also provide a linguistic analysis of the errors generated by the classifiers. The target language is Swedish, in which we investigate three nominal distinctions: uter/neuter, common/proper, and count/mass. These represent, respectively, grammatical, semantic, and mixed types of nominal classification within languages. Our results show that word embeddings can capture typical grammatical and semantic features such as uter/neuter and common/proper nouns. Nevertheless, the model has difficulty identifying classes such as count/mass, which not only combine grammatical and semantic properties but are also subject to conversion and shift. Hence, we answer the call of the Special Session on Natural Language Processing in Artificial Intelligence by approaching the topic of interfaces between morphology, lexicon, semantics, and syntax via interdisciplinary methods that combine machine learning of language and general linguistics.


Paper Citation


in Harvard Style

Basirat A. and Tang M. (2018). Lexical and Morpho-syntactic Features in Word Embeddings - A Case Study of Nouns in Swedish. In Proceedings of the 10th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI, ISBN 978-989-758-275-2, pages 663-674. DOI: 10.5220/0006729606630674


in Bibtex Style

@conference{nlpinai18,
author={Ali Basirat and Marc Tang},
title={Lexical and Morpho-syntactic Features in Word Embeddings - A Case Study of Nouns in Swedish},
booktitle={Proceedings of the 10th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI},
year={2018},
pages={663-674},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006729606630674},
isbn={978-989-758-275-2},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 10th International Conference on Agents and Artificial Intelligence - Volume 1: NLPinAI
TI - Lexical and Morpho-syntactic Features in Word Embeddings - A Case Study of Nouns in Swedish
SN - 978-989-758-275-2
AU - Basirat A.
AU - Tang M.
PY - 2018
SP - 663
EP - 674
DO - 10.5220/0006729606630674