Author: Reinhard Rapp

Affiliation: University of Tarragona, Spain

Abstract: A method for the automatic extraction of semantically similar words is presented which is based on the analysis of word distribution in large monolingual text corpora. It involves compiling matrices of word co-occurrences and reducing the dimensionality of the semantic space by conducting a singular value decomposition. In this way, problems of data sparseness are reduced and a generalization effect is achieved which considerably improves the results. The method is largely language-independent and has been applied to corpora of English, German, and Russian, with the resulting thesauri being freely available. For the English thesaurus, an evaluation has been conducted by comparing it to experimental results obtained from test persons who were asked to give judgements of word similarities. According to this evaluation, the machine-generated results come close to native speakers' performance.
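The pipeline outlined in the abstract (co-occurrence counts, singular value decomposition, similarity lookup) can be illustrated with a minimal sketch. It is not the paper's implementation: the toy corpus, the window size of 2, the raw unweighted counts, the choice of 3 retained dimensions, the cosine similarity measure, and all function names are assumptions made for illustration only.

```python
import numpy as np

def cooccurrence_matrix(sentences, window=2):
    """Count how often each word pair occurs within `window` tokens (assumed setting)."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        for i, w in enumerate(s):
            lo, hi = max(0, i - window), min(len(s), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[index[w], index[s[j]]] += 1
    return vocab, counts

def reduce_dimensions(counts, k):
    """Keep the k largest singular dimensions of the co-occurrence matrix."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k] * s[:k]

def most_similar(word, vocab, vectors, n=3):
    """Rank all other words by cosine similarity to `word` in the reduced space."""
    norms = np.linalg.norm(vectors, axis=1)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    unit = vectors / norms[:, None]
    sims = unit @ unit[vocab.index(word)]
    order = np.argsort(-sims)
    return [(vocab[i], float(sims[i])) for i in order if vocab[i] != word][:n]

# Hypothetical toy corpus; a real thesaurus needs a large monolingual corpus.
corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the mouse".split(),
    "the dog chases the cat".split(),
]

vocab, counts = cooccurrence_matrix(corpus, window=2)
vectors = reduce_dimensions(counts, k=3)
neighbours = most_similar("cat", vocab, vectors, n=3)
print(neighbours)
```

On a corpus this small the neighbour list carries little meaning. The sketch only shows how the three steps connect: words with similar co-occurrence rows end up close together after the reduction, even if they never co-occur directly.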

CC BY-NC-ND 4.0

Paper citation in several formats:
Rapp, R. (2007). The Computation of Semantically Related Words: Thesaurus Generation for English, German, and Russian. In Proceedings of the 4th International Workshop on Natural Language Processing and Cognitive Science (ICEIS 2007) - NLPCS; ISBN 978-972-8865-97-9, SciTePress, pages 71-80. DOI: 10.5220/0002414500710080

@conference{nlpcs07,
author={Reinhard Rapp},
title={The Computation of Semantically Related Words: Thesaurus Generation for English, German, and Russian},
booktitle={Proceedings of the 4th International Workshop on Natural Language Processing and Cognitive Science (ICEIS 2007) - NLPCS},
year={2007},
pages={71-80},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002414500710080},
isbn={978-972-8865-97-9},
}

TY  - CONF
JO  - Proceedings of the 4th International Workshop on Natural Language Processing and Cognitive Science (ICEIS 2007) - NLPCS
TI  - The Computation of Semantically Related Words: Thesaurus Generation for English, German, and Russian
SN  - 978-972-8865-97-9
AU  - Rapp, R.
PY  - 2007
SP  - 71
EP  - 80
DO  - 10.5220/0002414500710080
PB  - SciTePress
ER  - 