Authors:
Berenike Litz; Hagen Langer and Rainer Malaka
Affiliation:
University of Bremen, Germany
Keyword(s):
Information extraction, Machine learning, Mining text and Semi-structured data, Lexical knowledge acquisition,
Ontology learning, Ontology population.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Computational Intelligence; Evolutionary Computing; Information Extraction;
Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning;
Mining Text and Semi-Structured Data; Soft Computing; Symbolic Systems
Abstract:
In this paper we propose a novel approach that combines syntactic and context information to identify lexical
semantic relationships. From the first sentences of articles in the German version of Wikipedia, we compiled
training data, created both semi-automatically and manually, and a test set for evaluation. We trained the
Trigrams’n’Tags tagger (Brants, 2000) with a semantically enhanced tagset. The experiments showed that the
cleanliness of the data is far more important than its amount. Furthermore, bootstrapping proved to be a viable
way to improve the results. Our approach outperformed the competing lexico-syntactic patterns by 7%, reaching
an F1-measure of .91.
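
For readers unfamiliar with the metric, the F1-measure reported in the abstract is the harmonic mean of precision and recall. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `f1_measure` and the counts used in the example are hypothetical and are not taken from the paper.

```python
def f1_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall from raw counts."""
    predicted = true_positives + false_positives  # items the system labelled
    actual = true_positives + false_negatives     # items in the gold standard
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only for illustration (not from the paper).
score = f1_measure(true_positives=90, false_positives=10, false_negatives=10)
print(round(score, 2))
```

Because the harmonic mean is dominated by the smaller of the two values, a system cannot reach a high F1-measure by maximizing precision or recall alone.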