
method to extract non-taxonomic relations, and is 
used in ontology learning tools such as Text2Onto 
(Cimiano et al. 2005) and OntoLearn (Velardi et al. 
2005). These ontology learning tools use association 
rule mining with the traditional confidence measure to 
extract non-taxonomic relations. However, the 
confidence measure has two noted limitations: it 
(a) is sensitive to the frequency of the concepts in 
the data set and may return pairs of concepts even if 
there is no association between them, and (b) suffers 
from the rare itemset problem, whereby an 
association rule that represents an important 
relationship between concepts is pruned altogether 
because it is rare (Sheikh et al. 2005).  
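The two limitations can be illustrated with a small numerical sketch. The counts, the corpus size of 1,000 text windows and the minimum support threshold of 0.01 below are hypothetical values chosen only for illustration; they are not taken from any data set discussed in this paper, and the function names are our own.

```python
def confidence(n_xy, n_x):
    """conf(X -> Y) = P(Y | X), estimated as count(X and Y) / count(X)."""
    return n_xy / n_x


def support(n_xy, n_total):
    """supp(X, Y) = P(X, Y), estimated as count(X and Y) / total windows."""
    return n_xy / n_total


def lift(n_xy, n_x, n_y, n_total):
    """lift(X, Y) = P(X, Y) / (P(X) * P(Y)); a value of 1 means independence."""
    return (n_xy * n_total) / (n_x * n_y)


N = 1000  # hypothetical number of text windows

# Limitation (a): Y is very frequent (900 windows), X occurs in 100 windows,
# and they co-occur in 90 windows, exactly what independence predicts.
conf_a = confidence(90, 100)     # high confidence despite no association
lift_a = lift(90, 100, 900, N)   # lift of 1 exposes the independence

# Limitation (b): X and Y are both rare (5 windows each) but nearly always
# appear together (4 windows). A minimum support threshold prunes the pair.
MIN_SUPPORT = 0.01               # hypothetical pruning threshold
supp_b = support(4, N)
pruned_b = supp_b < MIN_SUPPORT  # the rare rule is discarded
lift_b = lift(4, 5, 5, N)        # yet lift marks it as strongly correlated

print(conf_a, lift_a, supp_b, pruned_b, lift_b)
```

In case (a) the high confidence only reflects how frequent Y is, whereas in case (b) a strongly associated pair never reaches the confidence computation because support-based pruning removes it first.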
In this paper, we pursue the extraction, from 
unstructured text, of concept pairs that have a 
non-taxonomic relation between them. We present a 
concept correlation search framework that employs a 
statistical approach that is an extension to the 
traditional association rule mining approach used in 
ontology learning tools for non-taxonomic relation 
extraction. Our approach to search for correlated 
concepts has three distinct elements: (i) we 
investigate the use of the lift measure (Sheikh et al. 
2005), as opposed to the traditional support and 
confidence measures, to establish the interestingness 
between correlated concepts. The key advantage of 
the lift measure is that it determines how 
many times more often concept X and concept 
Y occur together than would be expected if they were 
statistically independent. Lift does not suffer from 
the rare item problem (Sheikh et al. 2005); (ii) when 
searching for correlated concept pairs we look 
beyond the traditional one-sentence window to 
include multiple adjacent sentences. Our approach is 
based on the observation that scientific authors 
quite often discuss correlated concepts across 
multiple sentences, so we search for correlated 
concepts across two adjoining sentences; (iii) we employ a 
domain ontology, as background knowledge, to filter 
out the correlated concepts that have a taxonomic 
relationship between them. This leaves us with a set 
of non-taxonomic concept pairs that serve as 
candidates for non-taxonomic relations during 
ontology learning. We apply our framework to 
search for non-taxonomic concept pairs in the 
domain of marine biology, working with 374 
Fisheries Oceanography journal publications spanning 
a period of 10 years (1999-2008). We extracted 130 
concept pairs, of which 108 were identified as 
non-taxonomic concept pairs. The results were 
validated by domain experts.  
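The three elements described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in our framework: concept identification is assumed to have been done already, so each sentence is given as a set of concept labels; the counting unit is a sliding window of adjoining sentences; the cut-off of lift greater than 1 is an assumed threshold; and the ontology is reduced to a hypothetical set of (child, parent) is-a pairs. All function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def windows(sentences, size=2):
    """Yield sliding windows of `size` adjoining sentences."""
    for i in range(max(1, len(sentences) - size + 1)):
        yield sentences[i:i + size]


def correlated_pairs(documents, min_lift=1.0, size=2):
    """Return {(x, y): lift} for concept pairs whose lift exceeds min_lift."""
    single, pair, n = Counter(), Counter(), 0
    for sentences in documents:
        for win in windows(sentences, size):
            concepts = set().union(*win)  # concepts seen in this window
            n += 1
            single.update(concepts)
            pair.update(combinations(sorted(concepts), 2))
    result = {}
    for (x, y), n_xy in pair.items():
        value = n_xy * n / (single[x] * single[y])  # P(x,y) / (P(x) P(y))
        if value > min_lift:
            result[(x, y)] = value
    return result


def drop_taxonomic(pairs, is_a):
    """Remove pairs linked by an is-a relation in the background ontology."""
    return {
        (x, y): v for (x, y), v in pairs.items()
        if (x, y) not in is_a and (y, x) not in is_a
    }


# Toy corpus: each document is a list of sentences, each a set of concepts.
documents = [
    [{"salmon"}, {"temperature"}],
    [{"salmon", "fish"}, {"herring"}],
    [{"herring"}, {"fish"}],
    [{"temperature"}, {"salmon"}],
]
is_a = {("herring", "fish"), ("salmon", "fish")}  # hypothetical taxonomy

candidates = correlated_pairs(documents)        # two-sentence window
non_taxonomic = drop_taxonomic(candidates, is_a)
single_sentence = correlated_pairs(documents, size=1)
print(non_taxonomic)
```

In this toy corpus the pair (salmon, temperature) never occurs within a single sentence, so it is found only with the two-sentence window, while the correlated pair (fish, herring) is removed because the assumed ontology already relates the two concepts taxonomically.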
2 LITERATURE REVIEW - RELATED WORK 
Ontology learning involves Machine Learning (ML) 
and advanced Natural Language Processing (NLP) 
technologies, starting from term extraction and 
concept definition to more complex tasks such as 
learning taxonomic and non-taxonomic relations. In 
this section, we review the state-of-the-art in 
ontology learning tools specific to non-taxonomic 
relation extraction. 
From a statistical perspective, the pioneering 
research work in non-taxonomic relation extraction 
was performed by Maedche & Staab (2000) using 
association rule mining. Subsequently, ontology 
learning tools such as Text2Onto (Cimiano et al. 
2005) and OntoLearn (Velardi et al. 2005) also 
approached the non-taxonomic relation extraction task 
from the statistical point of view, using association 
rule mining with the traditional confidence measure.  
Hasti (Shamsfard & Barforoush 2004), another 
ontology learning tool, extracts non-taxonomic 
relations from the semantic analysis point of view. 
Hasti combines logical, linguistic-based, 
template-driven and semantic analysis methods in its 
non-taxonomic relation extraction. A hybrid of both 
approaches is taken by RelExt (Schutz & Buitelaar 
2005), which extracts relevant terms and verbs from a 
given text collection and then uses a combination of 
linguistic and statistical processing to 
compute relations between them. The problem with 
these methods is that they depend on sentence 
structure, so the search window for 
correlated concepts is short and constrained to a 
single sentence. Such a short search window often 
proves deficient for discovering relations 
(Chagnoux et al. 2008). 
From the literature review, it is clear that 
ontology learning, especially the extraction of 
non-taxonomic relations from unstructured text, is a 
challenging yet much pursued area. Our work is an 
extension of the traditional association rule mining 
used in some of the abovementioned tools. We 
look beyond the single-sentence window and 
use lift as the interestingness measure to yield 
interesting concept pairs that represent potential 
non-taxonomic relations in the ontology learning 
context.  
WEBIST 2011 - 7th International Conference on Web Information Systems and Technologies