
 
of applications using XML for syntax and URIs 
(Uniform Resource Identifiers) for naming. At the 
heart of all semantic web applications is the use of 
ontologies, which describe entities and relationships 
among entities. The concept of metadata has evolved 
over the years starting from data dictionaries to 
database schemas and now to ontologies. 
Data mining aims at finding patterns and subtle 
relationships in data and discovering rules that allow 
the prediction of future results by the use of 
automatic or semi-automatic processes. It is an 
information extraction activity, whose goal is to 
discover hidden facts contained in databases, using a 
combination of machine learning, statistical analysis, 
modelling techniques and database technology. 
Mining the data on the web, however, is one of the 
major challenges faced by the data management and 
mining community, as well as those working on web 
information management and machine learning. The 
characteristic feature of Web Mining is the use of 
Data Mining techniques to elaborate on content, 
structure, and usage of Web resources.  
In the Semantic Web, content and structure are 
strongly intertwined. Therefore, the distinction 
between structure and content mining vanishes. The 
mining algorithms can be transformed in order to 
deal with RDF or ontology-based data. Mining the 
usage can be enhanced further, if the semantics are 
contained explicitly in the pages by referring to 
concepts of ontologies. 
3 RELATED WORK 
We first review the formal model of association 
rules as introduced by Agrawal 
(Agrawal et Al., 93). Formally, association rule 
mining can be stated as follows:  
•  Let I = {i_1, i_2, …, i_n} be a set of items 
•  Let D be a set of transactions, where each 
transaction T is a set of items satisfying T ⊆ I 
•  Each transaction is assigned an identifier, called 
TID 
•  Let X be a set of items; a transaction T is said to 
contain X if and only if X ⊆ T 
•  An association rule is an implication of the form 
X ⇒ Y 
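The definitions above can be made concrete with a small sketch. The toy transactions and the function names are hypothetical and chosen only for illustration; support and confidence are computed with their standard definitions (the fraction of transactions containing an itemset, and support(X ∪ Y) / support(X)), which this section uses without restating.

```python
from typing import Dict, FrozenSet

Itemset = FrozenSet[str]

# Hypothetical toy database D: each TID maps to a transaction T, a subset of I.
D: Dict[int, Itemset] = {
    1: frozenset({"bread", "milk"}),
    2: frozenset({"bread", "butter", "milk"}),
    3: frozenset({"butter"}),
    4: frozenset({"bread", "butter"}),
}


def contains(T: Itemset, X: Itemset) -> bool:
    """A transaction T contains X if and only if X is a subset of T."""
    return X <= T


def support(D: Dict[int, Itemset], X: Itemset) -> float:
    """Fraction of transactions in D that contain X."""
    return sum(contains(T, X) for T in D.values()) / len(D)


def confidence(D: Dict[int, Itemset], X: Itemset, Y: Itemset) -> float:
    """Confidence of the rule X => Y: support(X u Y) / support(X)."""
    s_x = support(D, X)
    if s_x == 0:
        raise ValueError("X does not occur in any transaction")
    return support(D, X | Y) / s_x


if __name__ == "__main__":
    X, Y = frozenset({"bread"}), frozenset({"milk"})
    print("support(X)       =", support(D, X))
    print("confidence(X=>Y) =", confidence(D, X, Y))
```

In this toy database the rule {bread} ⇒ {milk} holds in two of the three transactions that contain bread.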
While association rules provide a means to 
discover many interesting associations, they fail to 
discover other, no less interesting associations that 
are also hidden in the data. This may not be very 
harmful in classical mining procedures, but it is a 
serious problem in semantic web mining, since the 
discovered associations form the basis of the 
ontology on which the semantic web is built. 
Maximal association rules are not designed to 
replace regular association rules, but rather to allow 
the discovery of the concepts that will be included in 
the ontology and of the relations that bind them 
together. For this reason, the enhancements we 
propose in this paper allow the discovery of new 
association rules that complement the regular ones 
(Amihood et Al., 05). Maximal associations were 
proposed to allow the discovery of associations 
pertaining to items that most often do not appear 
alone, but rather together with closely related items; 
associations relevant only to these items therefore 
tend to obtain low confidence in classical 
algorithms such as Apriori. In a maximal 
association rule we are interested in capturing the 
notion that whenever X appears alone, Y also 
appears, with some confidence (Amihood et Al., 05). 
This is why such rules are crucial in text/web 
mining for learning the ontology. 
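The difference between regular and maximal confidence can be illustrated with a simplified sketch. It assumes that items are partitioned into categories and reads "X appears alone" as "X is exactly the set of items the transaction holds from X's category". The category map, the toy transactions and all function names are hypothetical, and the M-confidence computed here is one plausible reading of the notion described above, not necessarily the exact definition used in the cited work.

```python
from typing import Dict, FrozenSet, List

Itemset = FrozenSet[str]

# Hypothetical category map (an assumption of this sketch).
CATEGORY: Dict[str, str] = {
    "canada": "country", "iran": "country", "usa": "country",
    "fish": "topic", "crude": "topic", "ship": "topic",
}


def restrict(T: Itemset, category: str) -> Itemset:
    """Items of T that belong to the given category."""
    return frozenset(i for i in T if CATEGORY[i] == category)


def category_of(X: Itemset) -> str:
    """The single category shared by all items of X."""
    cats = {CATEGORY[i] for i in X}
    if len(cats) != 1:
        raise ValueError("X must lie within exactly one category")
    return cats.pop()


def m_supports(T: Itemset, X: Itemset) -> bool:
    """T supports X maximally: X appears alone within its category."""
    return restrict(T, category_of(X)) == X


def confidence(D: List[Itemset], X: Itemset, Y: Itemset) -> float:
    """Regular confidence of X => Y."""
    base = [T for T in D if X <= T]
    return sum(Y <= T for T in base) / len(base) if base else 0.0


def m_confidence(D: List[Itemset], X: Itemset, Y: Itemset) -> float:
    """Among transactions where X appears alone and that hold some item of
    Y's category, the fraction in which Y also appears alone."""
    g_y = category_of(Y)
    base = [T for T in D if m_supports(T, X) and restrict(T, g_y)]
    return sum(m_supports(T, Y) for T in base) / len(base) if base else 0.0


# Hypothetical documents, each reduced to a set of keywords.
D: List[Itemset] = (
    [frozenset({"canada", "iran", "usa", "crude", "ship"})] * 3
    + [frozenset({"canada", "usa", "fish"})] * 2
)

if __name__ == "__main__":
    X, Y = frozenset({"canada", "usa"}), frozenset({"fish"})
    print("regular confidence:", confidence(D, X, Y))
    print("maximal confidence:", m_confidence(D, X, Y))
```

In this toy data {canada, usa} usually occurs together with iran, so the regular rule {canada, usa} ⇒ {fish} gets a low confidence, while the maximal rule, which only looks at transactions where {canada, usa} appears alone, does not.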
In addition, some redundant, unwanted or even 
false strong association rules are likely to be 
generated because the correlation of attributes is 
ignored (Yong Xu et Al. 05). The Chi-Squared test 
should therefore be introduced into association rule 
mining, since it can remove irrelevant itemsets and 
rules that have high support but no dependency. 
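A minimal sketch of such a dependency check is given below. It computes the Pearson Chi-Squared statistic on the 2×2 contingency table of X against Y and compares it with 3.841, the critical value for one degree of freedom at the 0.05 significance level. The function names, the choice of significance level and the example counts are assumptions made for illustration; the paper does not specify them in this section.

```python
def chi_squared(n_xy: int, n_x: int, n_y: int, n: int) -> float:
    """Pearson Chi-Squared statistic for the 2x2 table of X versus Y.

    n_xy: transactions containing both X and Y
    n_x:  transactions containing X
    n_y:  transactions containing Y
    n:    total number of transactions
    """
    observed = [
        [n_xy, n_x - n_xy],                   # X present: Y present / absent
        [n_y - n_xy, n - n_x - n_y + n_xy],   # X absent:  Y present / absent
    ]
    row_totals = [n_x, n - n_x]
    col_totals = [n_y, n - n_y]
    stat = 0.0
    for r in range(2):
        for c in range(2):
            expected = row_totals[r] * col_totals[c] / n
            if expected == 0:
                raise ValueError("degenerate table: an expected count is zero")
            stat += (observed[r][c] - expected) ** 2 / expected
    return stat


# Critical value of the Chi-Squared distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_VALUE = 3.841


def is_dependent(n_xy: int, n_x: int, n_y: int, n: int) -> bool:
    """Keep a rule only if X and Y are statistically dependent."""
    return chi_squared(n_xy, n_x, n_y, n) > CRITICAL_VALUE


if __name__ == "__main__":
    # High support (0.72) and high confidence (0.9), yet X and Y are independent.
    print(is_dependent(n_xy=72, n_x=80, n_y=90, n=100))
    # Lower support (0.45), but a strong dependency between X and Y.
    print(is_dependent(n_xy=45, n_x=50, n_y=50, n=100))
```

The first call shows the case described above: a rule with high support and confidence that the test rejects because Y is just as frequent without X.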
4 ENHANCED LEARNING ALGORITHM: EN-APRIORI 
The main learning algorithm adopted in our paper is 
the one proposed in (Maedche et Al., 
99)(Berendett et Al, 06). This learning algorithm is 
effective up to a certain level; since it is a 
text/web mining approach, techniques from 
text mining and web mining are combined to 
achieve the learning of the ontology. 
We stress again that the ontology learning step is 
the most important step, since its results allow the 
discovery of the concepts that will be included in 
the ontology and of the relations that bind them 
together. For this reason, we propose in this paper 
enhancements to the algorithm described above. 
These enhancements allow the discovery of new 
association rules that were missed by the original 
learning algorithm and, in addition, allow the 
pruning of some faulty rules that appeared to be 
valid strong association rules. Our approach is 
based on the idea of 
introducing two new algorithms and integrating 
ICSOFT 2007 - International Conference on Software and Data Technologies