
2 RELATED WORK 
Many studies have employed taxonomies 
and ontologies as background knowledge in mining 
association rules in order to enhance the knowledge 
discovery process. (Hou, et al., 2005) use domain 
knowledge to generalize low level rules discovered 
by traditional rule mining algorithms, in order to get 
fewer and clearer high level rules. In (Brisson, et al., 
2005), domain knowledge is used in both the 
preprocessing and post-processing steps. The 
preprocessing step uses an ontology to guide the 
construction of specific datasets for particular 
mining tasks. In the post-processing step, mined 
rules are interpreted and filtered, as terms are 
generalized based on the ontology. 
“However, in many real-world applications, the 
taxonomic structure may not be crisp but fuzzy.” 
(Chen, et al., 2000, p.47). For this purpose, (Chen, et 
al., 2000) developed an algorithm to mine 
generalized association rules with fuzzy taxonomic 
structures. The algorithm considers the R-interest 
measure, which is used to eliminate redundant and 
inconsistent rules. Fuzzy association rule mining, 
developed by (Farzanyar, et al., 2006), is driven by 
domain knowledge in order to make the rules more 
visual, more interesting and understandable. 
(Escovar, et al., 2006) have proposed another 
approach that uses fuzzy ontologies to represent the 
semantic similarity relations among mined data. In 
this case, the mining algorithm (XSSDM algorithm) 
considers a new measure, called minimum similarity 
(minsim). If two items have a similarity degree 
greater than or equal to minsim, a fuzzy association 
is made between them and can be expressed in the 
association rules extracted by the algorithm. For 
example, if item1 and item2 have a degree of 
similarity greater than or equal to minsim in the 
fuzzy ontology, a fuzzy itemset is created 
(represented as item1~item2). 
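The minsim test described above can be illustrated with a minimal sketch. It is not the XSSDM implementation: the function name, the dictionary-based similarity table, and the similarity degrees are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def fuzzy_itemsets(items, similarity, minsim):
    """Return 'a~b' labels for item pairs whose similarity degree is >= minsim.

    `similarity` maps an unordered pair (frozenset) to a degree in [0, 1];
    pairs missing from the table are treated as having degree 0.
    """
    result = []
    for a, b in combinations(sorted(items), 2):
        degree = similarity.get(frozenset((a, b)), 0.0)
        if degree >= minsim:
            result.append(f"{a}~{b}")
    return result


# Hypothetical similarity degrees, not taken from the paper.
similarity = {
    frozenset(("Apple", "Tomato")): 0.7,
    frozenset(("Apple", "Kaki")): 0.3,
}
pairs = fuzzy_itemsets({"Apple", "Tomato", "Kaki"}, similarity, minsim=0.5)
print(pairs)
```

With these assumed degrees, only the pair Apple and Tomato reaches minsim, so a single fuzzy itemset is formed.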
Although generalized association rule mining 
approaches based on fuzzy ontologies express 
semantically richer information, they may produce a 
large number of extracted rules, many of them 
redundant. Redundancy treatment has therefore 
become an active research topic. In (Han and Fu, 
1999), multiple-level association rule mining was 
proposed to reduce the number of generalized 
association rules. It consists of defining a different 
minsup value for each level of a given taxonomy, 
with higher levels receiving larger minsup values. 
Other approaches, such as (Kunkle, et al., 2008), 
reduce the number of generalized and redundant 
rules during the pattern extraction process. (Kunkle, 
et al., 2008) implemented the MFGI_class algorithm, 
based on maximal frequent itemset theory (Bayardo, 
1998). (Chen, et al., 2000) and (Oliveira, et al., 
2007) treat the problem after the processing stage. 
The former performs generalization based on the 
R-interest measure, which prunes redundant rules by 
keeping only the rules whose degree of support or 
confidence is R times the expected degree of support 
or confidence. (Oliveira, et al., 2007) generalize 
only if the descendants of an ancestor generate rules 
and the rule of the ancestor has a support value x% 
greater than that of the descendant rule with the 
highest support among its siblings. (Miani, et al., 
2009) proposed the NARFO algorithm, which decreases 
the number of redundant rules through its 
generalization and redundancy treatment. 
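As an illustration of the level-dependent minimum support attributed above to (Han and Fu, 1999), the following sketch filters items using a different minsup per taxonomy level. It is a simplified reading and not the authors' algorithm: the level numbering (1 for the highest level), the support values, and the thresholds are all assumptions made for the example.

```python
def frequent_by_level(supports, levels, minsup_by_level):
    """Keep the items whose support reaches the minsup of their taxonomy level."""
    return sorted(
        item
        for item, support in supports.items()
        if support >= minsup_by_level[levels[item]]
    )


# Assumed taxonomy levels: 1 is the higher (more general) level.
levels = {"Fruit": 1, "Apple": 2, "Kaki": 2}
# Higher levels get larger minsup values (hypothetical thresholds).
minsup_by_level = {1: 0.30, 2: 0.10}
# Hypothetical supports.
supports = {"Fruit": 0.40, "Apple": 0.15, "Kaki": 0.05}

frequent = frequent_by_level(supports, levels, minsup_by_level)
print(frequent)
```

Under a single uniform threshold of 0.30, Apple would be discarded; the lower threshold at the deeper level keeps it while still rejecting Kaki.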
NARFO* and NARFO have the same features, 
but NARFO* reduces the number of rules produced by 
NARFO through the introduction of the minGen 
parameter, yielding semantically richer rules with 
no misleading information. 
 
3 NARFO* ALGORITHM 
This section explains the NARFO* algorithm. The 
introduction of the minGen parameter is the main 
contribution of this work. MinGen is especially useful 
for generalizing rules with low minimum support 
without semantic loss. If X% of the descendants of an 
ancestor are included in rules, where X is given by 
minGen, the generalization is done, and the items 
that are not part of the generalization are shown to 
avoid conveying wrong information. Considering the 
fuzzy ontology of figure 1, if the minGen value is 0.6 
and rules like Apple → Turkey and Kaki → Turkey are 
generated, the algorithm generalizes these rules to 
Fruit → Turkey, since Apple and Kaki are more than 
60% of Fruit’s descendants. The rule is shown as 
Fruit(-Tomato) → Turkey, indicating that the item 
Tomato was not part of the generalization. 
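The minGen test in the example above can be sketched as follows. This is a minimal illustration and not the NARFO* implementation: it assumes that Fruit has exactly the descendants Apple, Kaki and Tomato, that the threshold comparison is "greater than or equal", and that several excluded items would be listed inside the same parentheses. The function and variable names are hypothetical.

```python
def generalize(rules, descendants, min_gen):
    """Generalize leaf-level rules to their ancestor when enough descendants match.

    rules: set of (antecedent, consequent) pairs.
    descendants: maps an ancestor to the list of its descendants (assumed).
    min_gen: fraction of descendants that must appear in rules (assumed >=).
    """
    output = set(rules)
    for ancestor, children in descendants.items():
        consequents = {c for a, c in rules if a in children}
        for consequent in consequents:
            covered = {a for a, c in rules if c == consequent and a in children}
            if len(covered) / len(children) >= min_gen:
                missing = sorted(set(children) - covered)
                # Items left out of the generalization are shown with a minus sign.
                suffix = "(-" + ",-".join(missing) + ")" if missing else ""
                output -= {(a, consequent) for a in covered}
                output.add((ancestor + suffix, consequent))
    return output


rules = {("Apple", "Turkey"), ("Kaki", "Turkey")}
descendants = {"Fruit": ["Apple", "Kaki", "Tomato"]}  # assumed from the example
generalized = generalize(rules, descendants, min_gen=0.6)
print(generalized)
```

Here two of the three assumed descendants appear in rules (about 67%), which exceeds the minGen value of 0.6, so both rules collapse into one rule with antecedent Fruit(-Tomato). With a minGen value of 0.7 the rules would be left unchanged.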
Besides the minGen parameter, the algorithm also 
eliminates redundant rules while preserving their 
semantics. If both Apple~Tomato → Chicken and 
Apple → Chicken are extracted, NARFO* keeps only the 
first one and marks with a plus (+) the more relevant 
item in the fuzzy itemset of the rule: 
Apple(+)~Tomato → Chicken. 
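The redundancy treatment in this example can be sketched as below. The sketch is only one possible reading of the text: it assumes that the "more relevant" item is the one that also produced a rule of its own with the same consequent, and the function name and rule representation are hypothetical.

```python
def merge_redundant(rules):
    """Absorb single-item rules into fuzzy rules that already contain the item.

    rules: set of (antecedent, consequent) pairs; fuzzy antecedents join
    their items with '~'. An item that also has its own rule with the same
    consequent is marked with '(+)' (assumed meaning of "more relevant").
    """
    kept = set()
    absorbed = set()
    for antecedent, consequent in rules:
        if "~" not in antecedent:
            continue
        marked = []
        for item in antecedent.split("~"):
            if (item, consequent) in rules:
                absorbed.add((item, consequent))
                marked.append(item + "(+)")
            else:
                marked.append(item)
        kept.add(("~".join(marked), consequent))
    # Non-fuzzy rules that were not absorbed are kept unchanged.
    kept |= {r for r in rules if "~" not in r[0] and r not in absorbed}
    return kept


extracted = {("Apple~Tomato", "Chicken"), ("Apple", "Chicken")}
merged = merge_redundant(extracted)
print(merged)
```

For the two rules of the example, the result is the single rule with antecedent Apple(+)~Tomato and consequent Chicken.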
3.1  Data Scanning  
This step identifies the items in the database, 
generating itemsets of size one (1-itemsets). 
Considering the fuzzy ontology of figure 1, this
NARFO* ALGORITHM - Optimizing the Process of Obtaining Non-redundant and Generalized Semantic Association
Rules