Objective Measures Ensemble in Associative Classiﬁers

Maicon Dall’Agnol and Veronica Oliveira de Carvalho

Universidade Estadual Paulista (Unesp), Instituto de Geociências e Ciências Exatas, Rio Claro, Brazil

Keywords:

Associative Classiﬁer, Interestingness Measures, Ranking, Classiﬁcation, Association Rules.

Abstract:

Associative classifiers (ACs) are predictive models built from association rules (ARs). Model construction occurs in steps, one of which sorts and prunes a set of rules. Sorting usually relies on objective measures (OMs) to rank the rules, and it is the step this work addresses. In the proposals found in the literature, OMs are generally explored separately. The only work that explores the aggregation of measures in the context of ACs is (Silva and Carvalho, 2018), where multiple OMs are considered at the same time using the aggregation solution proposed by (Bouker et al., 2014). However, although many works in the context of ARs investigate the aggregate use of OMs, all of them have some bias. Thus, this work evaluates the aggregation of measures in the context of ACs from another perspective, that of an ensemble of classifiers.

1 INTRODUCTION

Right now, people are generating and storing data.

This huge amount of data stores valuable information

that companies can use to better understand their cus-

tomers, improve their budgets, and so on. For this,

it is important to use techniques that automatically

discover interesting patterns in the domain. Classiﬁ-

cation and association rules (ARs) are among these

techniques. Associative Classiﬁers (ACs) are rule-

based classiﬁers that are built using association rules.

ACs have the advantage of exploring the search space

in a broader view, compared to, for example, C4.5,

which does a greedy search. AC yields good re-

sults compared to other machine learning algorithms

(Yang and Cui, 2015; Abdellatif et al., 2018b), espe-

cially decision trees, rule induction and probabilistic

approaches (Abdellatif et al., 2018a). According to

(Yang and Cui, 2015), one of the main advantages of

ACs is that the output is represented in simple if-then

rules, making it interpretable for the end-user. Be-

sides, according to (Kannan, 2010) ACs naturally deal

with missing values and outliers, as they only manip-

ulate statistically signiﬁcant associations and ensure

that no assumption is made about attribute depen-

dence or independence. Some domain applications

that use ACs can be seen in (Nandhini et al., 2015;

Singh et al., 2016; Moreno et al., 2016; Shao et al.,

2017; Alwidian et al., 2018; Yin et al., 2018).

https://orcid.org/0000-0003-1741-1618

Many ACs exist, as seen in (Thabtah, 2007; Abdel-

hamid and Thabtah, 2014), which provide a good re-

view of the topic. The ﬁrst and generally the one used

as baseline in the literature works is CBA (Liu et al.,

1998). In general, three steps are necessary to build an AC: (a) extraction of a set of association rules whose consequents contain only class labels; (b) model building through sorting and pruning; (c) prediction. AC algorithms differ in how they perform each step, especially steps (b) and (c). Regarding step (b), one

way to sort the rules to choose which ones will be

in the model is through objective measures (OMs).

Step (b) is important because many rules can be ex-

tracted from step (a) and, in general, many of them

are not relevant to the model. Therefore, an efﬁcient

evaluation of ARs is an essential need (Yang and Cui,

2015; Abdellatif et al., 2019). This work contributes

to step (b). According to (Abdelhamid et al., 2016),

one of the challenges related to ACs is sorting, be-

cause choosing the appropriate ranking criterion is a

critical task that impacts the accuracy of the classiﬁer.

In the proposals found in the literature, OMs

are generally explored separately in ACs (see Sec-

tion 2.3). The only work that explores the aggre-

gation of measures in the context of ACs is (Silva

and Carvalho, 2018), where multiple OMs are con-

sidered at the same time. To do so, (Silva and Car-

valho, 2018) use the aggregation solution proposed

by (Bouker et al., 2014). However, although there are

many works in the context of ARs that investigate the

aggregate use of OMs (see Section 2.2), all of them have some bias.

Dall'Agnol, M. and Oliveira de Carvalho, V.
Objective Measures Ensemble in Associative Classifiers.
DOI: 10.5220/0009321600830090
In Proceedings of the 22nd International Conference on Enterprise Information Systems (ICEIS 2020) - Volume 1, pages 83-90
ISBN: 978-989-758-423-7

Considering the above, this work aims to evaluate the aggregation of measures in the context of ACs from another perspective, that of an ensemble of classifiers. Ensemble classifiers, such as random forest, build several classifiers and then combine their classifications to decide to which class label a given object belongs. Using this idea, it is possible to consider many OMs at the same time, each of them generating an AC, and to combine these ACs later. Good results were obtained compared to those presented by (Silva and Carvalho, 2018), indicating that an ensemble of per-OM ACs can be a suitable solution.

This work is structured as follows: Section 2 re-

views some concepts and discusses related works.

Section 3 describes the ensemble approach, which is

followed by experiments (Section 4) and by results and discussion (Section 5). Section 6 presents conclusions and future work.

2 REVIEW AND RELATED WORKS

This section presents some concepts needed to understand the current work (Section 2.1), as well as approaches used to aggregate OMs (Section 2.2) and some related works (Section 2.3).

2.1 Associative Classiﬁer

An associative classiﬁer (AC) aims to obtain a predic-

tive model based on a process that uses class associa-

tion rules (CARs). A class association rule A ⇒ c is

a special case of an association rule in which the con-

sequent only contains a class label. Formally, given

D a dataset containing a set of items $I = \{i_1, \ldots, i_n\}$, a set of class labels $R = \{c_1, \ldots, c_p\}$ and a set of transactions $T = \{t_1, \ldots, t_m\}$, where each transaction $t_i \in T$ contains a subset of items $A \subseteq I$ and one class label $c \in R$ such that $t_i = A \cup c$. A class association rule is a relation A ⇒ c, in which A ⊆ I and c ∈ R. Rule A ⇒ c occurs in the set of transactions T with confidence conf and support sup, where P(Ac) represents rule support (probability of A ∪ c occurrence) and P(c|A) rule confidence (conditional probability of c given A).
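As an illustration of the two definitions above, the following minimal sketch computes P(Ac) and P(c|A) over a small set of transactions. The toy transactions, item names and function names are hypothetical and only serve the example; they do not come from the datasets used in this work.

```python
from typing import FrozenSet, List, Tuple

# A transaction is a pair (items A, class label c).
Transaction = Tuple[FrozenSet[str], str]


def support(antecedent: FrozenSet[str], label: str,
            transactions: List[Transaction]) -> float:
    """P(Ac): fraction of transactions that contain every item of A and are labeled c."""
    hits = sum(1 for items, c in transactions
               if antecedent <= items and c == label)
    return hits / len(transactions)


def confidence(antecedent: FrozenSet[str], label: str,
               transactions: List[Transaction]) -> float:
    """P(c|A): among the transactions that contain A, the fraction labeled c."""
    covered = [c for items, c in transactions if antecedent <= items]
    if not covered:
        return 0.0
    return sum(1 for c in covered if c == label) / len(covered)


# Hypothetical toy data: items are <attribute, value> pairs written as strings.
toy = [
    (frozenset({"outlook=sunny", "windy=no"}), "play"),
    (frozenset({"outlook=sunny", "windy=yes"}), "stay"),
    (frozenset({"outlook=rainy", "windy=no"}), "play"),
    (frozenset({"outlook=sunny", "windy=no"}), "play"),
]

rule_antecedent = frozenset({"outlook=sunny"})
sup = support(rule_antecedent, "play", toy)      # 2 of 4 transactions
conf = confidence(rule_antecedent, "play", toy)  # 2 of the 3 covered transactions
```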

In this context, items in I are usually <attribute,

value> pairs, because CARs are extracted from re-

lational tables containing m objects (instances) de-

scribed by k attributes (features), where each object

is associated with one of the p known class labels to be predicted. Each object $O_i$ is, therefore, described by a vector $O_i = [v_{i1}, v_{i2}, \ldots, v_{ik}, c_i]$, where $v_{ij}$ is an <attribute, value> pair representing the value of object i in attribute j and $c_i$ the class label associated with $O_i$.

As mentioned in the introduction, among the

many ACs algorithms, CBA (Liu et al., 1998) is gen-

erally the most commonly used as baseline. Brieﬂy,

the algorithm works as follows: ﬁrst a set of CARs is

obtained (step (a)). After that, CARs are sorted according to three criteria: confidence, support and order of generation. Based on these criteria, a rule $r_i$ precedes a rule $r_j$ in the sorted list if confidence($r_i$) > confidence($r_j$); or if the confidences are the same but support($r_i$) > support($r_j$); or if the supports are also the same but $r_i$ was generated before $r_j$. This sorting is where this work contributes. Pruning is then performed over the sorted list: for each rule r, the transactions it covers are checked and, if r correctly classifies at least one of them, r is included in the model and all transactions covered by it are removed from the dataset. Pruning completes step (b). Regarding step (c), given an unseen object, it is assigned the class label of the first rule that matches it.
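The sorting, pruning and prediction steps just described can be sketched as follows. This is a minimal illustration that follows only the simplified description given in this section; details of the original CBA algorithm that are not described here are omitted. The `Rule` class, the `default` parameter and the toy data are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple

Transaction = Tuple[FrozenSet[str], str]  # (items, class label)


@dataclass(frozen=True)
class Rule:
    antecedent: FrozenSet[str]
    label: str
    confidence: float
    support: float
    order: int  # position in which the rule was generated


def sort_rules(rules: List[Rule]) -> List[Rule]:
    """Higher confidence first, then higher support, then earlier generation."""
    return sorted(rules, key=lambda r: (-r.confidence, -r.support, r.order))


def prune(sorted_rules: List[Rule], transactions: List[Transaction]) -> List[Rule]:
    """Keep a rule if it correctly covers at least one remaining transaction,
    then remove every transaction the rule covers."""
    remaining = list(transactions)
    model = []
    for rule in sorted_rules:
        covered = [t for t in remaining if rule.antecedent <= t[0]]
        if any(label == rule.label for _, label in covered):
            model.append(rule)
            remaining = [t for t in remaining if not rule.antecedent <= t[0]]
    return model


def predict(model: List[Rule], items: FrozenSet[str],
            default: Optional[str] = None) -> Optional[str]:
    """Return the label of the first rule that matches the object.
    `default` is a hypothetical fallback for objects no rule matches."""
    for rule in model:
        if rule.antecedent <= items:
            return rule.label
    return default


# Hypothetical toy data.
data = [
    (frozenset({"a", "b"}), "x"),
    (frozenset({"a"}), "x"),
    (frozenset({"b"}), "y"),
    (frozenset({"c"}), "y"),
]
candidates = [
    Rule(frozenset({"a"}), "x", 1.0, 0.50, 0),
    Rule(frozenset({"b"}), "y", 0.5, 0.25, 1),
    Rule(frozenset({"b"}), "x", 0.5, 0.25, 2),
    Rule(frozenset({"c"}), "y", 1.0, 0.25, 3),
]
model = prune(sort_rules(candidates), data)
```

Because pruning walks the sorted list from the top, changing the sort key changes which rules survive, which is why a different sorting criterion yields a different classifier.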

As can be observed, sorting affects the whole pro-

cess. It is based on the sorted rule list that pruning is

performed. Therefore, depending on the criteria used

in sorting a different classiﬁer is obtained.

2.2 Aggregation Approaches

A common way used by AC algorithms to sort the rules in step (b) is through objective measures, such as support and confidence. An OM computes a value that expresses the relevance of a rule by considering only the information available in the dataset. Generally, the higher the value, the better the rule. Based on these values, it is possible to create a sorted rule list. (Tew et al., 2014) define and discuss 61 OMs, providing a good review of the topic; therefore, these measures are not described here.

As many OMs exist, solutions have been proposed to help the user decide which one to use. A review discussing some of these solutions can be seen in (Bong et al., 2014). Some of them aggregate the values of two or more OMs so that the user does not have to select a specific one to order the rules, as in (Nguyen Le et al., 2009) and (Yang et al., 2009). The first uses the Choquet integral to aggregate, while the second uses genetic network programming (GNP), which requires setting several parameters that influence the results.

(Bouker et al., 2014) propose an aggregation solution that, because it is deterministic and based solely on the information contained in the data, was chosen for use in the work of (Silva and Carvalho, 2018). Other works, such as (Dahbi et al., 2016), build on the same ideas. To sort rules, (Bouker et al., 2014) use the concept of rule dominance. A rule $r$ dominates a rule $r'$ ($r \succ r'$) when all OM values of $r$ are greater than or equal to the corresponding OM values of $r'$ and at least one of them is strictly greater; if all the values are the same, the rules are considered equivalent. When neither rule dominates the other, the two rules are considered undominated with respect to each other.
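The dominance relation can be written as a short predicate. The sketch below is only an illustration of the definition; the function name is an assumption, and the three vectors are the first, fourth and sixth rows of Table 1.

```python
from typing import Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True when `a` is >= `b` on every measure and strictly greater on at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


# OM values taken from the first, fourth and sixth rows of Table 1.
row1 = (0.2, 0.67, 0.02)
row4 = (0.2, 0.40, 0.10)
row6 = (0.2, 0.33, 0.10)

row4_over_row6 = dominates(row4, row6)  # better on OM-2, equal elsewhere
row1_over_row4 = dominates(row1, row4)  # loses on OM-3, so no dominance
row4_over_row1 = dominates(row4, row1)  # loses on OM-2, so no dominance
```

Equal vectors do not dominate each other under this definition, which matches the treatment of equivalent rules in the text.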

To get the sorted list, an algorithm called SkyRule is executed iteratively. Each execution returns a set of undominated rules, i.e., a set of rules that are not dominated by any other rule. The SkyRule algorithm is loop-based, so that at each iteration the rules most similar to the reference rule ($r^{\perp}$) that are also undominated are extracted. These rules are called $r^{*}$. Similarity is computed through the normalized Manhattan distance, i.e., through the arithmetic mean of the normalized differences. The reference rule aggregates the OMs computed for all rules by storing the best value obtained for each OM in the rule set. The rules returned by SkyRule are added to a set $E_i$ and removed from the rule set (i indicates the iteration in which the rules were extracted). The process repeats until the rule set is empty. At the end of the process, the sequence of sets $E_1, E_2, \ldots$ defines the sorting. Note that the idea is that less dominated rules are considered better.
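A minimal sketch of the reference rule and of the similarity computation is given below. It assumes that the OM values are already scaled to a common range, so that the normalized Manhattan distance reduces to the mean absolute difference; the normalization actually used by (Bouker et al., 2014) is not detailed in this text. Function names are hypothetical, and the values are the eight rows of Table 1.

```python
from typing import List, Sequence, Tuple


def reference_rule(rules: List[Sequence[float]]) -> Tuple[float, ...]:
    """Best (largest) value observed for each OM across the whole rule set."""
    return tuple(max(column) for column in zip(*rules))


def distance_to_reference(rule: Sequence[float],
                          reference: Sequence[float]) -> float:
    """Mean absolute difference to the reference rule.
    Assumption: measures are already on a comparable scale."""
    return sum(abs(r - x) for r, x in zip(reference, rule)) / len(reference)


# The eight rows of Table 1, top to bottom.
rows = [
    (0.2, 0.67, 0.02), (0.1, 0.50, 0.00), (0.1, 0.50, 0.02), (0.2, 0.40, 0.10),
    (0.2, 0.33, 0.02), (0.2, 0.33, 0.10), (0.1, 0.20, 0.01), (0.1, 0.17, 0.02),
]

reference = reference_rule(rows)
first_row_distance = distance_to_reference(rows[0], reference)
fourth_row_distance = distance_to_reference(rows[3], reference)
```

Under this assumption, the smaller the distance, the more similar the rule is to the reference rule.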

Table 1: Example of some rules evaluated by some measures. Adapted from (Bouker et al., 2014).

Rule    OM-1  OM-2  OM-3
$r_1$   0.2   0.67  0.02
$r_2$   0.1   0.50  0.00
$r_3$   0.1   0.50  0.02
$r_4$   0.2   0.40  0.10
$r_5$   0.2   0.33  0.02
$r_6$   0.2   0.33  0.10
$r_7$   0.1   0.20  0.01
$r_8$   0.1   0.17  0.02

Consider, for example, 8 rules evaluated by 3 OMs, as seen in Table 1, taking its rows, from top to bottom, as $r_1$ to $r_8$. Here, $r^{\perp}$ = {0.20, 0.67, 0.10}. In the first iteration of the first SkyRule execution, $r_1$ is the first rule to be considered as $r^{*}$ (the most similar to $r^{\perp}$); therefore, it is added to the return set. Since $r_1$ dominates $r_2$, $r_3$, $r_5$, $r_7$ and $r_8$, they are discarded from this SkyRule execution, leaving only $r_4$ and $r_6$. Next, $r_4$ is chosen as $r^{*}$ and is also added to the return set, which removes $r_6$ from the current execution and finishes the search for undominated rules. After the first SkyRule execution, $E_1 = \{r_1, r_4\}$. These rules are removed from the original rule set and, in the next iteration, SkyRule is executed without considering them, which in turn results in $E_2 = \{r_3, r_6\}$. Repeating the process yields $E_3 = \{r_2, r_5\}$ in the third iteration and $E_4 = \{r_7, r_8\}$ in the fourth, thus establishing the sorting of the 8 rules. Finally, to visualize the results obtained through the ranking process, it is possible to build a graph showing the sequence of creation of the sets, as seen in Figure 1. Therefore, the final ranking places the rules of $E_1$ first, followed by those of $E_2$, $E_3$ and $E_4$.

Figure 1: Visualization of the sequence of creation of the sets used to sort the rules presented in Table 1. Adapted from (Bouker et al., 2014).
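The sequence of undominated sets for Table 1 can be reproduced with a naive pairwise check. Note that the sketch below is not the SkyRule algorithm itself, which guides the search by similarity to the reference rule; it is a simpler substitute that repeatedly peels off the rules not dominated by any remaining rule, which yields the same sets. The labels `r1` to `r8` are hypothetical names for the rows of Table 1, top to bottom.

```python
from typing import Dict, List, Sequence, Set


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """`a` is >= `b` on every measure and strictly greater on at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def undominated_layers(rules: Dict[str, Sequence[float]]) -> List[Set[str]]:
    """Repeatedly extract the rules that no remaining rule dominates."""
    remaining = dict(rules)
    layers = []
    while remaining:
        layer = {
            name for name, values in remaining.items()
            if not any(dominates(other, values)
                       for other_name, other in remaining.items()
                       if other_name != name)
        }
        layers.append(layer)
        for name in layer:
            del remaining[name]
    return layers


# Rows of Table 1, labeled top to bottom (labels are an assumption).
table1 = {
    "r1": (0.2, 0.67, 0.02), "r2": (0.1, 0.50, 0.00),
    "r3": (0.1, 0.50, 0.02), "r4": (0.2, 0.40, 0.10),
    "r5": (0.2, 0.33, 0.02), "r6": (0.2, 0.33, 0.10),
    "r7": (0.1, 0.20, 0.01), "r8": (0.1, 0.17, 0.02),
}

layers = undominated_layers(table1)
```

The order of the rules inside each set is not decided by this sketch; it would come from the tiebreaker by similarity to the reference rule.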

Finally, it is important to mention that if there is more than one rule in the same $E_i$ set, a tiebreaker is performed by the similarity of the rules to the reference rule: the most similar ones are ranked first. If ties remain, they are broken by order of generation, giving preference to the rules that were generated first, as in CBA.

2.3 Related Works

Although there is much work on associative classiﬁ-

cation, few explore the effect of OMs on the sorting

process. (Azevedo and Jorge, 2007) explore the use of

10 OMs in 17 datasets regarding sorting and predic-

tion in ACs and conclude that Conviction is the most

appropriate OM to be used.

(Jalali-Heravi and Zaïane, 2010) assess the impact of 53 OMs on each phase of the ACs construction (steps (a), (b) and (c)). Regarding step (b), the authors explore sorting with and without pruning. Regarding step (c), the authors explore two strategies: (i) select the best ranked rule that covers the instance and (ii) divide all the rules that cover the instance into groups, according to the class label, and average the values of the OM under analysis in each group; in this case, the group with the best average defines the class label. Regarding the sorting criteria, the following precedences are used: value of the measure being analyzed, support and rule size (general rules are preferred over specific ones). The analysis is done on 20 datasets. The authors conclude that many of the analyzed OMs improve classifier performance. However, there is no measure that is best suited for all datasets or for all classifier steps.

Figure 2: Ranking views: individual, aggregate and ensemble.

(Kannan, 2010) assesses the influence of 39 OMs on pruning and sorting. The author tests 3 sorting alternatives: (i) only a given measure, (ii) a given measure plus tiebreaker criteria, as in CBA (confidence, support, and rule generated first) and (iii) sorting by a given measure, followed by reordering, through strategy (ii), of the best k rules selected. The study was conducted on a single dataset of students from a distance learning program. The author concludes that AC accuracy can be improved by using an appropriate OM for pruning and ordering.

(Yang and Cui, 2015) aim to improve the perfor-

mance of ACs in unbalanced datasets by studying 55

OMs in 9 datasets. The authors point out that this is a

relevant problem, as OMs can be applied in different

steps, such as rule generation, pruning and sorting.

Two types of analysis are performed: (i) one to ﬁnd

similar OMs groups in unbalanced datasets; (ii) the

other to identify the most appropriate OMs to be used

in the presented context. With respect to (i), the au-

thors use graph-based clustering as well as frequent

pattern mining. Regarding (ii), the authors use CBA

to compute its performance, via AUC (area under the

ROC curve), when sorting is performed by each of the

OMs individually. In conclusion, it is suggested to use

26 OMs divided into two groups: one focused on ex-

tremely unbalanced datasets and the other focused on

slightly unbalanced datasets.

Unlike the above works, (Silva and Carvalho,

2018) explore the aggregation of OMs in the con-

text of ACs, where multiple OMs are considered at

the same time. To do so, they used the aggregation

solution proposed by (Bouker et al., 2014) (see Section 2.2). To this end, the authors modified step (b) of CBA, specifically sorting, as follows: a rule $r_i$ precedes a rule $r_j$ in the sorted list if the aggregated OM values of $r_i$ are greater than those of $r_j$; in the event of a tie, $r_i$ precedes $r_j$ if it was generated first. In other words, the sorting

follows the list generated by applying (Bouker et al.,

2014) approach. The authors noted that aggregation

tends to improve the accuracy of ACs, compared to

the use of individual OMs, as done in other works.

However, aggregation approaches may present some

problems, as discussed below (Section 3).

3 ENSEMBLE APPROACH

It can be seen from the works of Section 2.3 that the problem presented here is relevant, since sorting is the most modifiable CBA step, as it can be done using many OMs; the other steps have more predefined procedures. The reviewed works can basically be divided into two views, as shown in Figure 2: those that explore the OMs individually (view (A) in red) and those that explore the OMs at the same time (view (B) in green). Only one work was found in the latter case. Unlike the literature, this work explores the problem from another perspective, that of an ensemble of classifiers (view (C) in blue). As noted by (Silva and Carvalho, 2018), aggregate measures provide, on average, better results. Like aggregation, the ensemble view has the advantage of using many OMs at the same time, without having to determine which one is best suited for a given dataset, as is required in the individual view. However, in general, aggregation approaches have some bias that generates problems, such as the one presented in (Bouker et al., 2014) and used by (Silva and Carvalho, 2018).

In works that use the concept of dominance, such as (Bouker et al., 2014) and (Dahbi et al., 2016), the greater the number of OMs to be aggregated, the greater the likelihood that some OM will cause a rule r to be undominated. In the limit, every rule is undominated, resulting in only one iteration of RankRule. That is, a single set $E_1$ containing all the rules is obtained. In this way, the final sorting is done, in fact, only by the similarity of the rules to the reference rule or, in case of a tie, by the rule ID (generated first), thus defeating the purpose of searching for undominated rules. As mentioned before, similarity is computed through the normalized Manhattan distance, i.e., through the arithmetic mean of the normalized differences.

For a better understanding of the problem, consider two rules, $r_1$ and $r_2$, associated with 10 OMs: $OMs_{r_1}$ = {0.9, 0.98, 0.89, 0.88, 0.95, 0.99, 0.79, 0.8, 0.79, 0.96} and $OMs_{r_2}$ = {0.1, 0.10, 0.08, 0.20, 0.11, 0.03, 0, 0.3, 0.15, 0.98}. Rule $r_1$ is clearly better than $r_2$ in almost all values; however, because measure 10 in $OMs_{r_2}$ is greater than measure 10 in $OMs_{r_1}$ (0.98 vs. 0.96), there is no dominance between them and therefore both belong to the same $E_i$ set. With more rules and more OMs the problem gets worse, because a single measure that is greater in one rule than in another is enough to prevent dominance.
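The two-rule example above can be checked directly. In the sketch below, the variable names are hypothetical labels for the two rules, and the seventh value of the first rule is read as 0.79.

```python
from typing import Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """`a` is >= `b` on every measure and strictly greater on at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


# OM values of the two rules in the example (labels are an assumption).
oms_first = (0.9, 0.98, 0.89, 0.88, 0.95, 0.99, 0.79, 0.8, 0.79, 0.96)
oms_second = (0.1, 0.10, 0.08, 0.20, 0.11, 0.03, 0.0, 0.3, 0.15, 0.98)

# Number of measures on which the first rule is strictly better.
first_wins = sum(a > b for a, b in zip(oms_first, oms_second))

# A single measure (the tenth) is enough to block dominance in both directions.
first_dominates = dominates(oms_first, oms_second)
second_dominates = dominates(oms_second, oms_first)
```

The first rule is better on nine of the ten measures, yet neither rule dominates the other, so both end up in the same set.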

Due to the above problems, the focus here is to

assess the impact of using many OMs at the same time

considering an ensemble of classiﬁers. Thus, as seen

in Figure 2, the process works as follows:

i. Generate an individual classifier for each OM. To this end, step (b) of CBA is modified, specifically sorting, as follows: a rule $r_i$ precedes a rule $r_j$ in the sorted list if the individual OM value of $r_i$ is greater than that of $r_j$; in the event of a tie, $r_i$ precedes $r_j$ if it was generated first;

ii. Label the new unseen object with each individual OM classifier and then take a majority vote to decide the class label to which the instance will belong. In case of a tie, the majority class of the dataset is used.
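The two steps above can be sketched as a per-measure sort key and a voting function. This is an illustrative sketch only: the dictionary layout of a rule, the function names and the toy values are assumptions, and the pruning and prediction of each individual classifier are left out.

```python
from collections import Counter
from typing import Dict, List


def rank_by_measure(rules: List[Dict], measure: str) -> List[Dict]:
    """Step i: sort by a single OM (higher first); ties go to the rule generated first."""
    return sorted(rules, key=lambda r: (-r["measures"][measure], r["order"]))


def majority_vote(votes: List[str], majority_class: str) -> str:
    """Step ii: most voted label; on a tie at the top, use the dataset's majority class."""
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return majority_class
    return ranked[0][0]


# Hypothetical rules with two OMs each.
rules = [
    {"order": 0, "label": "yes", "measures": {"confidence": 0.9, "lift": 1.1}},
    {"order": 1, "label": "no", "measures": {"confidence": 0.9, "lift": 1.8}},
    {"order": 2, "label": "yes", "measures": {"confidence": 0.7, "lift": 1.8}},
]

by_confidence = [r["order"] for r in rank_by_measure(rules, "confidence")]
by_lift = [r["order"] for r in rank_by_measure(rules, "lift")]

# Hypothetical votes from three and from two individual classifiers.
clear_winner = majority_vote(["yes", "no", "yes"], majority_class="no")
tie_case = majority_vote(["yes", "no"], majority_class="no")
```

Each measure produces its own ordering of the same rule set, so each individual classifier may keep different rules after pruning; the vote then combines their predictions without merging the measures into a single score.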

4 EXPERIMENTAL EVALUATION

Experiments were performed to evaluate the proposed approach; the requirements for these experiments are presented below. The proposed approach was compared with those found in the literature, as shown in Figure 2: individual OMs (view (A)) and aggregate OMs (view (B)). Note that any other aggregation approach available in the literature could have been used. However, the approach of (Bouker et al., 2014) was chosen because it was the one used by (Silva and Carvalho, 2018). Future work intends to expand the analysis to include other aggregation approaches in order to make a broader comparison with the ensemble approach proposed here. Finally, the same settings used by (Silva and Carvalho, 2018) were adopted to make a fair comparison.

Datasets. Eight datasets available in the UCI repository were used, which are presented in Table 2. The total number of features disregards identification columns and includes the column associated with the class labels. Discretization was performed by the algorithm proposed by (Fayyad and Irani, 1993), available in Weka. Data were preprocessed to fit the datasets to the input format used in the CBA implementation adopted here (available in (Coenen, 2004)). Missing values and features without distinct values after discretization were ignored.

Table 2: Datasets used in experiments: Australian Credit Approval (Australian); Breast Cancer Wisconsin (Breast-C-W); Glass Identification (Glass); Heart; Iris; Tic-Tac-Toe; Wine; Vehicle Silhouettes (Vehicle). Acronyms mean: #Transactions (#T); #Features (#F); #Distinct Items (#D-I); #Class Labels (#C-L).

Dataset #T #F #D-I #C-L

Australian 690 15 51 2

Breast-C-W 699 10 38 2

Glass 241 10 29 7

Heart 270 14 30 2

Iris 150 5 17 3

Tic-Tac-Toe 958 10 29 2

Wine 178 14 41 3

Vehicle 946 19 36 4

Objective Measures. 19 OMs were considered: Support, Prevalence, K-Measure, Least Contradiction, Confidence, EII1, Leverage, DIR, Certainty Factor, Odds Ratio, Dilated Q2, Added Value, Cosine, Lift, J-Measure, Recall, Specificity, Conditional Entropy and Coverage. This choice was made based on the work of (Tew et al., 2014), as described in (Silva and Carvalho, 2018).

UCI repository: https://archive.ics.uci.edu/ml.
Weka: https://www.cs.waikato.ac.nz/ml/weka/.

Parameters Setting. Minimum support and minimum confidence were set, respectively, to 5% and 50%. The values were empirically defined. In addition, the following limits were considered in the implementation of CBA used here (available in (Coenen, 2004)): (i) maximum number of antecedent items: 6; (ii) maximum number of frequent itemsets to be obtained: 5,000,000; (iii) maximum number of rules to be extracted: 10,000.

Evaluation Criterion. Accuracy was used as the measure of performance. For this, a 10-fold stratified cross-validation was performed 10 times; therefore, the accuracy values presented here are averages over the 10 runs. To ensure a fair comparison, the same training and test sets were used across all configurations: for a given iteration i of each of the 10 repetitions of the 10-fold stratified cross-validation, every configuration was trained and tested on the same sets.
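One way to guarantee that every configuration sees the same partitions is to derive each repetition from a fixed seed, as in the sketch below. How the folds were actually fixed in the experiments is not stated in this text; the seeding scheme, the function names and the toy labels are assumptions made for illustration.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence


def stratified_folds(labels: Sequence[Hashable], k: int, seed: int) -> List[List[int]]:
    """Split indices into k folds, dealing each class round-robin after a seeded shuffle."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def repeated_splits(labels: Sequence[Hashable], k: int = 10,
                    repeats: int = 10, base_seed: int = 0) -> List[List[List[int]]]:
    """One k-fold partition per repetition; the same seeds always give the same folds."""
    return [stratified_folds(labels, k, base_seed + rep) for rep in range(repeats)]


# Hypothetical labels: 60 objects of class "a" and 40 of class "b".
toy_labels = ["a"] * 60 + ["b"] * 40
splits = repeated_splits(toy_labels)

# In repetition `rep`, fold `i` is the test set and the other folds form the
# training set; reusing `splits` for every configuration keeps them aligned.
```

The per-configuration accuracy would then be averaged over the folds of each repetition and over the repetitions.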

Experimental Configuration Overview. Each dataset was executed in 22 different configurations, as follows: (1) CBA, (2) Support, (3) Prevalence, (4) K-Measure, (5) Least Contradiction, (6) Confidence, (7) EII1, (8) Leverage, (9) DIR, (10) Certainty Factor, (11) Odds Ratio, (12) Dilated Q2, (13) Added Value, (14) Cosine, (15) Lift, (16) J-Measure, (17) Recall, (18) Specificity, (19) Conditional Entropy, (20) Coverage, (21) OMs.Aggregate (OMs.A) and (22) Ensemble (OMs.E). Number “1” refers to CBA without any modification; numbers “2” to “20” to CBA modified to sort the rules according to a specific OM (individual OMs (view (A) in Figure 2)); number “21” to CBA modified to sort the rules according to the aggregate approach proposed by (Bouker et al., 2014) (aggregate OMs (view (B) in Figure 2)); number “22” to the ensemble approach presented here (view (C) in Figure 2). These configurations cover the three views presented in Figure 2.

5 RESULTS AND DISCUSSION

To better understand the results, the analysis was divided into different views. The first, shown in Table 3, compares the accuracies obtained by the Ensemble approach (OMs.E) with those of CBA. The best value obtained in each dataset is highlighted. For example, in the Australian dataset, the best accuracy (86.58%) occurred in the Ensemble approach. Note that OMs.E performs better on 5 of the 8 datasets (62.50%), providing a good solution, as each OM in the ensemble evaluates each rule in a different way. The second view, shown in Table 4, compares the accuracies obtained by the Ensemble approach with those of the aggregate approach proposed by (Bouker et al., 2014) (OMs.A). The best value obtained in each dataset is highlighted. It can be seen that OMs.E performs better on 4 datasets (50.0%), tying with OMs.A (50.0%). Also, as mentioned in Section 3, because many OMs are used in the (Bouker et al., 2014) approach, the final sorting is done, in fact, only by the similarity of the rules to the reference rule. In all cases here, only one set was generated ($E_1$), which means that similarity alone yields good results and the search for undominated rules can be dropped, simplifying the process. Future work intends to use these results to propose another aggregation approach.

Table 3: Comparison of results between CBA and Ensemble.

Dataset CBA Ensemble

Australian 86.04 86.58

Breast-C-W 96.07 95.95

Glass 64.85 65.14

Heart 80.67 82.19

Iris 95.67 96.00

Tic-Tac-Toe 100.00 98.59

Wine 98.87 95.11

Vehicle 58.76 59.07

Table 4: Comparison of results between Ensemble and

OMs.A.

Dataset Ensemble OMs.A

Australian 86.58 85.17

Breast-C-W 95.95 93.71

Glass 65.14 66.23

Heart 82.19 83.15

Iris 96.00 97.07

Tic-Tac-Toe 98.59 90.40

Wine 95.11 98.87

Vehicle 59.07 58.26

Evaluating the results in a general view, the accuracies obtained by CBA, OM (view (A) in Figure 2), OMs.A (view (B) in Figure 2) and Ensemble (view (C) in Figure 2) are presented in Table 5. The OM column presents the best accuracy obtained among the 19 OMs used in the experiments (the name(s) of the OM(s) behind each value is(are) given in the table caption). The best value obtained in each dataset is highlighted. It can be noted that:

• CBA performs best on 2 datasets (25%, counting the tie), OM on 2 (25%), OMs.A on 4 (50%, counting the tie) and OMs.E on 1 (12.50%). As seen, solutions that use multiple OMs perform better than individual OMs, winning in 5 of the 8 datasets (62.50%).

Table 5: Comparison of results among CBA, OM, OMs.A and Ensemble. *Best OMs accuracies: Australian: Added Value; Breast-C-W: J-Measure; Glass: J-Measure, Odds Ratio; Heart: J-Measure; Iris: Dilated Q2; Tic-Tac-Toe: Odds Ratio, Confidence; Wine: J-Measure; Vehicle: K-Measure.

Dataset CBA OM* OMs.A Ensemble

Australian 86.04 86.29 85.17 86.58

Breast-C-W 96.07 96.32 93.71 95.95

Glass 64.85 66.08 66.23 65.14

Heart 80.67 82.93 83.15 82.19

Iris 95.67 96.07 97.07 96.00

Tic-Tac-Toe 100.00 99.99 90.40 98.59

Wine 98.87* 96.94 98.87* 95.11

Vehicle 58.76 61.89 58.26 59.07

• Looking at the cases where the Ensemble approach does not have the best value but beats OMs.A in Table 4, it is interesting to note that in two of them (Breast-C-W and Vehicle) it loses to the best OM, which is a good result, as it performs close to the best OM. This means that Ensemble is incorporating the effect of that OM in the result. This is why their values are underlined. In the remaining case (Tic-Tac-Toe), although CBA presents the best performance, the same pattern occurs, i.e., Ensemble performs close to the best OM (underlined value). Considering this, Ensemble wins 4 times (50.0%) and OMs.A 4 (50.0%), as already shown in Table 4. In addition, Table 6 complements Table 5 by presenting the differences between OMs.A and Ensemble with respect to the best accuracy observed in each dataset. The highlighted values correspond to the smallest difference between the best accuracy and the respective setting. As seen, Ensemble wins 4 times (50.0%) and OMs.A 4 (50.0%), with differences in Ensemble smaller than in OMs.A (an average of 1.44 vs. 2.16).

• Only 6 OMs appeared in Table 5, which means that 13 of 19 did not show good values in any of the datasets. These OMs may be negatively influencing the accuracy of solutions that use multiple OMs. Therefore, an in-depth study of the most appropriate OMs to be aggregated into the presented approaches (aggregate, ensemble) should be undertaken.

Table 6: Differences between OMs.A and Ensemble regarding the best accuracy observed in each dataset.

Dataset OMs.A Ensemble
Australian 1.41 0.00
Breast-C-W 2.61 0.37
Glass 0.00 1.09
Heart 0.00 0.96
Iris 0.00 1.07
Tic-Tac-Toe 9.60 1.41
Wine 0.00 3.76
Vehicle 3.63 2.82
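The differences reported in Table 6 follow directly from Table 5: for each dataset, the accuracy of a configuration is subtracted from the best accuracy in the row. The sketch below reproduces this arithmetic with the values of Table 5; the dictionary layout and function name are assumptions made for the example.

```python
from typing import Dict

# Accuracies copied from Table 5.
table5: Dict[str, Dict[str, float]] = {
    "Australian":  {"CBA": 86.04, "OM": 86.29, "OMs.A": 85.17, "Ensemble": 86.58},
    "Breast-C-W":  {"CBA": 96.07, "OM": 96.32, "OMs.A": 93.71, "Ensemble": 95.95},
    "Glass":       {"CBA": 64.85, "OM": 66.08, "OMs.A": 66.23, "Ensemble": 65.14},
    "Heart":       {"CBA": 80.67, "OM": 82.93, "OMs.A": 83.15, "Ensemble": 82.19},
    "Iris":        {"CBA": 95.67, "OM": 96.07, "OMs.A": 97.07, "Ensemble": 96.00},
    "Tic-Tac-Toe": {"CBA": 100.00, "OM": 99.99, "OMs.A": 90.40, "Ensemble": 98.59},
    "Wine":        {"CBA": 98.87, "OM": 96.94, "OMs.A": 98.87, "Ensemble": 95.11},
    "Vehicle":     {"CBA": 58.76, "OM": 61.89, "OMs.A": 58.26, "Ensemble": 59.07},
}


def gap_to_best(row: Dict[str, float], configuration: str) -> float:
    """Difference between the best accuracy in the row and the given configuration."""
    return round(max(row.values()) - row[configuration], 2)


gaps = {
    dataset: (gap_to_best(row, "OMs.A"), gap_to_best(row, "Ensemble"))
    for dataset, row in table5.items()
}

# Average gap of each approach over the 8 datasets (left unrounded here).
mean_gap_aggregate = sum(a for a, _ in gaps.values()) / len(gaps)
mean_gap_ensemble = sum(e for _, e in gaps.values()) / len(gaps)

# Number of datasets on which each approach is closer to the best accuracy.
ensemble_closer = sum(e < a for a, e in gaps.values())
aggregate_closer = sum(a < e for a, e in gaps.values())
```

The averages agree, up to rounding, with the 2.16 (OMs.A) and 1.44 (Ensemble) reported in the text, and each approach is closer to the best accuracy on 4 datasets.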

6 CONCLUSIONS

This work presented an ensemble solution that considers many OMs at the same time in the sorting step of CBA to improve its accuracy. According to the obtained results, solutions that use multiple OMs perform better than individual OMs. The Ensemble approach was found to be a good solution, especially when the best accuracy is related to the best OM. However, the Ensemble approach achieved the same performance as OMs.A, each winning 50% of the time. Thus, both presented themselves as good solutions, each one better in one specific situation, even though OMs.A has the bias previously explained.

In order to improve the results as well as the analyses, future work should: (i) consider more datasets; (ii) explore other aggregation approaches beyond the one considered here (Bouker et al., 2014); (iii) find ways to automatically select the most appropriate OMs to be aggregated in the presented approaches (aggregate, ensemble).

ACKNOWLEDGEMENTS

We wish to thank Fapesp, process number

2019/04923-2, for the ﬁnancial aid.

REFERENCES

Abdelhamid, N., Jabbar, A. A., and Thabtah, F. (2016). As-

sociative classiﬁcation common research challenges.

In 45th International Conference on Parallel Process-

ing Workshops, pages 432–437.

Abdelhamid, N. and Thabtah, F. A. (2014). Associative classification approaches: Review and comparison. Journal of Information & Knowledge Management, 13(3).

Abdellatif, S., Ben Hassine, M. A., and Ben Yahia, S.

(2019). Novel interestingness measures for mining

signiﬁcant association rules from imbalanced data. In

Web, Artiﬁcial Intelligence and Network Applications,

pages 172–182.

Abdellatif, S., Ben Hassine, M. A., Ben Yahia, S., and

Bouzeghoub, A. (2018a). ARCID: A new approach

to deal with imbalanced datasets classiﬁcation. In

SOFSEM: Theory and Practice of Computer Science,

pages 569–580.

Abdellatif, S., Yahia, S. B., Hassine, M. A. B., and

Bouzeghoub, A. (2018b). Fuzzy aggregation for rule

selection in imbalanced datasets classiﬁcation using

choquet integral. In IEEE International Conference

on Fuzzy Systems, page 7p.

Alwidian, J., Hammo, B. H., and Obeid, N. (2018). WCBA:

Weighted classiﬁcation based on association rules al-

gorithm for breast cancer disease. Applied Soft Com-

puting, 62:536–549.

Azevedo, P. J. and Jorge, A. M. (2007). Comparing rule

measures for predictive association rules. In Machine

Learning: ECML, pages 510–517.

Bong, K. K., Joest, M., Quix, C., Anwar, T., and Manickam,

S. (2014). Selection and aggregation of interestingness

measures: A review. Journal of Theoretical and Ap-

plied Information Technology, 59(1):146–166.

Bouker, S., Saidi, R., Yahia, S. B., and Nguifo, E. M.

(2014). Mining undominated association rules

through interestingness measures. International Jour-

nal on Artiﬁcial Intelligence Tools, 23(4):22p.

Coenen, F. (2004). LUCS KDD implementation of CBA. http://cgi.csc.liv.ac.uk/~frans/KDD/Software/CBA/cba.html.

Dahbi, A., Jabri, S., Balouki, Y., and Gadi, T. (2016). A

new method for ranking association rules with multi-

ple criteria based on dominance relation. In ACS/IEEE

13th International Conference of Computer Systems

and Applications, page 7p.

Fayyad, U. M. and Irani, K. B. (1993). Multi-interval dis-

cretization of continuous-valued attributes for classiﬁ-

cation learning. In International Joint Conference on

Artiﬁcial Intelligence, pages 1022–1029.

Jalali-Heravi, M. and Zaïane, O. R. (2010). A study on

interestingness measures for associative classiﬁers. In

Proceedings of the 2010 ACM Symposium on Applied

Computing, pages 1039–1046.

Kannan, S. (2010). An Integration of Association Rules and

Classiﬁcation: An Empirical Analysis. PhD thesis,

Madurai Kamaraj University.

Liu, B., Hsu, W., and Ma, Y. (1998). Integrating classiﬁ-

cation and association rule mining. In Proceedings of

the 4th International Conference on Knowledge Dis-

covery and Data Mining, pages 80–86.

Moreno, M. N., Segrera, S., López, V. F., Muñoz, M. D., and Sánchez, A. L. (2016). Web mining based framework for solving usual problems in recommender systems. A case study for movies’ recommendation. Neurocomputing, 176:72–80.

Nandhini, M., Sivanandam, S. N., Rajalakshmi, M., and

Sidheswaran, D. (2015). Enhancing the spam email

classiﬁcation accuracy using post processing tech-

niques. International Journal of Applied Engineering

Research, 10(15):35125–35130.

Nguyen Le, T. T., Huynh, H. X., and Guillet, F. (2009).

Finding the most interesting association rules by

aggregating objective interestingness measures. In

Knowledge Acquisition: Approaches, Algorithms and

Applications, pages 40–49.

Shao, Y., Liu, B., Li, G., and Wang, S. (2017). Software

defect prediction based on class-association rules. In

2nd International Conference on Reliability Systems

Engineering, page 5p.

Silva, M. F. and Carvalho, V. O. (2018). Aggregating inter-

estingness measures in associative classiﬁers. In Anais

do XV Encontro Nacional de Inteligência Artificial e

Computacional, pages 70–81.

Singh, J., Kamra, A., and Singh, H. (2016). Prediction of

heart diseases using associative classiﬁcation. In 5th

International Conference on Wireless Networks and

Embedded Systems, page 7p.

Tew, C., Giraud-Carrier, C., Tanner, K., and Burton, S.

(2014). Behavior-based clustering and analysis of

interestingness measures for association rule mining.

Data Mining and Knowledge Discovery, 28(4):1004–

1045.

Thabtah, F. (2007). A review of associative classiﬁcation

mining. Knowledge Engineering Review, 22(1):37–

65.

Yang, G. and Cui, X. (2015). A study of interestingness

measures for associative classiﬁcation on imbalanced

data. In Trends and Applications in Knowledge Dis-

covery and Data Mining, pages 141–151.

Yang, G., Shimada, K., Mabu, S., and Hirasawa, K. (2009).

A nonlinear model to rank association rules based on

semantic similarity and genetic network programing.

IEEJ Transactions on Electrical and Electronic Engi-

neering, 4(2):248–256.

Yin, C., Guo, Y., Yang, J., and Ren, X. (2018). A new

recommendation system on the basis of consumer ini-

tiative decision based on an associative classiﬁcation

approach. Industrial Management and Data Systems,

118(1):188–203.

ICEIS 2020 - 22nd International Conference on Enterprise Information Systems