Function-based Case Classification for Improving Business Process Mining

Yaguang Sun and Bernhard Bauer

Programming Distributed Systems Lab, University of Augsburg, Augsburg, Germany

Keywords: Business Process Mining, Multi-label Case Classification, Sequential Pattern Mining, Business Process Extension.

Abstract:

In recent years business process mining has become a broad research area. However, existing process mining techniques encounter challenges when dealing with event logs stemming from highly flexible environments, because such logs contain a large variety of behaviors. As a result, inaccurate or misleading analysis results might be obtained. In this paper we propose a case classification technique (a case is an instance of the business process) that incorporates domain experts' knowledge when classifying cases, so that each resulting group contains cases with similar behaviors. By applying existing process mining techniques to the cases of each group, more meaningful and accurate analysis results can be obtained.

1 INTRODUCTION

Business process mining techniques aim at analysing various aspects of enterprise business processes, such as workflow discovery, process performance analysis and organizational structure analysis. The starting point of these analyses is usually an event log, which is a set of cases, where each case is an instance of the business process (van der Aalst, 2011). Every case has an attribute trace, which is a finite sequence of ordered events. In an event log, events and cases are identified by unique identifiers (event ID and case ID). Besides the trace, a case may have other attributes such as originator and cost. An event can also have its own attributes. Table 1 shows an example event log.
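As an illustration only, the event log of Table 1 could be held in memory as follows. This is a minimal sketch: the class name `Event`, the dictionary `event_log` and the helper `trace_of` are hypothetical names chosen for this example, and a trace is taken to be the ordered tuple of activity names of a case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One row of Table 1 (field names are illustrative assumptions)."""
    event_id: int
    activity: str
    resource: str
    cost: int


# Case id -> ordered list of events, copied from the two cases shown in Table 1.
event_log = {
    1: [Event(421, "A", "Pete", 40), Event(422, "B", "Sun", 200),
        Event(423, "C", "Simon", 300), Event(424, "D", "Chris", 100),
        Event(425, "E", "Pete", 200)],
    2: [Event(452, "A", "Mike", 30), Event(453, "C", "Simon", 300),
        Event(454, "F", "Chris", 200), Event(455, "D", "Sun", 100),
        Event(456, "G", "Mike", 500)],
}


def trace_of(case_id):
    """Return the trace of a case as the ordered tuple of its activities."""
    return tuple(event.activity for event in event_log[case_id])
```

Here `trace_of(1)` yields the sequence of activities A, B, C, D, E, which is the trace the paper later writes as < A,B,C,D,E >.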

In the real world business processes are often ex-

ecuted in highly ﬂexible environments, e.g., health-

care, customer relationship management (CRM) and

product development (Weerdt et al., 2013). The event

logs that stem from such ﬂexible environments (real-

life event logs) often contain a high variety of case be-

haviors. The behaviors of a certain case are reﬂected

by the values of case attributes. For instance, the be-

havior of a case related to the case attribute trace is

called structural case behavior which is expressed by

the events and their precedence relations in the trace.

Cases are generated so as to adhere to a certain category of behaviors determined by a business process-based domain criterion (Weerdt et al., 2013). Process mining techniques face difficulties when dealing with event logs that contain multiple process-based criteria. For instance, existing workflow discovery algorithms may generate inaccurate and complex business process models from such logs, because multiple criteria may represent many structural behaviors of cases from structurally different process executions. Furthermore, important analysis results related to a certain process criterion might be concealed in the final results generated from logs containing multiple process criteria. There is therefore a need to classify cases with different behaviors into different groups.

Table 1: An example event log (Activity, Resource and Cost are event properties).

Case id   Event id   Activity   Resource   Cost
1         421        A          Pete       40
1         422        B          Sun        200
1         423        C          Simon      300
1         424        D          Chris      100
1         425        E          Pete       200
2         452        A          Mike       30
2         453        C          Simon      300
2         454        F          Chris      200
2         455        D          Sun        100
2         456        G          Mike       500
...       ...        ...        ...        ...

Sun Y. and Bauer B.
Function-based Case Classification for Improving Business Process Mining.
DOI: 10.5220/0005349202510258
In Proceedings of the 17th International Conference on Enterprise Information Systems (ICEIS-2015), pages 251-258
ISBN: 978-989-758-096-3
© 2015 SCITEPRESS (Science and Technology Publications, Lda.)

Figure 1: Model of case classification in the scenario of process mining.

Accordingly, some approaches have been developed for this purpose. One efficient technique is trace clustering (Greco et al., 2006; Song et al., 2009; Bose and van der Aalst, 2010; Bose and van der Aalst, 2009), which is designed mainly to help discover better workflows. It automatically captures the behaviors of cases in an event log and then groups the cases with similar behaviors into the same sub-log. By applying process mining techniques to each simpler sub-log, more accurate and understandable analysis results can be obtained.

Trace clustering can help find many hidden behaviors among the cases. However, it is an unsupervised learning technique and lacks domain knowledge. As a result, it cannot indicate which of the behaviors found are crucial for classifying the cases, or which behaviors customers want to use for classifying them. Additionally, treating all of the behaviors found equally may not yield a correct or meaningful classification of cases.

Classiﬁcation (supervised learning) which is able

to combine the domain knowledge from business ex-

perts can be a useful tool for classifying cases. In this

paper we put forward a case classiﬁcation technique:

• We demonstrate and formalise the problem of

multi-label case classiﬁcation in Section 3.1.

• One important sub-problem of the case classiﬁca-

tion problem is how to exploit the case attribute

trace for classifying the cases. To solve this sub-

problem we develop a systematic method based

on sequential pattern mining technique for utilis-

ing the case attribute trace for classifying cases in

Section 3.2 and Section 3.3.

• To test the effectiveness of our method, we conduct a case study in Section 4 using a real-life event log of a Dutch academic hospital from the Business Process Intelligence Contest 2011 (BPIC 2011).

2 PROBLEM DESCRIPTION

Figure 1 depicts a detailed model for case classification in the scenario of business process mining. Basically, the case classifier links the primitive cases in an event log with the labels. Each label represents one category, and cases connected to the same label may share some common behaviors. By working with the already classified cases, process mining techniques are able to analyse the enterprise business process from different points of view and generate more readable and meaningful analysis results.

Traditional data classiﬁcation approaches proceed

in two steps (Kotsiantis, 2007):

– In the first step, training data is analysed by a classification algorithm so that a function $y = f(X)$ can be learned. The generated function (classifier) is able to predict the associated category label $y$ of a given tuple $X = (x_1, x_2, \ldots, x_n)$, where $X$ represents the set of attributes of a specific item.

– In the second step, predictive accuracy of the

classiﬁer built is estimated by utilising a test set

made up of tuples and their relevant category la-

bels.

A training event log must first be generated for a case classification problem. Labels are added manually to part of the cases in an event log by domain experts according to their domain knowledge and the behaviors of the cases. These labeled cases are then extracted to generate the training event log and the test event log. The training event log is used for building the classifier and the test event log for estimating its performance. By looking into some real-life event logs we found that a case may have more than one label, which makes this case classification problem a multi-label classification problem (as shown in Figure 1).

The behaviors of cases are reflected by the values of case attributes recorded in the event log. Most case attributes that have discrete or numeric values can easily be utilised for building the classifier or for judging which labels a case belongs to. The trace of a case, however, cannot be used directly. A trace is an important attribute of a case and is a finite sequence of ordered events; for instance, in Table 1 the trace of case 1 is $\langle A, B, C, D, E \rangle$. The trace may be a major element for deciding which labels a case pertains to (when the labels are related to the structural features of the trace).

For cases generated by structured business processes, it is easy to transform the traces into a suitable form that can be utilised as an attribute for case classification, because the compositions of traces are limited by a structured process model. For cases from a real-life event log, however, such a transformation should not be carried out directly, because cases with similar features may seem very different from each other. Additionally, there might be multiple structural features of traces (presence or absence of an activity, presence or absence of combinations of activities, and so on) related to one label. How to capture the possible structural behaviors of traces (expressed by the events and their precedence relations in the traces) that are relevant to labels is therefore an important sub-problem of the multi-label case classification problem.

To solve the multi-label case classification problem and the sub-problem mentioned above, we propose in this paper a method based on sequential pattern mining (Section 3) that mines all possible label-related structural features of traces and then transforms the features found into a suitable form of case attributes to support the later case classification.

3 BASIC CONCEPTS AND APPROACH DESIGN

In this section we first elaborate the concept of multi-label case classification (Section 3.1). Afterwards the structural feature of traces is presented in a formal way (Section 3.2). Finally we provide a method for transforming the structural features found into a suitable form of case attributes for the later case classification (Section 3.3).

Footnote (on structured business processes, Section 2): A structured business process is a rigidly defined process with a model which considers all of the process instance permutations, and every process instance complies with this model. If a structured business process has a loop structure, it can also generate a large number of isomerous traces.

3.1 Multi-label Case Classiﬁcation

A multi-label classification technique solves the problem of predicting to which set of classes (also represented by labels) a new instance belongs by exploiting a training set of data. The training data is a set $T = \{t_1, t_2, \ldots, t_n\}$ of already classified samples, where each sample $t_i$ is constructed by a $k$-dimensional vector $X_i = (x_{i1}, x_{i2}, \ldots, x_{ik})$. The dimensions in $X_i$ represent attributes of the sample, as well as the categories to which $t_i$ pertains.

The existing multi-label classiﬁcation methods are

mainly divided into two types, one is algorithm in-

dependent and the other one is algorithm dependent

(Tsoumakas and Katakis, 2007; Carvalho and Freitas,

2009). In this paper we will utilise the algorithm in-

dependent approach for solving the problem of multi-

label case classiﬁcation. In the algorithm independent

approach, a multi-label classiﬁcation problem can be

converted into several single-label problems. For each

label (or category) a classiﬁer is built so that the clas-

siﬁcation problem related to this label can be dealt

with. For the multi-label case classiﬁcation problem

in this paper, the training data is conveyed by the fol-

lowing deﬁnitions:

Definition 1. Let $C_t$ be the set of training cases. A case $c \in C_t$ is defined as a tuple $c = (N_c, L_c, \Theta_c)$, where $N_c = \{n_1, n_2, \ldots, n_k\}$ is the set of names of case attributes, $L_c$ is a set of labels, and $\Theta_c : N_c \rightarrow A_c$ is an attribute-transition function which maps the name of an attribute into the value of this attribute, where $A_c$ is the set of attribute values for case $c$. A label $l \in L_c$ represents a manually given class to which case $c$ belongs.

As already mentioned, a case in an event log may be assigned multiple labels in the real world; thus for all $c \in C_t$ we have $|L_c| \geq 1$, where $|L_c|$ is the number of labels.

Definition 2. A training event log is defined as $E_t \subseteq C_t$ such that $c_i \neq c_j$ for any two cases $c_i, c_j \in E_t$ with $i \neq j$.

Let's presume that the example event log in Table 1 is a training event log, that all cases in this log have an attribute originator and an attribute labels, and that case 1 has the originator "Mike" and the set of assigned labels $\{l_1\}$. According to the concepts defined above, $\Theta_{c_1}(originator)$ = "Mike" is the originator of case 1, $\Theta_{c_1}(trace) = \langle A, B, C, D, E \rangle$ is the trace of case 1, and $L_{c_1} = \{l_1\}$ is the set of labels of case 1.
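The tuple of Definition 1 can be mirrored by a small data structure. The following is a sketch under stated assumptions: the class name `Case`, the method `theta` and the label string `"l1"` are hypothetical, and the attribute-transition function is modelled simply as a dictionary lookup whose keys play the role of the attribute names.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """A training case: a set of labels plus a mapping from attribute names to values."""
    labels: set        # the manually assigned labels of the case
    attributes: dict   # attribute name -> attribute value

    def theta(self, name):
        """Attribute-transition function: map an attribute name to its value."""
        return self.attributes[name]

    @property
    def names(self):
        """The set of attribute names of this case."""
        return set(self.attributes)


# Case 1 of the running example (the label name "l1" is an assumption).
case1 = Case(
    labels={"l1"},
    attributes={"originator": "Mike", "trace": ("A", "B", "C", "D", "E")},
)
```

With this representation `case1.theta("originator")` returns "Mike", and the requirement that every training case carries at least one label corresponds to `len(case1.labels) >= 1`.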

Function-basedCaseClassificationforImprovingBusinessProcessMining

253

Definition 3. The multi-label case classification problem is defined as $Prob = (SE_t, SE_{test}, \Phi, \Xi)$, where $SE_t$ is the set of training event logs, $SE_{test}$ represents the set of test event logs for evaluating the accuracy of the learned classifiers, $\Phi$ is a multi-label classification algorithm, and $\Xi$ is a classifier evaluation schema. $\Phi : SE_t \rightarrow \Psi$ represents the process of building a classifier (or a set of classifiers), where $\Psi$ is the set of classifiers. $\Xi : SE_{test}, \Psi \rightarrow \{Accurate, Inaccurate\}$ represents the process of evaluating the performance of a built classifier.

3.2 Deﬁnitions Relevant to Function

Sequential pattern mining techniques solve the problem of finding all frequent subsequences in a given set of sequences, where each sequence contains a list of ordered events and each event consists of a set of items (Han and Kamber, 2000). A minimum support threshold is given manually for judging whether the occurrence of a subsequence is frequent or not.

Let $I = \{I_1, I_2, \ldots, I_n\}$ be the set of all items and $S = \{S_1, S_2, \ldots, S_m\}$ be the set of all sequences. A sequence $S_i \in S$ is an ordered list of events, denoted $\langle e_{i1}, e_{i2}, \ldots, e_{ij} \rangle$, where each event $e_{ik}$ represents an item set composed of items of $I$. For any two sequences $\alpha = \langle a_1, a_2, \ldots, a_r \rangle$ and $\beta = \langle b_1, b_2, \ldots, b_q \rangle$ over $S$, $\alpha$ is a subsequence of $\beta$ if there exist integers $1 \leq p_1 < p_2 < \cdots < p_r \leq q$ such that $a_1 = b_{p_1}, a_2 = b_{p_2}, \ldots, a_r = b_{p_r}$. Let $D \subseteq S$ be a database of sequences. For a given minimum support threshold $min\_sup$ ($0 < min\_sup < 1$), a sequence $\lambda$ is called a sequential pattern if $support(\lambda) \geq min\_sup \times |D|$, where $support(\lambda)$ is the number of sequences in $D$ which contain $\lambda$ and $|D|$ represents the total number of sequences in $D$.
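The subsequence relation and the support condition can be made concrete with a short sketch. It assumes, as a simplification, that every event is a single activity name (as in the traces of Table 1) rather than a general item set; the function names are chosen for this example only.

```python
def is_subsequence(alpha, beta):
    """True if alpha occurs in beta with its order preserved (gaps are allowed)."""
    remaining = iter(beta)
    # `a in remaining` consumes the iterator, so each match lies after the previous one.
    return all(a in remaining for a in alpha)


def support(pattern, database):
    """Number of sequences in the database that contain the pattern."""
    return sum(1 for sequence in database if is_subsequence(pattern, sequence))


def is_sequential_pattern(pattern, database, min_sup):
    """Check the condition support(pattern) >= min_sup * |D|."""
    return support(pattern, database) >= min_sup * len(database)


# The two traces of Table 1 used as a tiny sequence database D.
D = [("A", "B", "C", "D", "E"), ("A", "C", "F", "D", "G")]
```

For this database the sequence (A, C, D) is contained in both traces, whereas (B) is contained only in the first, so (B) satisfies the condition for a threshold of 0.5 but not for 0.6.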

As introduced in Section 3.1, the trace of a case is

a sequence of ordered events. Thus the set of traces

collected from an event log can be deemed as a se-

quence database on which the sequential pattern min-

ing algorithms can be implemented. We deﬁne a se-

quential pattern mined from a set of traces extracted

from an event log as a function:

Definition 4. Let $E$ be an event log, $Y \subseteq E$ be a set of traces collected from $E$, and $\Gamma : Y \xrightarrow{min\_sup} F$ be a sequential pattern mining algorithm, where $F = \{f_1, f_2, \ldots, f_n\}$ is the set of sequential patterns mined with a minimum support threshold $min\_sup$. A sequential pattern $f_i \in F$ is defined as a function relevant to $Y$, and $F$ is also called a set of functions.

Definition 5. Let $E_t$ be a training event log and $Y_{label} \subseteq E_t$ be a set of traces from $E_t$, where each trace in $Y_{label}$ is related to one common label. A label-related function set $F_{label}$ is a set of functions mined from $Y_{label}$, and a function $f_i \in F_{label}$ is called a label-related function.

Figure 2: Mining label-related functions from a training event log.

In our view, the label-related functions in $F_{label}$ reveal the common and frequently occurring structures of the traces in $Y_{label}$. As mentioned in Section 2, a label may be associated with the structural characteristics of traces. When this is the case, the label-related functions can be exploited to judge whether a trace belongs to a specific label.

Figure 2 illustrates the process of mining all of the

possible label-related functions from a training event

log. In the ﬁrst step, all the traces included by a train-

ing event log are separated into different sets where

each set is associated with one label. For example,

in Figure 2 the traces of cases with label 1 are col-

lected and sent to set 1. In the second step, a sequen-

tial pattern mining procedure is executed on each set

of traces for discovering label-related functions. Fi-

nally all of the found functions are grouped together

in one set (label-related function set).

Let's presume that the event log shown in Table 1 is a training event log and each case in this log has a set of labels from $\{l_1\}$. For this training event log, a set of traces $Y_{l_1}$ is obtained by extracting all traces with label $l_1$. Then, by mining $Y_{l_1}$ with a sequential pattern mining method, a label-related function set $F_{l_1}$ can be extracted. This label-related function mining procedure is described in Algorithm 1.

3.3 Transforming Label-related Functions into Case Attributes

The raw label-related functions found cannot be exploited directly; they should be transformed into a suitable form of case attributes so that they can be checked by a classification algorithm. In the following we propose a method for this purpose.

Algorithm 1: Mining label-related functions for a given training event log

Input: a training event log $E_t$, the minimum support threshold $min\_sup$
Let $G(label)$ be a set of traces related to a specific label in $E_t$
Let $S$ be a set of all possible label-related functions for $E_t$
Let $LB$ be the set of labels in $E_t$
Let $\Theta$ be a case attribute-transition function as described in Definition 1
Let $\Gamma(Y, m\_s) \rightarrow F$ be a sequential pattern mining algorithm, where $Y$ is a sequence database, $m\_s$ is a minimum support threshold and $F$ is the mined set of sequential patterns

$S \leftarrow \emptyset$   // S is set to be an empty set
for each $G(label_i)$ such that $label_i \in LB$ do
    $G(label_i) \leftarrow \emptyset$
end for
for each case $c_i \in E_t$ do
    for each label $l_j \in L_{c_i}$ do
        $G(l_j) = G(l_j) \cup \{\Theta_{c_i}(trace)\}$   // $L_{c_i}$ is the set of labels owned by $c_i$
    end for
end for
for each label $lb \in LB$ do
    $S = S \cup \Gamma(G(lb), min\_sup)$
end for

Output: the set of all label-related functions in $E_t$: $S$
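The steps of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: `naive_miner` is a hypothetical brute-force stand-in for the sequential pattern mining algorithm (it is not BIDE and does not mine closed patterns), its `max_len` bound is an assumption added to keep the enumeration small, and each group of traces is kept as a list rather than a set so that identical traces from different cases each count toward the support.

```python
from collections import defaultdict
from itertools import combinations


def is_subsequence(pattern, trace):
    """True if pattern occurs in trace with its order preserved."""
    remaining = iter(trace)
    return all(a in remaining for a in pattern)


def naive_miner(traces, min_sup, max_len=3):
    """Brute-force stand-in for the miner: frequent subsequences up to max_len."""
    candidates = set()
    for trace in traces:
        for k in range(1, min(max_len, len(trace)) + 1):
            for positions in combinations(range(len(trace)), k):
                candidates.add(tuple(trace[i] for i in positions))
    threshold = min_sup * len(traces)
    return {p for p in candidates
            if sum(is_subsequence(p, t) for t in traces) >= threshold}


def mine_label_related_functions(training_log, min_sup, miner=naive_miner):
    """training_log: iterable of (trace, labels) pairs; returns the set of functions."""
    groups = defaultdict(list)          # one group of traces per label
    for trace, labels in training_log:
        for label in labels:            # a multi-label case joins several groups
            groups[label].append(tuple(trace))
    functions = set()                   # union of the patterns mined per label
    for traces in groups.values():
        functions |= miner(traces, min_sup)
    return functions
```

Passing the miner as a parameter keeps the grouping logic independent of the pattern mining algorithm, so a closed sequential pattern miner could be substituted without changing the rest of the sketch.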

Through Algorithm 1, the traces in a training event log are grouped into different sets, where each set is related to one label, and the functions for each set are discovered. Before building a classifier for each label, the functions found need to be converted into usable case attributes. To explain our method clearly, we first set up a few variables: $E_t$ is a training event log; $G = \{G(label_1), G(label_2), \ldots, G(label_m)\}$ is a set where each $G(label_i)$ is the set of all the traces relevant to $label_i$; $F = \{F(label_1), F(label_2), \ldots, F(label_m)\}$ is a set where each $F(label_i)$ is the function set for $G(label_i)$ found by Algorithm 1; and $F^* = \{function_1, function_2, \ldots, function_n\}$ is a set which contains all of the functions in $F$. An association table $AT$ is then established which connects each function in $F^*$ with a globally unique identifier. In the association table shown in Table 2, for instance, $function_1$ has the id $A_1$.

Table 2: An example association table for the functions in $F^*$.

Function ID   Function
$A_1$         $function_1$
...           ...
$A_n$         $function_n$

Let $c_i \in E_t$ be a case from the training event log $E_t$. All of the function ids in the association table are added as attribute names to the attribute list of $c_i$, and their initial values are set to 0. The next step is to calculate the value of each newly added attribute. Take $A_1$ from Table 2 as an example: a subsequence-detection process is carried out with $\Theta_{c_i}(trace)$ (the trace of $c_i$) and $function_1$ (matched with $A_1$ in Table 2) as inputs. If $function_1$ is judged to be a subsequence of $\Theta_{c_i}(trace)$, then $\Theta_{c_i}(A_1)$ is reset to 1. The same procedure should also be applied to the test event logs and to the normal event logs that contain the cases to be classified.
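The transformation described in this section can be sketched as follows. The sketch assumes that a case is represented by a dictionary of attributes containing a "trace" entry, that ids have the form FPattern_k used in Section 4, and that functions are numbered in sorted order; the ordering and the function names are assumptions made for this example only.

```python
def is_subsequence(pattern, trace):
    """True if pattern occurs in trace with its order preserved."""
    remaining = iter(trace)
    return all(a in remaining for a in pattern)


def build_association_table(functions):
    """Connect every function with a unique id FPattern_1, FPattern_2, ..."""
    return {f"FPattern_{k}": function
            for k, function in enumerate(sorted(functions), start=1)}


def add_function_attributes(attributes, association_table):
    """Return a copy of the case attributes with one 0/1 attribute per function id."""
    extended = dict(attributes)
    for function_id, function in association_table.items():
        extended[function_id] = 0                      # initial value
        if is_subsequence(function, attributes["trace"]):
            extended[function_id] = 1                  # function found in the trace
    return extended
```

Applied to the two traces of Table 1 with the three hypothetical functions (A, C), (B) and (F, D), the first case receives the values 1, 1, 0 and the second case the values 1, 0, 1 for FPattern_1 to FPattern_3.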

4 CASE STUDY

We tested the effectiveness of our technique on the

hospital event log from the BPIC 2011. This log con-

tains 624 activities and 1143 cases where each case

stands for a treatment process of a patient of the gy-

naecology department. A lot of attributes relevant to

the cases have been recorded in this log, such as the

ages of patients and the ﬁnal diagnosis for patients.

In the experiment the treatments performed on a patient are regarded as labels (categories). One reason is that in the healthcare industry treatment is often used as a label for classifying cases; for instance, the SAP Business Suite for Patient Management uses the treatment as one category for case classification (SAP Community, 2014). There are 48 kinds of treatment overall (coded by number) in this log, from which we have chosen seven frequently occurring treatments (namely 13, 23, 61, 101, 113, 603, 3101) for analysis. Each case may belong to more than one treatment and each treatment may be characterized by multiple behaviors of traces.

For generating the training event log, a simple strategy mentioned in (Tsoumakas and Katakis, 2007) is used, which discards every multi-label case in the hospital event log. That is, all the cases that have only a single treatment belonging to the treatment set $TS = \{13, 23, 61, 101, 113, 603, 3101\}$ are extracted and form the training event log. All of the multi-label cases are organized as the test event log. As a result, we obtain a training event log with 279 cases and a test event log with 621 cases.

Table 3: Performances of the classifiers built for each treatment in TS.

Treatment   Correctly classified instances ratio   AUC     Kappa statistic   Recall
13          0.896940                               0.935   0.6662            0.988/0.882
23          0.900161                               0.850   0.6060            0.782/0.917
61          0.901771                               0.968   0.7241            0.729/0.959
101         0.597424                               0.679   0.2935            0.481/1.000
113         0.887279                               0.740   0.0590            0.068/0.973
603         0.855072                               0.758   0.3304            0.638/0.873
3101        0.853462                               0.873   0.5335            0.900/0.847

In Algorithm 1 the closed sequential pattern mining algorithm BIDE (Wang and Han, 2004) is used. By running Algorithm 1 with the training event log and $min\_sup = 0.3$ as input parameters, a label-related function set $S$ containing 3159 functions was discovered. All of the functions found are then transformed into case attributes for both the training event log and the test event log through the method introduced in Section 3.3. In the generated association table $AT$, the id of a function has the form FPattern_k, where $k$ represents the position of this function in $AT$. For example, $A_1$ = FPattern_1 in Table 2 because $function_1$ is the first item, and $A_n$ = FPattern_n because $function_n$ is the $n$-th item in this association table.

In this paper we utilise an algorithm independent

approach for solving the problem of multi-label case

classiﬁcation which converts the learning problem

into traditional single-label classiﬁcation. For each

element in TS a classiﬁer is learned by using a train-

ing event log. For example, for treatment 13, a clas-

siﬁer is built which is able to judge if a case falls in

treatment 13 or not. In our experiment seven binary

classiﬁers (because seven kinds of treatment are anal-

ysed) are established.
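The conversion of the multi-label problem into one binary problem per treatment can be sketched as follows. The function names and the three example label sets are hypothetical; only the treatment codes are taken from the text.

```python
def binary_relevance_targets(label_sets, label_space):
    """For every label, build a 0/1 target vector over the cases."""
    return {label: [1 if label in labels else 0 for labels in label_sets]
            for label in label_space}


def combine_predictions(binary_predictions):
    """Merge the per-label 0/1 predictions into one predicted label set per case."""
    n_cases = len(next(iter(binary_predictions.values())))
    return [{label for label, predictions in binary_predictions.items()
             if predictions[i] == 1}
            for i in range(n_cases)]


TS = [13, 23, 61, 101, 113, 603, 3101]          # the seven treatments analysed
example_label_sets = [{13}, {23, 3101}, {113}]  # hypothetical cases, not from the log
targets = binary_relevance_targets(example_label_sets, TS)
```

Each entry of `targets` is the target column for one binary classifier, so seven classifiers are trained independently and their outputs are merged again with `combine_predictions`.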

To test the hypothesis that the labels of a case may be related to the structural features of its trace, only the case attributes generated by transforming the discovered label-related functions are considered in our experiment; the other case attributes in this hospital event log, such as Age and Diagnosis, are not used.

Firstly we use the Decision Tree-based algorithm C4.5 (Quinlan, 1993) in our experiment. For each treatment in TS a decision tree is built by exploiting the attributes of cases. For example, Figure 3(a) shows the decision tree obtained for treatment 113 and Figure 3(b) shows the decision tree for treatment 3101. According to the decision tree for treatment 113, a case is inferred to belong to treatment 113 if it has the attribute FPattern_1086 = 1, and not to belong to it if it has the attributes FPattern_1086 = 0 and FPattern_1047 = 0 at the same time.

Figure 3: Decision Trees built for treatment 113 and treatment 3101. (a) Decision Tree for treatment 113. (b) Decision Tree for treatment 3101.

Footnote (on the use of BIDE): The reason for using a closed sequential pattern mining algorithm is that it effectively decreases the total number of sequential patterns generated while preserving the complete information about all the sequential patterns.

Table 3 shows the performances of the seven classifiers, evaluated using the generated test event log. Several measures are calculated in the evaluation step. The area under the ROC curve (AUC), which has a value between 0 and 1, reflects the performance of the classification model; an ideal classifier has an AUC value close to 1. The Correctly Classified Instances Ratio reflects the overall classification accuracy of a specific classifier. The Kappa Statistic (KS) measures how far the classification results of a built classifier differ from a classification by chance: $KS \in (0.75, 1)$ implies that the classifier performs very well, $KS \in (0.4, 0.75)$ is characterized as fair to good and $KS \in (0, 0.4)$ as poor. Recall measures the proportion of correctly classified instances among all the instances with the same label or without a specific label.

Figure 4: ICS-Fitness for the models generated by using the sub-logs from SC and PC. (Horizontal axis: Labels 13, 23, 61, 101, 113, 603, 3101; vertical axis: Value of ICS-Fitness; series: Test-Fitness, Predicted-Fitness, Original-Fitness.)
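For a single binary classifier, three of these measures can be computed from the four cells of a confusion matrix. The sketch below assumes that the Kappa Statistic is the standard Cohen's kappa; the function name and the counts used in the usage note are hypothetical and are not taken from Table 3.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, Cohen's kappa and per-class recall from a 2x2 confusion matrix."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "recall_with_label": tp / (tp + fn),     # instances that carry the label
        "recall_without_label": tn / (fp + tn),  # instances that do not carry it
    }
```

For example, with 40 true positives, 10 false negatives, 5 false positives and 45 true negatives the accuracy is 0.85, the chance agreement is 0.5 and the kappa value is therefore 0.7, which would fall into the range characterized above as fair to good.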

Additionally, to test the practicability of our technique in the business process mining area, we evaluate the effect of the classification results (obtained with the Decision Tree algorithm) on the process discovery task (the most crucial learning task in the process mining domain). Let $C_{test}$ be the test event log generated in our experiment and $SC = \{C_{13}, C_{23}, C_{61}, C_{101}, C_{113}, C_{603}, C_{3101}\}$ be the set of sub-logs where each sub-log contains the cases related to one treatment from $TS$. Let $PC = \{PC_{13}, PC_{23}, PC_{61}, PC_{101}, PC_{113}, PC_{603}, PC_{3101}\}$ be the set of sub-logs where each sub-log consists of the cases predicted to have the same treatment from $TS$. Process models for the entire test log $C_{test}$ and for each sub-log in $SC$ and $PC$ are then learned using process discovery techniques. In our experiment the heuristics mining algorithm described in (Weijters and Ribeiro, 2011) is used. Afterwards, the ICS-Fitness (Weerdt et al., 2013) of each model is calculated (fitness measures the proportion of behavior in the event log that is possible according to the model); the results are shown in Figure 4. The ICS-Fitness related to each sub-log in $SC$ is called Test-Fitness and the ICS-Fitness for the sub-logs in $PC$ is called Predicted-Fitness. The Original-Fitness (the baseline in Figure 4) is calculated using the entire test event log $C_{test}$ and the model for it.

In Figure 4 we can see that the Predicted-Fitness and Test-Fitness values for most of the sub-logs in $SC$ and $PC$ are much higher than the Original-Fitness, because the sub-logs in $SC$ and $PC$ are simpler and contain fewer behaviors than the original event log $C_{test}$; as a result, the models generated for these sub-logs are more accurate. Comparing the Predicted-Fitness with the Test-Fitness in Figure 4, we found that most Predicted-Fitness values are very close to the corresponding Test-Fitness values; for instance, the Predicted-Fitness for $PC_{3101}$ is 0.9267 and the Test-Fitness for $C_{3101}$ is 0.9382. From these analyses it can be concluded that the results of the multi-label case classification technique have practical value in the business process mining area. Process discovery is only one perspective of business process mining, and the multi-label case classification method can also benefit other techniques in this area (e.g., business process performance analysis) by helping to generate more meaningful and accurate analysis results.

5 RELATED WORK

In the literature, different approaches have been put

forward to overcome the negative impacts from high

variety of behaviors stored in event logs on business

process mining techniques. One efﬁcient technique is

trace clustering.

In (Song et al., 2009) the authors present an approach for characterizing traces by profiles for later trace clustering. Each profile is a set of items that describe the trace from a specific angle. Five profiles are defined in (Song et al., 2009): activity profile, transition profile, case attributes profile, event attributes profile and performance profile. By converting the defined profiles into an aggregate vector, the distance between any two traces can be calculated. One advantage of this technique is that it provides a full range of metrics for clustering traces.

Context-aware trace clustering methods are proposed in (Bose and van der Aalst, 2010) and (Bose and van der Aalst, 2009). In (Bose and van der Aalst, 2010) the authors indicate that feature sets based on sub-sequences of different lengths are context-aware for the vector space model and can reveal sets of common functionality accessed by the process. Two traces that have many conserved features in common should be put in the same cluster. In (Bose and van der Aalst, 2009) the authors present an edit distance-based approach for partitioning traces into clusters such that each cluster consists of traces with similar structure. The cost of edit operations is associated with the contexts of activities so that the calculated edit distance between traces is more accurate.

In (Weerdt et al., 2013) a novel technique for trace

clustering is presented which is able to directly opti-

mise the accuracy of each cluster’s underlying pro-

cess model. This method doesn’t consider the vector

space model or deﬁne a metric for trace clustering, it

simply discovers the suitable traces for each cluster

so that the combined accuracy of the related models

for these clusters is maximized. This method sufﬁ-

ciently resolves the divergence between the clustering

bias and the evaluation bias.

Classification techniques are widely used in the decision mining area of business process mining. In (Rozinat and van der Aalst, 2006) the authors developed a Decision Miner based on the Decision Tree algorithm, which aims at analysing the choice constructs of process models by exploiting the event attributes recorded in event logs.

6 CONCLUSION

In this paper we proposed and elaborated the basic definition of multi-label case classification. Next, a concrete systematic method was introduced which is able to discover all of the label-related structural features of traces and to transform these features into case attributes for the later classification task. The effectiveness and practicability of our technique were then demonstrated through a case study.

Our next research task will focus on exploiting the decision trees generated by our technique so as to clearly reveal the influences of different categories on the execution of business processes.

REFERENCES

Bose, R. and van der Aalst, W. (2009). Context Aware Trace

Clustering: Towards Improving Process Mining Re-

sults. In Proceedings of the SIAM International Con-

ference on Data Mining, pages 401–412.

Bose, R. and van der Aalst, W. (2010). Trace Cluster-

ing Based on Conserved Patterns: Towards Achiev-

ing Better Process Models. In Business Process

Management Workshops, volume 43 of Lecture Notes

in Business Information Processing, pages 170–181.

Springer Berlin.

SAP Community. (2014). Customer-deﬁned Case Classiﬁ-

cation. http://help.sap.com.

Greco, G., Guzzo, A., Pontieri, L., and Sacca, D. (2006). Discovering Expressive Process Models by Clustering Log Traces. IEEE Transactions on Knowledge and Data Engineering, 18(8):1010–1027.

Han, J. and Kamber, M. (2000). Data Mining: Concepts

and Techniques. Morgan Kaufmann, 2nd edition.

Kotsiantis, S. (2007). Supervised Machine Learning: A

Review of Classiﬁcation Techniques. In Proceed-

ings of the 2007 Conference on Emerging Artiﬁcial

Intelligence Applications in Computer Engineering:

Real Word AI Systems with Applications in eHealth,

HCI, Information Retrieval and Pervasive Technolo-

gies, pages 3–24. IOS Press.

Quinlan, J. (1993). C4.5: Programs for Machine Learning.

Morgan Kaufmann.

Rozinat, A. and van der Aalst, W. (2006). Decision Mining in ProM. In International Conference on Business Process Management (BPM 2006), volume 4102 of Lecture Notes in Computer Science, pages 420–425. Springer Berlin.

Song, M., Günther, C., and van der Aalst, W. (2009). Trace Clustering in Process Mining. In Business Process Management Workshops, volume 17 of Lecture Notes in Business Information Processing, pages 109–120. Springer Berlin.

Tsoumakas, G. and Katakis, I. (2007). Multi-label Classi-

ﬁcation: An Overview. International Journal of Data

Warehousing and Mining, 3(3):10–13.

Carvalho, A. and Freitas, A. A. (2009). A Tutorial on Multi-

label Classiﬁcation Techniques. Foundations of Com-

putational Intelligence, Studies in Computational In-

telligence, pages 117-195. Springer Berlin.

van der Aalst, W. (2011). Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer Berlin Heidelberg, Berlin, 1st edition.

Weerdt, J. D., vanden Broucke, S., Vanthienen, J., and Baesens, B. (2013). Active Trace Clustering for Improved Process Discovery. IEEE Transactions on Knowledge and Data Engineering, 25(12):2708–2720.

Weijters, A. and Ribeiro, J. (2011). Flexible Heuristics

Miner (FHM). In Proceedings of CIDM, pages 310–

317.

Wang, J. and Han, J. (2004). BIDE: Efﬁcient Mining of Fre-

quent Closed Sequences. In 20th Int. Conf. on Data

Engineering, Boston, MA.

ICEIS2015-17thInternationalConferenceonEnterpriseInformationSystems

258