2. Data integration. Data integration is the combination of data
from different databases into a new database.
3. Data selection. Not all of the data contained in the database
is used, so only the data relevant to the analysis is retrieved
from the database.
4. Data transformation. Data is converted or merged into a format
suitable for processing in data mining.
5. Mining process. This is the main process, in which a method is
applied to find valuable and hidden knowledge in the data.
6. Pattern evaluation. Identifies the interesting patterns to be
included in the knowledge base.
7. Knowledge presentation. Visualization and presentation of the
acquired knowledge and of the methods used to obtain it.
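The integration, selection, and transformation steps can be illustrated with a minimal sketch. The two source "databases", their field names, and the discretization thresholds below are hypothetical and serve only as an illustration; they are not taken from the study's data.

```python
# Hypothetical records from two source databases (illustrative data only).
sales_db = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 2, "amount": 80.0},
]
customer_db = [
    {"customer_id": 1, "age": 34, "city": "Medan"},
    {"customer_id": 2, "age": 51, "city": "Jakarta"},
]


def integrate(sales, customers):
    """Data integration: join the two sources on customer_id."""
    by_id = {c["customer_id"]: c for c in customers}
    return [{**s, **by_id[s["customer_id"]]}
            for s in sales if s["customer_id"] in by_id]


def select(records, fields):
    """Data selection: keep only the attributes needed for the analysis."""
    return [{f: r[f] for f in fields} for r in records]


def transform(records):
    """Data transformation: discretize numeric values into categories.

    The thresholds (40 years, 100 units) are assumed for illustration.
    """
    return [
        {
            "age_group": "young" if r["age"] < 40 else "old",
            "spend": "high" if r["amount"] >= 100 else "low",
        }
        for r in records
    ]


prepared = transform(select(integrate(sales_db, customer_db), ["age", "amount"]))
print(prepared)
```

The result is a small categorical table of the kind that a classification method can process directly.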
2.1.1  Classification 
Classification is a data mining technique that maps data into
predefined groups or classes. It is a supervised learning method,
which requires labeled training data to generate rules for
classifying test data into predetermined groups or classes
(Dunham, 2003). The classification method forms groups of data by
applying known algorithms to the data warehouse under examination.
This method is useful for business processes that require
categorical information, such as marketing or sales. It can use
various algorithms, such as nearest neighbor, decision tree
learning, and others. Decision trees are also used to explore data
and to find hidden relationships between a number of candidate
input variables and a target variable. Because the decision tree
combines data exploration and modeling, it works well as a first
step in the modeling process, even when the final model is built
using another technique.
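As an illustration of supervised classification, the following is a minimal sketch of the nearest neighbor approach mentioned above. The training records and class labels are hypothetical, and the function name classify_nearest is an assumption introduced for this example.

```python
import math

# Hypothetical labeled training data: (feature vector, class label).
training = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((5.0, 5.0), "high"),
    ((6.0, 5.5), "high"),
]


def classify_nearest(sample, labeled):
    """Return the label of the training record closest to sample (1-NN)."""
    nearest = min(labeled, key=lambda rec: math.dist(sample, rec[0]))
    return nearest[1]


print(classify_nearest((1.2, 1.4), training))  # low
print(classify_nearest((5.5, 5.0), training))  # high
```

The labeled records play the role of the training data, and each new sample is assigned to one of the predetermined classes.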
2.2  Decision Tree 
The decision tree is a predictive modeling technique that can be
used for classification and prediction tasks. The decision tree
uses a "divide and conquer" technique to split the problem search
space into a set of subproblems (Dunham, 2003). The decision tree
process converts the data table into a tree model. The tree model
then generates rules, which are simplified (Basuki & Syarif, 2003).
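The way a tree model yields rules can be sketched as follows. The tree, its attribute names, and its values are hypothetical; the representation (tuples for internal nodes, strings for leaves) is one assumed encoding among many.

```python
# Hypothetical tree: an internal node is (attribute, {value: subtree});
# a leaf is a class label given as a string.
tree = ("outlook", {
    "sunny": ("humidity", {"high": "no", "normal": "yes"}),
    "overcast": "yes",
    "rainy": ("windy", {"true": "no", "false": "yes"}),
})


def extract_rules(node, conditions=()):
    """Yield one (conditions, label) pair per root-to-leaf path."""
    if isinstance(node, str):
        yield conditions, node
        return
    attribute, branches = node
    for value, subtree in branches.items():
        yield from extract_rules(subtree, conditions + ((attribute, value),))


def classify(node, record):
    """Follow the branches that match the record until a leaf is reached."""
    while not isinstance(node, str):
        attribute, branches = node
        node = branches[record[attribute]]
    return node


rules = list(extract_rules(tree))
for conds, label in rules:
    print("IF " + " AND ".join(f"{a} = {v}" for a, v in conds) + f" THEN {label}")
```

Each path from the root to a leaf becomes one IF ... THEN rule, so this tree produces five rules.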
The advantages of the decision tree method are: 
1. A decision-making area that was previously complex and very
global can be made simpler and more specific.
 
2. Unnecessary calculations are eliminated, because with the
decision tree method the samples are tested only against
particular criteria or classes.
 
3. Flexibility in choosing features at different internal nodes:
the selected feature distinguishes one criterion from the other
criteria in the same node. This flexibility of the decision tree
method increases the quality of the resulting decisions compared
with a more conventional single-stage calculation method.
 
Figure 1. Decision Tree Concept 
2.3  C4.5 Algorithm 
According to Larose, there are several steps in building a decision
tree with the C4.5 algorithm, namely:
1. Prepare the training data. The training data is typically taken
from historical data that has occurred previously, also referred
to as past data, and is already classified into particular classes.
 
2. Determine the root of the tree. The root is taken from the
attribute selected by calculating the gain value of each
attribute; the attribute with the highest gain becomes the first
root. Before calculating the gain of the attributes, first
calculate the entropy value. The entropy is calculated with the
formula:
 
\[ \mathrm{Entropy}(S) = \sum_{i=1}^{n} -p_i \log_2 p_i \tag{1} \]
 
 
where:
S: the set of cases
A: the feature
n: the number of partitions of S
p_i: the proportion of S_i to S
4.  Calculate  the  value  of  Gain  using  the 
equation.
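The entropy and gain calculations used to choose the root can be sketched as follows. The sample records and attribute names are hypothetical. The gain function uses the standard information gain definition (the entropy of S minus the size-weighted entropy of each partition of S by attribute A), which is assumed here rather than taken from the text.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a list of class labels: sum of -p_i * log2(p_i)."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())


def gain(records, attribute, target):
    """Standard information gain of splitting records on attribute (assumed form)."""
    labels = [r[target] for r in records]
    partitions = {}
    for r in records:
        partitions.setdefault(r[attribute], []).append(r[target])
    weighted = sum(len(p) / len(records) * entropy(p)
                   for p in partitions.values())
    return entropy(labels) - weighted


# Hypothetical training data, already classified in the "play" class.
records = [
    {"outlook": "sunny", "windy": "false", "play": "no"},
    {"outlook": "sunny", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "false", "play": "yes"},
]

# The attribute with the highest gain becomes the root.
root = max(["outlook", "windy"], key=lambda a: gain(records, a, "play"))
print(root)  # outlook
```

In this small example every value of "outlook" leads to a single class, so its gain equals the full entropy of the set and it is selected as the root.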