K-modes and Entropy Cluster Centers Initialization Methods
Doaa S. Ali, Ayman Ghoneim and Mohamed Saleh
Department of Operations Research and Decision Support
Faculty of Computers and Information, Cairo University
5 Dr. Ahmed Zewail Street, Orman, 12613 Giza, Egypt
Keywords: Multiobjective Data Clustering, Categorical Datasets, K-modes Clustering Algorithm, Entropy.
Abstract: Data clustering is an important unsupervised technique in data mining which aims to extract the natural
partitions in a dataset without a priori class information. Unfortunately, every clustering model is very
sensitive to the set of randomly initialized centers, since such initial centers directly influence the formation
of final clusters. Thus, determining the initial cluster centers is an important issue in clustering models.
Previous work has shown that using multiple clustering validity indices in a multiobjective clustering model
(e.g., MODEK-Modes model) yields more accurate results than using a single validity index. In this study,
we enhance the performance of MODEK-Modes model by introducing two new initialization methods. The
two proposed methods are the K-Modes initialization method and the entropy initialization method. The two
proposed methods are tested using ten benchmark real life datasets obtained from the UCI Machine
Learning Repository. Experimental results show that the two initialization methods achieve significant
improvement in the clustering performance compared to other existing initialization methods.
1 INTRODUCTION
Data clustering extracts the natural partitions in a
dataset without a priori class information. It aims to
group the dataset observations into clusters where
observations within a cluster are more similar to
each other than observations in other clusters.
Clustering is used as a data processing technique in
many different areas, such as artificial intelligence
(e.g., Bhagat et al., 2013), medicine (e.g., Rahman
and Sarma, 2013), pattern recognition (e.g., Pratima
and Nimmakanti, 2008), and customer analysis (e.g.,
Alvandi et al., 2012). The most popular clustering
algorithms are K-means, K-modes, and K-medoids
algorithms. The K-means algorithm is used efficiently
when processing numerical datasets, where means
serve as centers/centroids of the data clusters. In K-
means algorithm, observations are grouped into K
clusters in which an observation belongs to the
cluster with the closest mean (i.e., centroid)
(Serapião et al., 2016). When dealing with
categorical data (Bai et al., 2013; Kim and
Hyunchul, 2008), K-modes (Ammar and Lingras,
2012) and K-medoids (Mukhopadhyay and Maulik,
2007) are used instead of K-means. The K-modes algorithm uses a frequency-based method to update the cluster modes during the clustering process, while the K-medoids algorithm selects a cluster medoid instead of computing the mean of the cluster. A medoid
is a representative observation in each cluster which
has the minimum sum of distances to other
observations in the cluster. Evolutionary
computation techniques play a vital role in
improving the performance of data clustering
because of their ability to avoid locally optimal
solutions. Evolutionary computation techniques are
considered global optimization techniques, which
use selection and recombination as their primary
operators to tackle optimization problems (Kim and
Hyunchul, 2008). Data clustering models are very
sensitive to the randomly initialized centers used at
the beginning of the search process, since such
initial centers directly influence the formation of
final clusters (Khan and Ahmed, 2013).
Furthermore, these randomly initialized centers may
lead to premature convergence to local optimal
solutions. To address such concerns, several
initialization methods were proposed in the
literature. Redmond and Heneghan (2007) propose
to initialize cluster centers by first assigning data
points to one of the k clusters randomly, and then the
centroids of these initial clusters are taken as the
initial cluster centers. The Jancey initialization method (Jancey, 1966) allocates to each cluster center a
synthetic data object generated randomly within the
given data space. Ball and Hall’s method (Ball and
Hall, 1967) takes the center of the entire dataset as
the first cluster center, and then picks the data object that is at least T units away from the
existing cluster centers as the next cluster center.
Maxmin initialization method (Gonzalez, 1985;
Katsavounidis et al., 1994) randomly selects a data
object as the first cluster center, and then the data
object with the greatest minimum-distance to the
existing cluster centers is taken as the next cluster
center. This process repeats until k cluster centers
are determined. In (Cao et al., 2009), initial centers
are determined based on the most frequent values of
the attributes, and the average density values. In (Bai
et al., 2012), an initialization method based on
integrating the distance measure and the density
together was introduced. In (Ji et al., 2015), the
authors propose a cluster center initialization method
for the k-prototypes algorithm based on the
neighbor set concept.
Cluster validity indices are used to evaluate the
performance of the clustering algorithm
(Mukhopadhyay and Maulik, 2007). Some recent studies have used cluster validity indices as objective
functions in a multiobjective framework (e.g.,
Mukhopadhyay and Maulik, 2007; Serapião et al.,
2016). Previous work introduced a multiobjective
clustering model (MODEK-Modes) based on self-
adaptive differential evolution with three cluster
validity indices, which are symmetry index,
compactness index, and silhouette index (Soliman
and Saleh, 2015). The MODEK-Modes model used the K-modes algorithm as it deals with categorical data,
and it has been implemented using randomly
initialized centers. This work aims to enhance data clustering performance by introducing two initialization methods, which are the K-Modes (KM)
initialization method and the entropy initialization
method. These proposed initialization methods are
used in conjunction with the MODEK-Modes
Model. The two proposed methods are tested using
ten benchmark real life datasets obtained from the
UCI Machine Learning Repository. To evaluate the
performance of the proposed initialization methods,
we compared them against four initialization
methods from the literature which are the Forgy
method (Redmond and Heneghan, 2007), the density
method (Bai et al., 2012), the value-attributes (Khan
and Ahmed, 2013), and K-Prototype (Ji et al., 2015),
in addition to the results of the MODEK-Modes
Model with randomly initialized center. The time
and space complexity of our proposed methods are
analyzed, and the comparison with the other
methods confirms the effectiveness of our methods.
The rest of the paper is organized as follows. Section
2 introduces preliminary concepts needed in the
current work. Section 3 presents the KM
initialization method, while Section 4 presents the
entropy initialization method. Section 5 shows the
experimental results and analysis for the proposed
methods in comparison with other initialization
methods. Section 6 concludes the work and
discusses future work.
2 PRELIMINARY CONCEPTS
This section introduces preliminary concepts needed
in our work. We will discuss the K-modes algorithm
and the entropy concept followed by a brief
overview of the MODEK-Modes model.
2.1 K-modes Algorithm
The K-modes algorithm extends the K-means algorithm
to cluster categorical data by replacing means of
clusters by modes (Bai et al., 2013; Kim and
Hyunchul, 2008). The K-modes algorithm uses a simple matching distance (i.e., a Hamming distance) when measuring distances between data observations.
Formally, a clustering problem is formulated as an optimization problem as follows:

$\min P(\mu, Z) = \sum_{j=1}^{k} \sum_{i=1}^{n} \mu_{ij}\, d(x_i, z_j), \quad \text{s.t. } \mu_{ij} \in \{0, 1\}, \; \sum_{j=1}^{k} \mu_{ij} = 1$   (1)
where $n$ is the number of data points, $k$ is the number of data clusters, and $\mu_{ij}$ is the membership of the $i$-th data observation to cluster $j$ (i.e., $\mu_{ij}$ takes binary values in the crisp case). $d(x_i, z_j)$ is the matching distance measure between data point $x_i$ and data cluster center $z_j$. To understand the matching distance measure, let $x$ and $y$ be two data observations in data set $D$ and $L$ be the number of attributes in a data observation. The simple matching distance measure between $x$ and $y$ in $D$ is defined as:

$d(x, y) = \sum_{l=1}^{L} \delta(x_l, y_l)$   (2)

where

$\delta(x_l, y_l) = \begin{cases} 0 & \text{if } x_l = y_l \\ 1 & \text{otherwise} \end{cases}$
The center of a cluster is updated using the following equation:

$z_{jl} = \arg\max_{a \in \mathrm{DOM}(A_l)} |\{\, x \in c_j : x_l = a \,\}|$   (3)

where $z_{jl}$ represents the new updated value of cluster $j$ in the $l$-th attribute, i.e., the value which occurs most frequently in the $l$-th attribute among the data observations within cluster $j$. $\mathrm{DOM}(A_l)$ expresses all the possible values (the domain) of attribute $A_l$, and $n_j$ is the total number of data observations in cluster $j$.
Procedure 1 illustrates the steps of the K-Modes
algorithm.
1: Randomly initialize centers for the k clusters.
2: Assign each data point to the cluster with the nearest center (Eq. 2).
3: Update the center of each cluster using Eq. 3.
4: Repeat steps 2 and 3 until the clusters' centers stop changing or other stopping criteria are met.
Procedure 1: Steps of the K-modes algorithm.
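To make Procedure 1 concrete, the following is a minimal Python sketch of the K-modes algorithm; the function and variable names are ours, not from the original model, and ties in the mode computation are broken arbitrarily:

```python
import random
from collections import Counter

def matching_distance(x, z):
    # Eq. (2): simple matching (Hamming) distance over the attributes.
    return sum(1 for a, b in zip(x, z) if a != b)

def mode_of(cluster):
    # Eq. (3): per attribute, the most frequent value among the cluster's
    # observations (ties broken arbitrarily).
    return [Counter(col).most_common(1)[0][0] for col in zip(*cluster)]

def k_modes(data, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(data, k)]    # step 1: random modes
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for x in data:                                  # step 2: nearest mode
            j = min(range(k), key=lambda j: matching_distance(x, centers[j]))
            clusters[j].append(x)
        new_centers = [mode_of(c) if c else centers[j]  # step 3: update modes
                       for j, c in enumerate(clusters)]
        if new_centers == centers:                      # step 4: converged
            break
        centers = new_centers
    return centers, clusters
```

With a small categorical dataset represented as a list of equal-length value lists, k_modes(data, k) returns the converged modes and the final clusters.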
2.2 Entropy Concept
In data clustering, the entropy between each pair of data points helps us characterize the complete dataset. The entropy value (in the range [0, 1]) is low (close to zero) for either very close data points or very far data points: in both cases the data points can be easily separated into clusters, so the uncertainty is low and the entropy is also low. On the other hand, a high entropy value (close to one) indicates that the data points are separated by a distance close to the average distance of all pairs of data points; i.e., the maximum entropy occurs when the data points are uniformly distributed in the feature space. The entropy depends on the similarity measure. The similarity measure S, in turn, is inversely proportional to the distance; i.e., the similarity measure has a very high value (close to one) for very close data points, which should be located in the same cluster, and a very low value (close to zero) for very far data points, which should be located in different clusters. As shown in Eq. 4, the entropy between a pair of data points is defined as (Terano et al., 2000):

$E_{ij} = -S_{ij} \log_2 S_{ij} - (1 - S_{ij}) \log_2 (1 - S_{ij})$   (4)

$S_{ij} = 0.5\,(S_{ij}^{d} + S_{ij}^{c})$   (5)

where $S_{ij}^{d}$ is the distance-based similarity measure and $S_{ij}^{c}$ is the cluster-based similarity measure defined below.
In this paper, we enhance the similarity measure by augmenting the traditional distance-based similarity with another term associated with the current clustering of the two data points (Eq. 5). We denote this novel term the cluster-based similarity measure. Recall that the traditional similarity measure for categorical data is inversely proportional to the Hamming distance; i.e., the distance-based similarity measure between two data points $i$ and $j$ is given as $S_{ij}^{d} = \frac{1}{D}\sum_{l=1}^{D} \delta_l$, where $D$ is the number of attributes and $\delta_l = 1$ if $x_{il}$ equals $x_{jl}$ and $0$ otherwise. We define the novel cluster-based similarity measure as follows: $S_{ij}^{c} = 1$ if and only if $i$ and $j$ lie in the same cluster, and $0$ otherwise. Finally, the total entropy between all pairs of data points is calculated by the following equation (Terano et al., 2000):

$E = \sum_{i=1}^{n} \sum_{j=1}^{n} E_{ij}$   (6)
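The following is a minimal Python sketch of Eqs. 4–6; the function names are ours, and we assume base-2 logarithms, the convention 0·log 0 = 0, and summation over unordered pairs (a constant factor of the double sum, which does not affect the minimization):

```python
import math

def distance_similarity(x, y):
    # S^d_ij: fraction of matching attributes (inverse of the Hamming distance).
    return sum(a == b for a, b in zip(x, y)) / len(x)

def enhanced_similarity(x, y, same_cluster):
    # Eq. (5): average of the distance-based and cluster-based terms.
    return 0.5 * (distance_similarity(x, y) + (1.0 if same_cluster else 0.0))

def pair_entropy(s):
    # Eq. (4); S in {0, 1} gives zero entropy by the 0*log(0) = 0 convention.
    if s <= 0.0 or s >= 1.0:
        return 0.0
    return -s * math.log2(s) - (1.0 - s) * math.log2(1.0 - s)

def total_entropy(data, labels):
    # Eq. (6), summed over unordered pairs of data points.
    n = len(data)
    return sum(pair_entropy(enhanced_similarity(data[i], data[j],
                                                labels[i] == labels[j]))
               for i in range(n) for j in range(i + 1, n))
```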
2.3 MODEK-Modes Model
MODEK-Modes is a multiobjective data clustering model based on self-adaptive differential evolution using three cluster validity indices (i.e., objective functions) (Soliman and Saleh, 2015). The three cluster validity indices are the symmetry index (to maximize similarity within a cluster), the compactness index (to maximize dissimilarity between different clusters), and the silhouette index (to test the suitability of the clustering model to the processed dataset). The MODEK-Modes model starts by initializing a population of random centers and then assigning data points to the initialized centers, forming clusters. The following steps are repeated until the total number of iterations is reached: the centers of the clusters are updated, data points are reassigned to clusters, and the fitness of individuals is evaluated. The next two steps apply the adapted mutation and crossover operators and evaluate the candidates for each parent. The selection operator then creates the new population based on the fitness function. A simplified skeleton of this loop is sketched below.
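This is only a high-level sketch: the actual self-adaptive differential evolution operators and the three validity indices are defined in Soliman and Saleh (2015) and are abstracted here behind an `evaluate` parameter, and the categorical mutation/crossover shown is our own stand-in, not the paper's operator:

```python
import random

def dominates(fa, fb):
    # Pareto dominance, assuming all validity indices are to be maximized.
    return all(a >= b for a, b in zip(fa, fb)) and any(a > b for a, b in zip(fa, fb))

def modek_modes_skeleton(population, evaluate, n_gen=50, cr=0.2, seed=0):
    # `population`: list of individuals, each a list of k modes (one list of
    # categorical values per cluster).  `evaluate`: maps an individual to a
    # tuple of validity-index values (the three indices in the actual model).
    rng = random.Random(seed)
    fitness = [evaluate(ind) for ind in population]
    for _ in range(n_gen):
        for p in range(len(population)):
            donor = rng.choice(population)
            # Categorical stand-in for DE mutation/crossover: with
            # probability cr, copy each attribute value from the donor.
            child = [[d if rng.random() < cr else a
                      for a, d in zip(mode, donor_mode)]
                     for mode, donor_mode in zip(population[p], donor)]
            f_child = evaluate(child)
            # Selection: the child replaces its parent if it dominates it.
            if dominates(f_child, fitness[p]):
                population[p], fitness[p] = child, f_child
    return population, fitness
```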
3 K-MODES INITIALIZATION
METHOD
This proposed method uses the traditional K-modes
algorithm as an initialization method, where the output of the K-modes algorithm is considered the input to the MODEK-Modes model (see Figure 1). The
traditional K-modes algorithm has a single objective
function which minimizes the total distances
between all data observations and centers of clusters.
The traditional K-modes algorithm starts by initializing random centers, then assigns data points to the clusters and updates the centers. The K-modes algorithm repeats these last two steps until it converges. These converged centers are considered the initial population for the MODEK-Modes model, as sketched below.
Figure 1: The KM initialization method.
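Concretely, using the k_modes sketch from Section 2.1, the initial population could be built as follows (pop_size and the variable names are illustrative, not from the original paper; data and k are assumed to be defined as before):

```python
# Each individual in the MODEK-Modes initial population is one converged
# set of modes from an independent K-modes run (different random seeds).
pop_size = 20                     # illustrative population size
population = [k_modes(data, k, seed=s)[0] for s in range(pop_size)]
```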
4 ENTROPY INITIALIZATION
METHOD
The proposed entropy initialization method consists
of two stages (see Figure 3). The first stage is called the entropy stage, which produces an accumulated
matrix. This accumulated matrix is the input for the
second initialization stage. The entropy stage
(Procedure 2) operates only once for each dataset
under consideration. The entropy stage takes the
dataset as an input and outputs an accumulated
matrix based on the entropy similarity measure. The
entropy stage starts with clustering the data points
based on the entropy values. We illustrate an algorithm (inspired by Li et al. (2004)) that utilizes the total entropy in the clustering process. Our contribution
is the enhancement of the similarity measure used in
this algorithm. The core idea of the algorithm is to
try to minimize the total entropy via randomly
swapping points between clusters. Recall that the entropy between a pair of data points will be small if the proposed enhanced similarity measure is either low or high. The enhanced similarity measure has a
small value only when both the distance-based
measure and the cluster-based measure have small
values (i.e., large distance and different clusters).
The enhanced similarity measure has a large value
only when both the distance-based measure and the
cluster-based measure have large values (i.e., small
distance and same clusters). Thus swapping points
between clusters will have significant and
meaningful impacts on the total entropy. After
clustering the given data points, we need to
determine the mode of each cluster by using Eq. 3. The distances between every data point and the k clusters are then calculated to produce a membership matrix (n × k). Elements of the membership matrix are computed by the following equation:

$\mu_{i,j} = \dfrac{d(x_i, z_j)}{\sum_{l=1}^{k} d(x_i, z_l)}, \quad \forall i = 1, \ldots, n, \;\; \forall j = 1, \ldots, k$   (7)

where $n$ is the total number of data points in the used dataset, $k$ is the number of data clusters, $\mu_{i,j}$ is the membership between data point $i$ and cluster $j$, and $d(x_i, z_j)$ is the distance between data point $i$ and cluster $j$.
1. Entropy clustering procedure
Input: (data points: X, number of clusters: k)
Output: cluster assignment
Begin
1. Initialization:
   1.1. Put all data points into one cluster
   1.2. Compute the initial criterion E0 (initial total entropy)
2. Iteration: repeat until no more changes occur in the cluster assignment
   2.1. Randomly pick a point x from a cluster A
   2.2. Randomly pick another cluster B
   2.3. Put x into B
   2.4. Compute the new total entropy E
   2.5. If E ≥ E0 (the swap does not reduce the entropy)
        2.5.1. Put x back into A
        2.5.2. E = E0
   2.6. End If
   2.7. E0 = E
   2.8. Go to Step 2.1
   2.9. End
3. Return the cluster assignment
End
2. Determine the mode of each cluster using Eq. (3).
3. Measure the distances between each data point and the k clusters using Eq. (2).
4. Calculate the membership values using Eq. (7) to form the membership matrix (n × k), where n is the number of data points and k is the number of clusters.
5. Build the accumulated matrix, where $A_{i,j}$ is the accumulated value between the $i$-th data point and cluster $c_j$: $A_{i,j} = \sum_{l=1}^{j} \mu_{i,l}$
Output: the accumulated matrix
Procedure 2: The procedure of the entropy stage.
The accumulated matrix is produced from the computed membership matrix in step 5 (see Procedure 2). This accumulated matrix is the output of the entropy stage and the input to the second (initialization) stage. Figure 2 presents an example of how to calculate the accumulated matrix.
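A compact Python sketch of the entropy stage is given below, reusing matching_distance, mode_of, and total_entropy from the earlier sketches. It simplifies Procedure 2 in two places that we flag explicitly: it starts from a random assignment rather than a single cluster, and it stops after a fixed number of consecutive rejected swaps rather than detecting "no more changes" exactly:

```python
import random

def entropy_clustering(data, k, max_stall=1000, seed=0):
    # Simplified step 1 of Procedure 2: greedily accept only swaps that
    # reduce the total entropy (Eq. 6).
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in data]
    e0 = total_entropy(data, labels)       # helper from the Section 2.2 sketch
    stall = 0
    while stall < max_stall:               # stop after many rejected swaps
        i = rng.randrange(len(data))       # 2.1: pick a random point x
        old, new = labels[i], rng.randrange(k)
        labels[i] = new                    # 2.2-2.3: move x into cluster B
        e = total_entropy(data, labels)    # 2.4: recompute the total entropy
        if e >= e0:                        # 2.5: no improvement -> revert
            labels[i] = old
            stall += 1
        else:                              # 2.7: keep the move
            e0, stall = e, 0
    return labels

def accumulated_matrix(data, labels, k):
    # Steps 2-5 of Procedure 2: cluster modes (Eq. 3), distances (Eq. 2),
    # memberships (Eq. 7), and row-wise cumulative sums.
    clusters = [[x for x, l in zip(data, labels) if l == j] for j in range(k)]
    modes = [mode_of(c) for c in clusters]     # empty clusters not handled here
    rows = []
    for x in data:
        d = [matching_distance(x, z) for z in modes]
        s = sum(d) or 1                        # guard against all-zero distances
        mu = [dj / s for dj in d]              # Eq. (7): normalised distances
        rows.append([sum(mu[:j + 1]) for j in range(k)])
    return rows
```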
Figure 2: An example of the membership and accumulated matrices in the proposed entropy initialization method.
The initialization stage mainly depends on the accumulated matrix (Procedure 3).

For every individual in the population:
Initialization Stage: (Input: the previously calculated accumulated matrix)
1. For each data point i = 1, 2, …, n:
   Generate a uniform random number U ~ U(0, 1).
   Assign the data point to the cluster whose interval in the accumulated matrix has U below its upper bound.
2. End For
3. Determine the mode of each cluster, which becomes the centroid of this cluster for categorical data.
Output: the set of initialized centers of the k clusters.
Procedure 3: Steps of the initialization stage.
This stage is repeated every time we need to generate a new population of centers for the MODEK-Modes model. The initialized centers are produced from an initial distribution of the data points based on the accumulated matrix. Let $U_1$ and $U_2$ be uniform random numbers for the first and the second data points. According to the accumulated matrix in Figure 2, if $U_1 = 0.7$ and $U_2 = 0.3$, then we have to assign the first data point to the third cluster ($0.5 < U_1 \le 1.0$) and the second data point to the first cluster ($0.0 < U_2 \le 0.7$). This process is repeated for every data point in the dataset. Finally, after distributing all data points, we calculate the mode of each cluster by Eq. 3, and hence the mode of each cluster is considered the initial coordinates of this cluster.
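The initialization stage itself can be sketched as follows, again reusing mode_of from the earlier sketch; the handling of an empty cluster by drawing a random data point is our own choice, not from the paper:

```python
import bisect
import random

def initialize_centers(data, acc, k, seed=None):
    # Procedure 3: for each data point, draw U ~ Uniform(0, 1) and assign the
    # point to the first cluster whose accumulated value is an upper bound of
    # U; the mode of each resulting cluster is one initial center.
    rng = random.Random(seed)
    clusters = [[] for _ in range(k)]
    for x, row in zip(data, acc):
        u = rng.random()
        j = min(bisect.bisect_left(row, u), k - 1)  # clamp for rounding error
        clusters[j].append(x)
    # Empty clusters fall back to a random data point (our own choice).
    return [mode_of(c) if c else rng.choice(data) for c in clusters]
```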
5 EXPERIMENTAL RESULTS
The clustering accuracy of the MODEK-Modes model when using the two proposed initialization methods is compared against the Forgy method
(Redmond and Heneghan, 2007), the density method
(Bai et al., 2012), the value-attributes (Khan and
Ahmed, 2013), and the K-Prototype method (Ji et al., 2015), in addition to the results of the MODEK-Modes model with randomly initialized centers.

Figure 3: Stages of the entropy initialization method.

The results of 50
independent runs are summarized in Tables 1 and 2.
For every dataset, the tables illustrate the mean of
the best solution (in 50 runs) and the standard
deviation for every initialization method. A t-test was performed at the 0.05 significance level to check whether the differences between the results are statistically significant or non-significant compared to the second-best performing method. As shown in Table 1, the proposed KM initialization method performed better on eight datasets, with a significant difference compared to the other initialization methods. Table 2 illustrates the experimental results for the proposed entropy method, and shows that it significantly improved the clustering accuracy for seven datasets.
Moreover, Table 3 lists the run time of the seven
clustering initialization methods on different
datasets. From this table, we can see that the KM
method needs more time than the random and the
Forgy initialization methods. However, the KM method consumes less time than the Density, Value-attributes, and K-Prototype initialization methods. On the other hand, the Entropy method needs more time than all other compared methods except for the K-Prototype initialization method.
6 CONCLUSION AND FUTURE
WORK
In light of the importance of data clustering, this paper aimed to improve the performance of data clustering by addressing and enhancing the initialization step, since clustering models start with a set of random initial clusters and are very sensitive to such randomly initialized
Table 1: Mean ± standard deviation of the best solution over 50 independent runs and t-test for the KM initialization method and the other compared methods, where "No" refers to a non-significant change and "Yes" refers to a significant change.

Dataset      | KM                    | Density                | Random               | Forgy               | Value-Attributes    | K-Prototype         | T-test
Soybean      | 0.92784921 ± 0.00018  | 0.8928741 ± 0.0000221  | 0.8961113 ± 0.00038  | 0.885262 ± 0.0071   | 0.889932 ± 0.0039   | 0.902728 ± 0.0073   | Yes
B. Cancer    | 0.887345 ± 0.00021    | 0.86689 ± 0.0027       | 0.8210432 ± 0.00666  | 0.872992 ± 0.0023   | 0.878943 ± 0.0011   | 0.8956345 ± 0.0013  | No
Spect Heart  | 0.8198299 ± 0.0010    | 0.7759342 ± 0.0016     | 0.7782529 ± 0.0080   | 0.7224645 ± 0.0031  | 0.787823 ± 0.0067   | 0.792651 ± 0.044    | Yes
St. Heart    | 0.738873 ± 0.0023     | 0.7493424 ± 0.0020     | 0.6778213 ± 0.0018   | 0.717923 ± 0.0081   | 0.754592 ± 0.0029   | 0.7356782 ± 0.0017  | Yes
Zoo          | 0.958879 ± 0.0031     | 0.9144634 ± 0.0017     | 0.9377262 ± 0.0024   | 0.931978 ± 0.0038   | 0.933932 ± 0.019    | 0.928853 ± 0.0014   | Yes
Liver        | 0.8845267 ± 0.00041   | 0.8254823 ± 0.0013     | 0.8175647 ± 0.0280   | 0.862247 ± 0.0027   | 0.857934 ± 0.0026   | 0.845723 ± 0.0011   | Yes
H. Survival  | 0.6478901 ± 0.0109    | 0.6106859 ± 0.031      | 0.6014788 ± 0.0027   | 0.6199879 ± 0.0037  | 0.6204733 ± 0.033   | 0.612411 ± 0.0033   | Yes
Dermatology  | 0.797225 ± 0.0021     | 0.7406811 ± 0.0011     | 0.710089 ± 0.0031    | 0.7122794 ± 0.0027  | 0.7588932 ± 0.0081  | 0.764563 ± 0.0055   | Yes
L. Cancer    | 0.696881 ± 0.0020     | 0.6279032 ± 0.0032     | 0.6015231 ± 0.0061   | 0.6682301 ± 0.0170  | 0.648932 ± 0.0073   | 0.658932 ± 0.0044   | Yes
Computer     | 0.6790211 ± 0.0051    | 0.6489341 ± 0.0037     | 0.5933928 ± 0.0053   | 0.6378941 ± 0.0062  | 0.6486372 ± 0.0016  | 0.6509342 ± 0.023   | Yes
Table 2: Mean ± standard deviation of the best solution over 50 independent runs and t-test for the entropy initialization method and the other compared methods, where "No" refers to a non-significant change and "Yes" refers to a significant change.

Dataset      | Entropy               | Density                | Random               | Forgy               | Value-Attributes    | K-Prototype         | T-test
Soybean      | 0.9148790 ± 0.0017    | 0.8928741 ± 0.0000221  | 0.8961113 ± 0.00038  | 0.885262 ± 0.0071   | 0.889932 ± 0.0039   | 0.902728 ± 0.0073   | Yes
B. Cancer    | 0.8482991 ± 0.001     | 0.86689 ± 0.0027       | 0.8210432 ± 0.00666  | 0.872992 ± 0.0023   | 0.878943 ± 0.0011   | 0.8956345 ± 0.0013  | Yes
Spect Heart  | 0.8256281 ± 0.0011    | 0.7759342 ± 0.0016     | 0.7782529 ± 0.0080   | 0.7224645 ± 0.0031  | 0.787823 ± 0.0067   | 0.792651 ± 0.044    | Yes
St. Heart    | 0.7045282 ± 0.0002    | 0.7493424 ± 0.0020     | 0.6778213 ± 0.0018   | 0.717923 ± 0.0081   | 0.754592 ± 0.0029   | 0.7356782 ± 0.0017  | Yes
Zoo          | 0.957811 ± 0.0031     | 0.9144634 ± 0.0017     | 0.9377262 ± 0.0024   | 0.931978 ± 0.0038   | 0.933932 ± 0.019    | 0.928853 ± 0.0014   | Yes
Liver        | 0.870003 ± 0.00021    | 0.8254823 ± 0.0013     | 0.8175647 ± 0.0280   | 0.862247 ± 0.0027   | 0.857934 ± 0.0026   | 0.845723 ± 0.0011   | No
H. Survival  | 0.6429739 ± 0.0011    | 0.6106859 ± 0.031      | 0.6014788 ± 0.0027   | 0.6199879 ± 0.0037  | 0.6204733 ± 0.033   | 0.612411 ± 0.0033   | Yes
Dermatology  | 0.8571158 ± 0.00211   | 0.7406811 ± 0.0011     | 0.710089 ± 0.0031    | 0.7122794 ± 0.0027  | 0.7588932 ± 0.0081  | 0.764563 ± 0.0055   | Yes
L. Cancer    | 0.7287116 ± 0.0003    | 0.6279032 ± 0.0032     | 0.6015231 ± 0.0061   | 0.6682301 ± 0.0170  | 0.648932 ± 0.0073   | 0.658932 ± 0.0044   | Yes
Computer     | 0.7963731 ± 0.0019    | 0.6489341 ± 0.0037     | 0.5933928 ± 0.0053   | 0.6378941 ± 0.0062  | 0.6486372 ± 0.0016  | 0.6509342 ± 0.023   | Yes
Table 3: The average running time (minutes) of the seven clustering initialization methods on the used datasets.

Dataset      | KM    | Entropy | Density | Random | Forgy | Value-Attributes | K-Prototype
Soybean      | 2.90  | 3.11    | 3.07    | 2.42   | 2.37  | 3.03             | 3.33
B. Cancer    | 11.81 | 12.20   | 11.89   | 9.34   | 9.22  | 11.84            | 12.23
Spect Heart  | 4.82  | 5.17    | 4.98    | 4.23   | 4.18  | 4.87             | 5.21
St. Heart    | 5.33  | 5.64    | 5.37    | 4.83   | 4.79  | 5.32             | 5.68
Zoo          | 3.40  | 3.62    | 3.43    | 3.21   | 3.17  | 3.41             | 3.66
Liver        | 6.91  | 7.10    | 6.93    | 6.67   | 6.59  | 6.89             | 7.24
H. Survival  | 6.27  | 6.68    | 6.31    | 5.82   | 5.66  | 6.29             | 6.72
Dermatology  | 9.62  | 9.89    | 9.67    | 9.32   | 9.27  | 9.62             | 9.95
L. Cancer    | 3.70  | 3.87    | 3.79    | 3.49   | 3.23  | 3.73             | 3.93
Computer     | 5.20  | 5.48    | 5.37    | 4.89   | 4.82  | 5.33             | 5.57
centers. Previous work has shown that using
multiple clustering validity indices in a
multiobjective clustering model (e.g., MODEK-
Modes model) yields more accurate results than
using a single validity index. Thus, we proposed to
enhance the performance of the MODEK-Modes model by introducing two new initialization methods. These two proposed methods are the K-Modes initialization method and the entropy initialization method. The two proposed methods have been tested
using ten benchmark real life datasets obtained from
the UCI Machine Learning Repository. We applied a t-test to check the significance of the results. Based
on the experimental results, the two initialization
methods achieved a significant improvement in the
clustering performance compared to the other
initialization methods. The KM method achieved a significant improvement in the clustering performance on 8 datasets, while the entropy method improved the clustering performance on 7 datasets.
The time and space complexity of our proposed
methods are analyzed, and the comparison with the
other methods demonstrates the effectiveness of our
methods. For future work, the two proposed initialization methods can be extended to deal with numerical datasets by replacing the K-modes algorithm with the K-means algorithm.
REFERENCES
Ammar E. Z., Lingras P., 2012, K-modes clustering using
possibilistic membership, IPMU 2012, Part III, CCIS
299, pp. 596–605.
Alvandi M., Fazli S., Abdoli F. S., 2012, K-mean
clustering method for analysis customer lifetime value
with LRFM relationship model in banking
services, International Research Journal of Applied
and Basic Sciences, 3 (11): pp. 2294-2302.
Bai L., Liang J., Dang Ch., Cao F., 2012, A cluster centers
initialization method for clustering categorical data,
Expert Systems with Applications, 39, pp. 8022–8029.
Bai L., Liang J., Dang Ch., Cao F., 2013, A novel fuzzy
clustering algorithm with between-cluster information
for categorical data, Fuzzy Sets and Systems (215), pp.
55–73.
Ball G. H., Hall D. J., 1967, A clustering technique for summarizing multivariate data, Behavioral Science, 12 (2), pp. 153–155.
Bhagat P. M., Halgaonkar P. S., Wadhai V. M., 2013,
Review of clustering algorithm for categorical data,
International Journal of Engineering and Advanced
Technology, 3 (2).
Cao F., Liang J., Bai L., 2009, A new initialization method
for categorical data clustering, Expert Systems with
Applications, 36, pp. 10223–10228.
Cao F., Liang J., Li D., Bai L., Dang Ch., 2012, A
dissimilarity measure for the k-Modes
clustering algorithm, Knowledge-Based Systems 26,
pp. 120–127.
Gonzalez T., 1985, Clustering to minimize the maximum
intercluster distance, Theoretical Computer Science,
38 (2–3), pp. 293–306.
Jancey R. C., 1966, Multidimensional group analysis,
Australian Journal of Botany, 14 (1), pp. 127–130.
Ji J., Pang W., Zheng Y., Wang Z., Ma Zh., Zhang L., 2015, A novel cluster center initialization method for the k-Prototypes algorithms using centrality and distance, Applied Mathematics and Information Sciences, 9 (6), pp. 2933–2942.
Katsavounidis I., Kuo C.-C. J., Zhang Z., 1994, A new
initialization technique for generalized Lloyd iteration,
IEEE Signal Processing Letters, 1 (10), pp. 144–146.
Khan Sh. S., Ahmed A., 2013, Cluster center initialization
algorithm for K-modes clustering, Expert Systems with
Applications, 40, pp. 7444–7456.
Kim K.K., Hyunchul A., 2008, A recommender system
using GA K-means clustering in an online shopping
market, Expert Systems with Applications, 34, pp.
1200–1209.
Li T., Ma S., Ogihara M., 2004, Entropy-based criterion in categorical clustering, The 21st International Conference on Machine Learning, Banff, Canada.
Mukhopadhyay A., Maulik U., 2007, Multiobjective
approach to categorical data clustering, IEEE Congress
on Evolutionary Computation, pp. 1296 – 1303.
Pratima D., Nimmakanti N., 2008, Pattern recognition
algorithms for cluster identification problem, Special
Issue of International Journal of Computer Science &
Informatics, Vol. II, Issue 1 (2), pp. 2231–5292.
Rahman N., Sarma P., 2013, Analysis of treatment of
prostate cancer by using multiple techniques of data
mining, International Journal of Advanced Research
in Computer Science and Software Engineering 3 (4),
pp. 584–589.
Redmond S. J., Heneghan C., 2007, A method for
initialising the k-means clustering algorithm using kd-
trees, Pattern Recognition Letters, 28(8), pp. 965–973.
Serapião B. S., Corrêa G. S., Gonçalves F. B., Carvalho
V. O., 2016, Combining K-means and K-harmonic
with fish school search algorithm for data clustering
task on graphics processing units, Applied Soft
Computing, 41, pp. 290–304.
Soliman O. S. , Saleh D. A., 2015, Multi-objective K-
modes data clustering algorithm using self-adaptive
differential evolution, International Journal of
Advanced Research in Computer Science and
Software Engineering, 5(2), pp. 57-65.
Terano T., Liu H., Chen A. L.P., 2000, Knowledge
discovery and data mining. Current Issues and New
Applications, 4th Pacific Asia Conference, PAKDD.