
 
problems is that in most cases they generate a 
number of clusters that is much larger than the real 
one. Moreover, these algorithms usually do not 
stabilize in a cluster solution; that is, they constantly 
construct and deconstruct clusters during the 
process. To overcome these difficulties and improve 
the quality of results, the authors proposed an 
Adaptive Ant Clustering Algorithm (A2CA). A 
modification included in the present approach is a 
cooling schedule for the parameter that controls the 
probability of ants picking up objects from the grid. 
2.1  Parameters of the Neighborhood Function 
The clusters’ spatial separation on the grid is crucial 
so that individual clusters are well defined, allowing 
their automatic recovery. Spatial proximity, when it 
occurs, may indicate a premature formation of the 
cluster (Handl et al., 2006). 
Defining the parameters of the neighborhood 
function is a key factor in cluster quality. For the 
perception radius σ, larger neighborhoods are 
attractive because they improve the quality of the 
clusters and their distribution on the grid. 
However, this is computationally more expensive, 
since the number of cells to be considered for each 
action grows quadratically with the radius, and it 
also inhibits the rapid formation of clusters during 
the initial distribution phase. A radius of 
perception that gradually increases over time 
accelerates the dissolution of preliminary small 
clusters (Handl et al., 2006). A progressive radius of 
perception was also used by Vizine et al. (2005). 
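The quadratic growth mentioned above can be illustrated with a minimal sketch. It assumes a square neighborhood of side 2σ + 1 centered on the ant, which this section does not state explicitly. The function names and the linear growth schedule are hypothetical, chosen only for illustration.

```python
def neighborhood_cells(sigma):
    """Cells inspected per action, assuming a square neighborhood
    of side 2*sigma + 1 with the center cell excluded."""
    side = 2 * sigma + 1
    return side * side - 1


def progressive_radius(step, total_steps, sigma_start=1, sigma_end=5):
    """Hypothetical linear schedule: the radius grows from
    sigma_start to sigma_end as the run progresses."""
    frac = step / total_steps
    return sigma_start + int(frac * (sigma_end - sigma_start))


if __name__ == "__main__":
    for sigma in (1, 2, 3, 5):
        print(sigma, neighborhood_cells(sigma))
```

Under this assumption, going from σ = 1 to σ = 5 raises the cost of one evaluation from 8 to 120 cells, which is why a small initial radius is cheaper during the early phase.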
Moreover, after the initial clustering phase, 
Handl et al. (2006) replaced the scaling parameter 
1/σ² by 1/N_occ in equation (5), where N_occ is the 
number of occupied grid cells observed within the 
local neighborhood. Thus, only the similarity, not 
the density, is taken into account. Boryczka 
(2009), in her algorithm ACAM, proposed to replace 
the scalar 1/σ² in equation (5) by a scalar defined 
in terms of σ₀, where σ₀ is the initial radius of 
perception. 
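The effect of the two scaling factors can be sketched as follows. Equation (5) is not reproduced in this section, so the sketch assumes the common form f(i) = max(0, s · Σ_j (1 − δ(i, j)/α)), with s = 1/σ² or s = 1/N_occ. The function name and its arguments are hypothetical.

```python
def neighborhood_function(dissimilarities, alpha, sigma, use_occupancy=False):
    """Assumed form of the neighborhood function.

    dissimilarities: delta(i, j) for every pattern j found in the
    local neighborhood of pattern i.
    use_occupancy: if True, scale by 1/N_occ instead of 1/sigma**2.
    """
    n_occ = len(dissimilarities)
    if n_occ == 0:
        return 0.0  # empty neighborhood: no evidence of similarity
    total = sum(1.0 - d / alpha for d in dissimilarities)
    scale = 1.0 / n_occ if use_occupancy else 1.0 / sigma ** 2
    return max(0.0, scale * total)


if __name__ == "__main__":
    deltas = [0.2, 0.4]  # two neighbors in an otherwise empty neighborhood
    print(neighborhood_function(deltas, alpha=0.8, sigma=3))
    print(neighborhood_function(deltas, alpha=0.8, sigma=3, use_occupancy=True))
```

With the 1/σ² scaling, a sparsely populated neighborhood yields a low value even when the few neighbors are similar. With 1/N_occ, the value is the average similarity of the occupied cells, so density no longer matters.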
According to Handl et al. (2006), α determines 
the percentage of patterns on the grid that are rated 
as similar. Choosing a very small value for α 
prevents the formation of clusters on the grid. On the 
other hand, choosing too large a value for α results 
in the fusion of clusters. 
Determining the parameter α is not simple, and its 
choice is highly dependent on the structure of the 
data set. An inadequate value is reflected by 
excessive or extremely low activity on the grid. The 
amount of activity is reflected by the frequency of 
successful ant picking and dropping operations. 
Based on these analyses, Handl et al. (2006) 
proposed an automatic adaptation of α. 
Boryczka (2009) proposed a new scheme for 
adjusting the value of α. 
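An activity-based adaptation of α could look like the sketch below. The text does not give the update rule, so the failure threshold, the step size, the clamping range and the function name are illustrative assumptions, not values taken from Handl et al. (2006) or Boryczka (2009). The sketch also assumes dissimilarities normalized to [0, 1].

```python
def adapt_alpha(alpha, n_failed_drops, n_drop_attempts,
                fail_threshold=0.99, step=0.01):
    """Adjust alpha from the rate of failed drop operations.

    Almost every drop failing means too little activity, which
    suggests alpha is too small, so it is increased. Otherwise
    alpha is decreased. All constants are assumed defaults.
    """
    if n_drop_attempts <= 0:
        return alpha  # nothing observed yet, leave alpha unchanged
    r_fail = n_failed_drops / n_drop_attempts
    if r_fail > fail_threshold:
        alpha += step
    else:
        alpha -= step
    # Keep alpha in (0, 1], assuming dissimilarities lie in [0, 1].
    return min(1.0, max(step, alpha))
```

The function would be called periodically, for example after a fixed number of moves by one ant, with the counters reset after each call.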
Tan et al. (2007) examine the scalar parameter 
of dissimilarity in ant colony approaches to data 
clustering. The authors show that there is no need to 
use an automatic adaptation of α. They propose a 
method to calculate a fixed α for each database. The 
value of α is calculated independently of the 
clustering process. 
To measure the similarity between patterns, 
different metrics are used. Handl et al. (2006) use 
the Euclidean distance for synthetic data and the 
cosine measure for real data. Boryczka (2009) tested 
different dissimilarity measures: the Euclidean, 
cosine and Gower measures. 
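Two of the measures named above can be sketched as follows, assuming patterns are numeric vectors of equal length. The cosine measure is written here as 1 minus the cosine similarity, which is one common convention and an assumption of this sketch. The Gower measure is omitted because it depends on the attribute types of the data.

```python
import math


def euclidean_dissimilarity(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_dissimilarity(a, b):
    """1 - cosine similarity; 0 for parallel vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine measure is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)
```

Unlike the Euclidean distance, the cosine measure ignores vector length, so two patterns that differ only by a scale factor have dissimilarity 0.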
2.2  The Basic Algorithm Proposed by Deneubourg et al. (1991) 
At an initial phase, patterns are randomly scattered 
throughout the grid. Then, each ant randomly 
chooses a pattern to pick and is placed at a random 
position on the grid. 
In the next phase, called the distribution phase, 
an ant is randomly selected at each iteration of a 
simple loop. This ant travels across the grid, taking 
a step of length L in a randomly determined 
direction. According to Handl et al. (2006), using a 
large step size speeds up the clustering process. The 
ant then probabilistically decides whether to drop 
its pattern at this position. 
If the decision to drop the pattern is negative, 
another ant is randomly chosen and the process 
starts over. If the decision is positive, the ant drops 
the pattern at its current position on the grid, if that 
cell is free. If the cell is occupied by another 
pattern, the pattern must be dropped at a free 
neighboring cell found through a random search. 
The ant then seeks a new pattern to pick. 
Among the free patterns on the grid, that is, patterns 
that are not being carried by any ant, the ant 
randomly selects one, goes to its position on the 
grid, evaluates the neighborhood function and 
probabilistically decides whether to pick this 
pattern. This process of choosing a free pattern on 
the grid runs until the ant finds a pattern that should 
be picked. Only then is the distribution phase 
resumed, choosing another ant, until a stop criterion 
is satisfied. 
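The two phases described above can be sketched as follows. This is a minimal illustration under several assumptions not stated in this section: one-dimensional patterns, a toroidal grid, a neighborhood function scaled by the number of neighborhood cells, a fixed number of iterations as the stop criterion, and pick and drop probabilities of the form (k / (k + f))² and (f / (k + f))². All function names, parameter names and default values are hypothetical.

```python
import random


def p_pick(f, k=0.1):
    return (k / (k + f)) ** 2  # assumed pick probability


def p_drop(f, k=0.3):
    return (f / (k + f)) ** 2  # assumed drop probability


def neighborhood_f(grid, data, idx, pos, sigma, alpha, size):
    """Scaled similarity of pattern idx to the patterns around pos."""
    total = 0.0
    for dx in range(-sigma, sigma + 1):
        for dy in range(-sigma, sigma + 1):
            j = grid.get(((pos[0] + dx) % size, (pos[1] + dy) % size))
            if j is not None and j != idx:
                total += 1.0 - abs(data[idx] - data[j]) / alpha
    return max(0.0, total / ((2 * sigma + 1) ** 2 - 1))


def free_cell_near(grid, pos, size, rng):
    """Return pos if free, else a random free cell found nearby."""
    if pos not in grid:
        return pos
    radius = 1
    while True:
        cells = [((pos[0] + dx) % size, (pos[1] + dy) % size)
                 for dx in range(-radius, radius + 1)
                 for dy in range(-radius, radius + 1)]
        free = [c for c in cells if c not in grid]
        if free:
            return rng.choice(free)
        radius += 1  # widen the search until a free cell exists


def ant_clustering(data, size=10, n_ants=3, steps=2000, step_len=3,
                   sigma=1, alpha=0.5, seed=0):
    rng = random.Random(seed)
    cells = [(x, y) for x in range(size) for y in range(size)]
    # Initial phase: scatter the patterns, then load every ant.
    positions = rng.sample(cells, len(data))
    grid = {pos: i for i, pos in enumerate(positions)}
    ants = []
    for i in rng.sample(range(len(data)), n_ants):
        del grid[positions[i]]
        ants.append({"pos": rng.choice(cells), "load": i})
    # Distribution phase.
    for _ in range(steps):
        ant = rng.choice(ants)
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        ant["pos"] = ((ant["pos"][0] + dx * step_len) % size,
                      (ant["pos"][1] + dy * step_len) % size)
        f = neighborhood_f(grid, data, ant["load"], ant["pos"],
                           sigma, alpha, size)
        if rng.random() >= p_drop(f):
            continue  # negative decision: select another ant
        grid[free_cell_near(grid, ant["pos"], size, rng)] = ant["load"]
        while True:  # search for a free pattern to pick
            cell = rng.choice(list(grid))
            idx = grid[cell]
            f = neighborhood_f(grid, data, idx, cell, sigma, alpha, size)
            if rng.random() < p_pick(f):
                del grid[cell]
                ant["pos"], ant["load"] = cell, idx
                break
    return grid, ants
```

In this sketch every ant carries exactly one pattern throughout the distribution phase, so the grid always holds len(data) - n_ants patterns. The sketch requires n_ants < len(data) <= size * size. The returned grid maps occupied cells to pattern indices; the patterns still carried by the ants would have to be dropped before clusters are read off.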
PATTERN CLUSTERING USING ANTS COLONY, WARD METHOD AND KOHONEN MAPS