MULTI-SCALE COMMUNITY DETECTION USING STABILITY
AS OPTIMISATION CRITERION IN A GREEDY ALGORITHM
Erwan Le Martelot and Chris Hankin
Imperial College London, Department of Computing, South Kensington Campus, London SW7 2AZ, U.K.
Keywords:
Community detection, Multi-scale, Multi-resolution, Network analysis, Stability, Modularity, Network
partition, Greedy optimisation, Markov process.
Abstract:
Whether biological, social or technical, many real systems are represented as networks whose structure can be
very informative regarding the original system’s organisation. In this respect the field of community detection
has received a lot of attention in the past decade. Most of the approaches rely on the notion of modularity to
assess the quality of a partition and use this measure as an optimisation criterion. Recently stability was intro-
duced as a new partition quality measure encompassing former partition quality measures such as modularity.
The work presented here assesses stability as an optimisation criterion in a greedy approach similar to modularity optimisation techniques and enables multi-scale analysis using Markov time as a resolution parameter.
The method is validated against synthetic and various real-world networks and compared with other popular approaches; the results show that it enables accurate multi-scale network analysis.
1 INTRODUCTION
In biology, sociology, engineering and beyond, many
systems are represented and studied as graphs, or net-
works (e.g. protein networks, social networks, web).
In the past decade the field of community detection has attracted a lot of interest, community structures being considered important features of real-world networks
(Fortunato, 2010). Given a network of any kind,
looking for communities refers to finding groups of
nodes that are more densely connected internally than
with the rest of the network. The concept considers
the inhomogeneity within the connections between
nodes to derive a partitioning of the network. As op-
posed to clustering methods which commonly involve
a given number of clusters, communities are usually
unknown, can be of unequal size and density and of-
ten have hierarchies (Fortunato, 2010). Finding such
a partitioning can provide information about the under-
lying structure of a network and its functioning. It can
also be used as a more compact representation of the
network, for instance for visualisations.
Detecting community structure in networks can be
split into two subtasks: how to partition a graph, and
how to measure the quality of a partition. The latter is
commonly done using modularity (Newman and Gir-
van, 2004). Partitioning graphs is an NP-hard task
(Fortunato, 2010) and heuristic-based algorithms have
thus been devised to reduce the complexity while
still providing acceptable solutions. Considering the
size of some real-world networks, much effort is put
into finding efficient algorithms, such as modularity
optimisation methods, able to deal with larger and
larger networks. However it has been shown
that networks often have several levels of organisa-
tion (Simon, 1962), leading to different partitions for
each level which modularity optimisation alone can-
not handle (Fortunato, 2010). Methods have been pro-
vided to adapt modularity optimisation to multi-scale
(multi-resolution) analysis using a tuning parameter
(Reichardt and Bornholdt, 2006; Arenas et al., 2008).
Yet the search for a partition quality function that ac-
knowledges the multi-resolution nature of networks
with appropriate theoretical foundations has received
less attention. Recently, stability (Delvenne et al.,
2010) was introduced as a new quality measure for
community partitions. Here we investigate its use as
an optimisation criterion for multi-scale analysis. We
show how stability can be used in place of modularity
in modularity optimisation methods and present a new
greedy agglomerative algorithm using stability as op-
timisation function and enabling multi-scale analysis
using Markov time as a resolution parameter.
The next section reviews related work. Then our
method is presented, followed by experiments assess-
ing its potential, a discussion and a conclusion.
2 BACKGROUND
While several community partition quality measures
have been used (Fortunato, 2010), the most com-
monly found in the literature is modularity (Newman
and Girvan, 2004). Given a partition into c communities, let e be the community matrix of size c × c where each e_{ij} gives the fraction of links going from a community i to a community j, and a_i = \sum_j e_{ij} the fraction of links connected to i. (If the graph is undirected, each e_{ij} not on the diagonal should be given half of the edges connecting communities i and j so that the number of edges connecting the communities is given by e_{ij} + e_{ji} (Newman and Girvan, 2004).) Modularity Q_M is the sum over communities of the difference between the fraction of links within a community and the expected value of that fraction if edges were randomly placed:

Q_M = \sum_{i=1}^{c} (e_{ii} - a_i^2)    (1)
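As an illustration, equation (1) can be computed directly from an adjacency matrix and a community label per node. The following Python sketch (our own naming, not the paper's Matlab implementation) builds the community matrix e and returns Q_M.

import numpy as np

def community_matrix(A, labels):
    """Community matrix e of equation (1): e[i, j] is the fraction of edge
    weight going from community i to community j."""
    A, labels = np.asarray(A, dtype=float), np.asarray(labels)
    comms = np.unique(labels)
    H = (labels[:, None] == comms[None, :]).astype(float)  # node-community indicator
    return H.T @ A @ H / A.sum()                            # A.sum() = 2m for undirected A

def modularity(A, labels):
    """Modularity Q_M = sum_i (e_ii - a_i^2), with a_i the row sums of e."""
    e = community_matrix(A, labels)
    a = e.sum(axis=1)
    return np.trace(e) - np.sum(a ** 2)

# Toy check: two triangles joined by a single edge form two obvious communities.
A = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1
print(round(modularity(A, [0, 0, 0, 1, 1, 1]), 3))  # 0.357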
One advantage of modularity is that it imposes no constraint on the shape of communities, as opposed for instance to the clique percolation method (Palla et al., 2005) that defines communities as adjacent k-cliques, thus imposing that each node in a community is part of a k-clique.
Modularity was initially introduced to evaluate
partitions. However its use has broadened from
partition quality measure to optimisation function
and modularity optimisation is now a very common
approach to community detection (Newman, 2004;
Clauset et al., 2004; Newman, 2006; Blondel et al.,
2008). (Recent reviews and comparisons of com-
munity detection methods including modularity op-
timisation methods can be found in (Fortunato, 2010;
Lancichinetti and Fortunato, 2009).) Modularity op-
timisation methods commonly start with each node
placed in a different community and then successively
merge the communities that maximise modularity at
each step. Modularity is thus locally optimised at
each step based on the assumption that a local peak
should indicate a particularly good partition. The first
algorithm of this kind was Newman’s fast algorithm
(Newman, 2004). Here, for each candidate partition
the variation in modularity \Delta Q_M that merging two communities i and j would yield is computed as

\Delta Q_{M_{ij}} = 2(e_{ij} - a_i a_j)    (2)

where i and j are the communities merged in the new candidate partition. Computing only \Delta Q_M minimises
the computations required to evaluate modularity and
leads to the fast greedy algorithm given in Algorithm 1.

Algorithm 1: Greedy algorithm sketch for modularity optimisation.
1. Divide the network in as many clusters as there are nodes.
2. Measure the modularity variation \Delta Q_M for each candidate partition where a pair of clusters is merged.
3. Select the partition with the highest \Delta Q_M.
4. Go back to step 2.

This algorithm enables the incremental building of a hierarchy where each new partition is the local optimum maximising \Delta Q_M
at each step. It was shown
to provide good solutions with respect to the original
Girvan-Newman algorithm that performs accurately
but is computationally highly demanding and is thus
not suitable for large networks (Newman and Girvan,
2004). (Note that accuracy refers in this context to a
high modularity value. Other measures, such as sta-
bility, might rank partitions differently.) Since then
other methods have been devised, such as that of (Clauset
et al., 2004) optimising the former method, an
approach based on the eigenvectors of matrices (New-
man, 2006) or the Louvain method (Blondel et al.,
2008). The last method has been shown to outperform
previous greedy modularity optimisation methods in
speed by reducing the number of intermediate steps
needed to aggregate all the nodes.
These methods all rely on modularity optimisa-
tion. Yet modularity optimisation suffers from sev-
eral issues. One issue is known as the resolution limit
meaning that modularity optimisation methods can
fail to detect small communities or over-partition net-
works (Fortunato and Barthélemy, 2007) thus miss-
ing the most natural partitioning of the network. An-
other issue is that the modularity landscape admits a
large number of structurally different high-modularity
value solutions and lacks a clear global maximum
value (Good et al., 2010). It has also been shown
that random graphs can have a high modularity value (Guimerà et al., 2004).
Some biases have been introduced to alter the be-
haviour of the method towards communities of var-
ious sizes (Danon et al., 2006; Reichardt and Born-
holdt, 2006; Arenas et al., 2008). In (Danon et al.,
2006), the authors observed that large communities
are favoured at the expense of smaller ones biasing
the partitioning towards a structure with a few large
clusters which may not be an accurate representation
of the network. They provided a normalised measure
of \Delta Q_M defined as

\Delta Q'_{M_{ij}} = \max\left( \frac{\Delta Q_{M_{ij}}}{a_i}, \frac{\Delta Q_{M_{ij}}}{a_j} \right)    (3)
which aims at treating communities of different sizes
equally. In (Reichardt and Bornholdt, 2006), modu-
larity optimisation is modified by using a scalar pa-
rameter γ in front of the null term (the fraction of
edges connecting vertices of a same community in a
random graph) turning equation (1) into
Q_{M_\gamma} = \sum_i (e_{ii} - \gamma a_i^2)    (4)
where γ can be varied to alter the importance given
to the null term (modularity optimisation is found for
γ = 1). In (Arenas et al., 2008), modularity optimi-
sation is performed on a network where each node’s
strength has been reinforced with self loops. Consid-
ering the adjacency matrix A, modularity optimisation
is performed on A + rI where I is the identity matrix
and r is a scalar. Varying the value of r enables the
detection of communities at various coarseness levels
(modularity optimisation is found for r = 0). With
their resolution parameter, the two latter methods en-
able a multi-scale network analysis.
Significant attention has been given to modularity
but little to new partition
quality measures. Recently, stability was introduced
in (Delvenne et al., 2010) as a new partition quality
measure unifying some known clustering heuristics
including modularity and using Markov time as an in-
ner resolution parameter. The stability of a graph con-
siders the graph as a Markov chain where each node
represents a state and each edge a possible state tran-
sition. Let n be the number of nodes, m the number
of edges, A the n × n adjacency matrix containing the
weights of all edges (the graph can be weighted or
not), d a size n vector giving for each node its degree
(or strength for a weighted network) and D = diag(d)
the corresponding diagonal matrix. The stability of a
graph considers the graph as a Markov chain where
each node represents a state and each edge a possible
state transition. The chain distribution is given by the
stationary distribution π =
d
2m
. Let also Π be the cor-
responding diagonal matrix Π = diag(π). The transi-
tion between states is given by the n×n stochastic ma-
trix M = D
1
A. Assuming a community partition, let
H be the indicator matrix of size n × c giving for each
node its community. The clustered auto-covariance
matrix at Markov time t is defined as:
R
t
= H
T
(ΠM
t
π
T
π)H (5)
Stability at time t, noted Q_{S_t}, is given by the trace of R_t, and the global stability measure Q_S considers the minimum value of Q_{S_t} over time from time 0 to a given upper bound τ:

Q_S = \min_{0 \le t \le \tau} trace(R_t)    (6)
This model can be extended to deal with real values
of t by using the linear interpolation:
R_t = (c(t) - t) \cdot R(f(t)) + (t - f(t)) \cdot R(c(t))    (7)
where c(t) returns the smallest integer greater than t
and f (t) returns the greatest integer smaller than t.
This is useful to investigate for instance time values
between 0 and 1. It was indeed shown in (Delvenne
et al., 2010) that the use of Markov time with values
between 0 and 1 enables detecting finer partitions than
those detected at time 1 and above.
Also, this model can be turned into a continuous-time Markov process by using the expression e^{(M-I)t} in place of M^t (where e is the exponential function)
(Delvenne et al., 2010).
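To make the definition concrete, the following Python sketch (our own illustrative code, assuming the discrete-time model and integer Markov times, not the authors' implementation) computes R_t from equation (5) and the stability of equation (6).

import numpy as np

def stability(A, labels, times):
    """Stability Q_S = min_t trace(R_t), with R_t = H^T (Pi M^t - pi^T pi) H,
    evaluated over the given integer Markov times."""
    A, labels = np.asarray(A, dtype=float), np.asarray(labels)
    d = A.sum(axis=1)                     # node degrees (strengths if weighted)
    pi = d / d.sum()                      # stationary distribution d / 2m
    M = A / d[:, None]                    # transition matrix D^-1 A
    comms = np.unique(labels)
    H = (labels[:, None] == comms[None, :]).astype(float)
    traces = []
    for t in times:
        Mt = np.linalg.matrix_power(M, int(t))
        R = H.T @ (np.diag(pi) @ Mt - np.outer(pi, pi)) @ H
        traces.append(np.trace(R))
    return min(traces)

At t = 1 this returns the modularity of the partition. For the continuous-time model, M^t would be replaced by the matrix exponential e^{(M-I)t} (e.g. via scipy.linalg.expm), and non-integer times handled with the interpolation of equation (7).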
Stability has been introduced as a measure to eval-
uate the quality of a partition hierarchy and has been
used to assess the results of various modularity op-
timisation algorithms. Further mathematical founda-
tions have been presented in (Lambiotte et al., 2008;
Lambiotte, 2010). The work presented here investi-
gates stability as an optimisation function with var-
ious practical test cases. It presents and assesses a
greedy algorithm that optimises stability similarly to
modularity optimisation methods and uses stability’s
inner Markov time as a resolution parameter. While
related approaches such as Arenas et al’s and Re-
ichardt et al’s methods also offer a multi-scale anal-
ysis with their respective parameters, these methods
offer a tuneable version of modularity optimisation
by modifying the importance of the null factor or by
adding self-loops to nodes. Such analysis remains
based on a one step random walk analysis of the net-
work with modifications of its structure. Stability op-
timisation enables on the contrary random walks of
variable length defined by the Markov time thus ex-
ploiting thoroughly the actual topology of the net-
work. As communities reflect the organisation of
a network, and hence its connectivity, this approach
seems to be more suitable. The next section presents
the method followed by experiments assessing it and
comparing it to other relevant approaches.
3 METHOD
As discussed in (Delvenne et al., 2010), the measure of
stability and the investigation of the stability of a partition
along the Markov time (the stability curve)
can help address the partition scale issue and the
identification of optimal communities. The authors'
results indeed show with the stability curve that
the clustering varies depending on the time window
over which the Markov time is considered. From
there, our work uses the Markov time as a resolution
parameter in an optimisation context where stability
is used as the optimisation criterion.
Considering the Markov chain model, it has been
shown in (Delvenne et al., 2010) that stability at time
1 is modularity. From equation (5) it can also be
derived that stability at time t is the modularity of a graph whose adjacency matrix is A_t = D M^t (or A_t = D e^{(M-I)t} for the continuous-time model).
Considering stability at time t as the modularity of a graph given by the adjacency matrix A_t allows us to apply the work done on modularity optimisation to stability. Stability optimisation then becomes a broader measure where modularity is the special case t = 1 in the Markov chain. Using the modularity notation from equation (1), stability at time t can be defined as:

Q_{S_t} = \sum_i (e^t_{ii} - a_i^2)    (8)

where e^t is the community matrix for Markov time t computed from A_t.
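This equivalence can be illustrated with a short sketch (ours, for integer times, not the paper's Matlab code) that builds A_t = D M^t, derives the community matrix e^t from it and evaluates equation (8); for t = 1 it reduces to the plain modularity of A.

import numpy as np

def adjacency_at_time(A, t):
    """A_t = D M^t with M = D^-1 A: the graph whose modularity is Q_{S_t}."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    M = A / d[:, None]
    return np.diag(d) @ np.linalg.matrix_power(M, int(t))

def stability_at_time(A, labels, t):
    """Q_{S_t} of equation (8): the modularity of A_t with community matrix e^t."""
    At = adjacency_at_time(A, t)
    labels = np.asarray(labels)
    comms = np.unique(labels)
    H = (labels[:, None] == comms[None, :]).astype(float)
    e_t = H.T @ At @ H / At.sum()         # community matrix e^t (At.sum() = 2m)
    a = e_t.sum(axis=1)                   # a_i, identical for every t
    return np.trace(e_t) - np.sum(a ** 2)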
Modularity optimisation is based on computing
the change in modularity between an initial partition
and a new partition where two clusters have been
merged. The change in modularity when merging
communities i and j is given by equation (2) and sim-
ilarly the change in stability at time t is
\Delta Q_{S_{t,ij}} = 2(e^t_{ij} - a_i a_j)    (9)
Following equation (6) the new Q_S value Q'_S is:

Q'_S = \min_{0 \le t \le \tau} (Q_{S_t} + \Delta Q_{S_t})    (10)
At each clustering step, the partition with the best Q'_S value is kept and Q_S is then updated as Q_S = Q'_S. For
computational reasons the time needs to be sampled
between 0 and τ. Markov time can be sampled lin-
early or following a log scale. The latter is usually
more efficient for large time intervals.
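For instance, a log-scale sampling of the time window could be generated as follows (an illustrative choice of sampling values, not one prescribed by the method):

import numpy as np

# s Markov time samples between t_min and tau, denser at small times (log scale)
t_min, tau, s = 0.1, 100.0, 20
times = np.logspace(np.log10(t_min), np.log10(tau), num=s)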
The matrices e^t are computed in the initialisation
step of the algorithm and then updated by succes-
sively merging the lines and columns corresponding
to the merged communities. This leads to the greedy
stability optimisation (GSO) algorithm given in Algo-
rithm 2, based on the principle of Algorithm 1.
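A condensed sketch of this greedy loop is given below, using our own naming and dense matrices for clarity rather than the paper's Matlab implementation; it maintains one community matrix e^t per sampled (integer) Markov time and repeatedly merges the pair of communities maximising the minimum of Q_{S_t} + ΔQ_{S_t} over the window. For simplicity it scans all pairs of communities rather than only pairs linked by an edge as in Algorithm 2.

import numpy as np

def greedy_stability(A, times):
    """Greedy stability optimisation (sketch): start from singleton communities
    and repeatedly merge the pair maximising min_t (Q_{S_t} + dQ_{S_t})."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    d = A.sum(axis=1)
    M = A / d[:, None]
    # community matrices e^t at each (integer) Markov time, nodes start alone
    mats = [np.diag(d) @ np.linalg.matrix_power(M, int(t)) / d.sum() for t in times]
    comms = [[v] for v in range(n)]
    best_Q = min(np.trace(e) - np.sum(e.sum(axis=1) ** 2) for e in mats)
    best_comms = [list(c) for c in comms]
    while len(comms) > 1:
        best = None                                     # (Q, i, j) of the best merge
        for i in range(len(comms)):
            for j in range(i + 1, len(comms)):
                QV = []
                for e in mats:
                    a = e.sum(axis=1)
                    q = np.trace(e) - np.sum(a ** 2)    # current Q_{S_t}
                    QV.append(q + 2 * (e[i, j] - a[i] * a[j]))  # + merge delta
                if best is None or min(QV) > best[0]:
                    best = (min(QV), i, j)
        Q_loop, i, j = best
        for k in range(len(mats)):                      # merge j into i in each e^t
            e = mats[k]
            e[i, :] += e[j, :]
            e[:, i] += e[:, j]
            mats[k] = np.delete(np.delete(e, j, axis=0), j, axis=1)
        comms[i] += comms[j]
        del comms[j]
        if Q_loop > best_Q:
            best_Q, best_comms = Q_loop, [list(c) for c in comms]
    return best_comms, best_Q

# Toy example: two triangles joined by one edge; the two triangles are expected
# as the best partition for a small integer time window.
A = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1
print(greedy_stability(A, times=[1, 2]))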
Depending on the Markov time boundaries the
partitions will vary as the larger the time window, the
longer a partition must keep a high stability value to
get a high overall stability value, as defined in equa-
tion (6). The Markov time thus acts as a resolution
parameter.
Compared to Newman’s fast algorithm the addi-
tional cost of stability computation and memory re-
quirement is proportional to the number of times con-
sidered in the Markov time window. For each time
t considered in the computation, a matrix e^t must be
computed and kept in memory. Let n be the number
of nodes in a network, m the number of edges and s
the number of time steps required for stability com-
putation. The number of merge operations needed to terminate is n − 1. Each merging operation requires iterating through all edges, hence m times, and for each edge computing the stability variation s times.
The computation of each \Delta Q_{S_t} can be performed in constant time. In a non-optimised implementation the merging of communities i and j can be performed in n steps, hence the complexity of the algorithm would be O(n(m·s + n)). However the merging of communities
i and j really consists in adding the edges of commu-
nity j to i and then deleting j. To do so there is one
operation per edge. The size of a cluster at iteration i is given by cs(i) = n/(n − i) and therefore the average cluster size over all iterations is

\bar{cs} = \frac{1}{n} \cdot \sum_{i=1}^{n} cs(i) = \sum_{i=1}^{n} \frac{1}{i} \approx \ln(n) + \gamma    (11)

with γ the Euler-Mascheroni constant. As the number
of edges is bounded by the number of nodes squared,
the algorithm can be implemented with the complexity O(n(m·s + ln²(n))) provided the appropriate data structures are used to exploit this, as discussed in (Clauset et al., 2004). Considering that s should be low, the complexity is O(n(m + ln²(n))).
Also a large time window does not imply many
time values. The speed-accuracy trade-off comes
from the number of steps s within the boundaries,
whatever the boundaries. While the full mathematical
definition of stability considers all Markov times in a
given interval, all Markov times may not be crucial
to a good (or even exact) approximation of stability.
The fastest way to approximate stability is to compute
it with only one time value. As stability tends to de-
crease as the Markov time increases, we are seeking
when the following approximation can be made:
Q_S = \min_{0 \le t \le \tau} trace(R_t) \approx trace(R_\tau)    (12)
The need for considering consecutive time values in
the computation of stability addresses an issue en-
countered within random walks. Consider for instance
a graph with three nodes a, b and c, with an
edge between nodes a and b and between nodes b and
c. Using a Markov time of 2 only (i.e. a random walk
of 2 steps with no consideration of the first step) start-
ing from a there would be no transition between a and
b as after one step from a the random walker would
be in b and then it could only go back to a or walk
to c. However, the more densely connected the clus-
ters, the less likely this situation is to happen as many
paths can be borrowed to reach each node. This time
optimisation is assessed with the rest of the method in
the next section.
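The situation described above can be checked numerically: for a path graph a–b–c, the two-step transition matrix gives a zero probability of going from a to b (a minimal check of our own, not code from the paper).

import numpy as np

# Path graph a - b - c
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
M = A / A.sum(axis=1, keepdims=True)     # random-walk transition matrix
M2 = M @ M                               # two-step transitions
print(M2[0, 1])                          # 0.0: no a -> b transition in exactly 2 steps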
4 EXPERIMENTS & RESULTS
To assess our method we use various known networks
made of real or synthetic data that have been used in
related work and are publicly available. For compari-
son, other relevant methods are tested along with our
method: Newman’s fast algorithm (Newman, 2004),
Danon et al’s method (Danon et al., 2006), the Lou-
vain method (Blondel et al., 2008), Reichardt et al’s
method (Reichardt and Bornholdt, 2006), and Arenas
et al’s method (Arenas et al., 2008). Both Reichardt
et al’s and Arenas et al’s methods use a resolution pa-
rameter that enables multi-scale community detection
based on this parameter. The Louvain method also
returns a succession of partition (not tuneable). The
other two methods are not multi-scale and hence only
return one partition.
The community detection algorithms were implemented in Matlab (1), except for the code of the Louvain method, downloaded from the authors' website (2) (we used their hybrid C++/Matlab implementation).
All experiments were run using Matlab R2010b un-
der MacOS X on an iMac 3.06GHz Intel Core i3.
4.1 Networks
The networks considered are two synthetic and four
real-world data networks (3) that have been used as
benchmarks in the literature to assess community de-
tection algorithms. The networks have been chosen
for their respective properties (e.g. multi-scale, scale-
free) and popularity that enable an assessment of
our method and comparisons with other approaches.
While selecting very large networks can demonstrate
speed efficiency, the results are commonly ranked us-
ing modularity which as previously discussed is not
suitable for this work. We therefore deemed it appropriate
to use here networks of smaller size with some
knowledge of their structure or content used for the
evaluation of the results, similarly to what has been
done in related work (Reichardt and Bornholdt, 2006;
Arenas et al., 2008).
(1) The code developed for this work is available on request. More also at http://www.elemartelot.org.
(2) http://sites.google.com/site/findcommunities/
(3) Available from http://www-personal.umich.edu/~mejn/netdata/
4.1.1 Synthetic Datasets
Ravasz et al’s Scale-free Hierarchical Network.
This network was presented in (Ravasz and Barabási,
2003) and defines a hierarchical network of 125 nodes
as shown in Figure 1(a). The network is built itera-
tively from a small cluster of 5 densely linked nodes.
A node at the centre of a square is connected to 4 oth-
ers at the corners of the square themselves also con-
nected to their neighbours. Then 4 replicas of this
cluster are generated and placed around the first one,
producing a 25 nodes network. The centre of the cen-
tral cluster is linked to the corner nodes of the other
clusters. This process is repeated again to produce the
125 nodes network. The structure can be seen as 25
clusters of 5 nodes or 5 clusters of 25 nodes.
Arenas et al’s Homogeneous in Degree Network.
This network taken from (Arenas et al., 2006) and
named H13-4 is a network of 256 nodes with two hierarchical levels, organised as follows: 13 edges per node are distributed between the nodes of each of the 16 communities of the first level (internal communities) of 16 nodes each; 4 edges per node are distributed between nodes of each of the 4 communities of the second level (external communities) of 64 nodes each; and 1 edge per node links it with any community of the rest of the network. The network is presented in Figure 1(b).
Figure 1: (a) Ravasz et al's hierarchical scale-free network generated in 3 steps, producing 125 nodes at step 3 (Ravasz and Barabási, 2003). (b) The H13-4 network presented in (Arenas et al., 2006), made of 256 nodes organised in two hierarchical levels with 16 communities of 16 nodes at the first level and 4 communities of 64 nodes at the second level.
4.1.2 Real-world Datasets
Zachary’s Karate Club. This network is a social
network of friendships between 34 members of a
karate club at a US university in 1970 (Zachary,
1977). Following a dispute the network was divided
into 2 groups between the club’s administrator and the
club’s instructor. The dispute ended in the instructor
creating his own club and taking about half of the ini-
tial club with him. The network can hence be divided
into 2 main communities. A division into 4 communi-
ties has also been acknowledged (Medus et al., 2005).
Algorithm 2: Greedy stability optimisation (GSO) algorithm taking in input an adjacency matrix and a Markov time window, and returning a partition and its stability value.

Divide in as many communities as there are nodes
Set this partition as current partition C_cur and as best known partition C
Set its stability value as best known stability Q
Set its stability vector (stability values at each Markov time) as current stability vector QV
Compute initial community matrix e
Compute initial community matrices at Markov times e^t
while at least 2 communities are left in the current partition: length(e) > 1 do
    Initialise best loop stability Q_loop
    for all pairs of communities with edges linking them: e_ij > 0 do
        for all times t in the time window do
            Compute dQV(t) ← ΔQ_{S_t}
        end for
        Compute partition stability vector: QV_tmp ← QV + dQV
        Compute partition stability value by taking its minimum value: Q_tmp ← min(QV_tmp)
        if current stability is the best of the loop: Q_tmp > Q_loop then
            Q_loop ← Q_tmp
            QV_loop ← QV_tmp
            Keep in memory best pair of communities (i, j)
        end if
    end for
    Compute C_cur by merging the communities i and j
    Update all community matrices e and e^t by merging rows i and j and columns i and j
    Set current stability vector to best loop stability vector: QV ← QV_loop
    if best loop stability higher than best known stability: Q_loop > Q then
        Q ← Q_loop
        C ← C_cur
    end if
end while
return best found partition C and its stability Q
Lusseau et al’s Dolphins Social Network. This
network is an undirected social network resulting
from observations of a community of 62 bottle-nose
dolphins over a period of 7 years (Lusseau et al.,
2003). Nodes represent dolphins and edges represent
frequent associations between dolphin pairs occurring
more often than expected by chance. Analysis of the
data revealed 2 main groups and a further division can
be made into 4 groups (Lusseau and Newman, 2004).
American College Football Dataset. This dataset
contains the network of American football games
(Girvan and Newman, 2002). The 115 nodes repre-
sent teams and the edges represent games between
2 teams. The teams are divided into 12 groups con-
taining around 8-12 teams each and games are more
frequent between members of the same group. Also
teams that are geographically close but belong to dif-
ferent groups are more likely to play one another than
teams separated by a large distance. Therefore in this
dataset the groups can be considered as known com-
munities.
Les Misérables. This dataset taken from (Knuth,
1993) represents the co-appearance of 77 characters
in Victor Hugo's novel Les Misérables. Two nodes
share an edge if the corresponding characters appear
in a same chapter of the book. The values on the edges
are the number of such co-appearances.
4.2 Results
The networks are analysed by the 5 aforementioned
community detection methods and our stability opti-
misation algorithm for which both discrete and con-
tinuous Markov time versions are used. Also two time
window setups are considered to investigate the ap-
proximation from equation (12): one that considers a
time window [0, τ] and another that considers only τ.
Table 1 provides the results of the community de-
tection methods that have no resolution parameter.
The results show that these methods do not necessar-
ily find the most meaningful partitions and that differ-
ent methods can identify different partitions. Yet this
does not imply that an unexpected result is meaning-
less. For example in Ravasz et al’s network, the cen-
tral node of the central community has many more
connections than the other nodes and more connec-
tions outside its community than inside. Therefore
this node can be seen as a community on its own.
Figure 2 plots the results of the tuneable meth-
Table 1: Number of detected communities by fast Newman, Louvain and Danon et al’s methods on the presented networks.
The identified division(s) known for these networks are also indicated (’-’ indicates that there is no clear a priori knowledge).
The Louvain method returns a hierarchy of partitions, given in order.
Algorithm Ravasz H13-4 Karate Dolphins Football Misérables
Fast Newman 6 4 3 4 6 5
Danon 6 4 4 4 6 6
Louvain 30, 10, 6 12, 4 6, 4 10, 5 12, 10 9, 6
Identified 5, 25 4, 16 2, 4 2, 4 12 -
Figure 2: Number of communities detected by Reichardt et al's, Arenas et al's and our stability optimisation methods, plotted against their respective parameters (log-log axes: parameter vs. number of communities) for (a) Ravasz et al's network, (b) the hierarchical 13-4 network, (c) the karate club network, (d) the dolphins social network, (e) the American football network and (f) the Les Misérables characters network. The x-axis represents 10γ for Reichardt et al's method, r − r_0 for Arenas et al's and t for ours. The setup using a time window is noted [0,t] while the setup using only one time value is noted t. The discrete-time Markov model (Markov chain) is noted (Dis) and the continuous-time Markov model is noted (Con). The dotted horizontal lines indicate known partition sizes: (a) 5 and 25, (b) 4 and 16, (c) 2 and 4, (d) 2 and 4, (e) 12.
ods along their respective parameter values. For Re-
ichardt et al’s algorithm, the x-axis represents 10γ.
For Arenas et al’s the x-axis represents r r
0
where
r
0
=
m
n
with m the number of edges and n the num-
ber of nodes. (See (Arenas et al., 2008) for details.
The authors use the lower bound r
asymp
=
2m
n
but
here we found that values of r below r
0
were irrele-
vant.) The value of r
0
is calculated for each network.
For our algorithm, the x-axis represents the Markov
time t. For the time window setup the sampling is
done from time 0 to 100 with a step of 0.05 between
within [0, 2], a step of 0.25 within [2, 10] and a step
of 1 afterwards. The steps between successive values
of the parameter are 0.05 for Arenas and
0.05
10
= 0.005
for Reichardt et al’s (as the x-axis represents 10γ).
Considering only stability optimisation we can ob-
serve that the two Markov processes behave in a very
similar way. We can also observe that the difference
between the runs considering the time window and
those considering only its upper bound is minimal.
The curves are similar or overlapping thus suggesting
that the approximation from equation (12) holds.
From Figure 2 it can be observed that the be-
haviour of our method is opposite to those of the two
other tuneable methods. At low values of t communities
are small (at t = 0 our method finds as many communities
as there are nodes). Then as t increases, communities
tend to become larger. Conversely, the two other
methods start by partitioning in large clusters when
their parameter is minimal. Then they find finer par-
titions as their parameter increases. As the discrete
and continuous time versions of our algorithm have a
very similar behaviour, and as the time optimised se-
tups almost overlap with the full time interval setups,
we will only comment on the discrete time version.
Considering Figure 2(a) for instance, after t = 4,
the result of our algorithm remains stable on 5 clus-
ters that represent the most stable partition. A parti-
tion in 6 elements can also be found and corresponds
to the 5 large communities with the centre of the cen-
tral community in a separate community. Also, when
t ∈ [0.1, 0.2] the algorithm finds a partition in 26 communities
(25 small communities plus the central node on
its own). Arenas et al's method detects the 5 communities
around r − r_0 ∈ [1.5, 2.5]. Then it grows
to stabilise on 25 communities around r − r_0 ∈ [9, 25]
and then goes up to 26 communities and more. Reichardt
et al's stabilises around 5 communities for
γ ∈ [0.5, 0.9] and around 26 communities from γ =
3.6 and onwards. Looking at the stability optimisa-
tion and Reichardt et al’s method compared to Are-
nas et al's method, the two former tend to stabilise at
intermediate partitions one community larger than those
detected by the latter. By using the resistance pa-
rameter r, Arenas et al’s method alters the impact of
edge weights across the network. In this instance this
blends the central node into the small central com-
munity. However when considering the partition in
25 communities, this central node has 4 connections
with all communities, including its own. By adding
it to any community, this community would gain 4
edges pointing inside and 80 pointing outside. There-
fore it is ambiguous whether this node should belong
to the small central community or any of the others.
This is handled in our method by keeping this node in
a separate community until it is clear that it belongs to
the central community of the 5 communities partition
(it then shares 20 edges inside its own community and
16 edges with each of the 4 others).
On Figure 2(b), the intended partitions in 16 and
then 4 communities are clearly detected. As expected
the most stable partition is the partition in 4 communi-
ties, as indicated by the stability optimisation methods
with the long stretch of time settling on this partition
compared to the shorter plateau for 16 communities.
Considering Figure 2(c), our algorithm quickly
settles on the expected partition into 2 communities, found by Arenas
et al's for about r − r_0 ∈ [0.7, 2] and by Reichardt et
al's for about γ ∈ [0.4, 0.8]. It also consistently set-
tles beforehand on the partition in 4 communities, re-
vealing the relevance of this partition, as suggested by
other analysis (Medus et al., 2005).
Regarding Figure 2(d), our algorithm settles on a partition into 2 communities, as expected from the results of (Lusseau and Newman, 2004) that analysed the dataset using modularity. Arenas et al's solution for r − r_0 ∈ [0.7, 1.2] and Reichardt et al's solution for γ ∈ [0.2, 0.5] correspond to the partition of (Lusseau and Newman, 2004). Our algorithm also settles briefly on a partition into 4 communities for t ∈ [0.5, 3) excluding t = 1.5, which is another relevant division size of the network (Lusseau and Newman, 2004).
On Figure 2(e) we can observe several scales of
relevance. Based on the knowledge of the teams distribution,
a partition into 12 communities is expected. Such
a partition is detected at an early time (t = 0.3) with
our method and is the first plateau. A normalised mu-
tual information value of 0.919 compared with the
12 known groups can be found by our method on
this plateau. Most of the nodes are therefore placed
into the right communities. Regarding the remaining
nodes, (Khadivi et al., 2011) explains that some nodes
do not fit in the expected classification. Therefore
other divisions can also be of relevance. While stability
settles on a few plateaus, the two other methods
tend to detect many intermediate partitions over short parameter ranges.
As the time grows communities also grow bigger and
get more stable. A partition into 3 communities is
consistently identified followed by a partition with 2
communities. Analysing the former we find that it reflects
the geographical locations of the teams, consistent with
the fact that teams located geographically closer
are more likely to play one another. This partition sep-
arates the country roughly into West, South-east and
North-east. Then the partition with 2 communities di-
vides the country into West and East. Therefore the
successive stable partitions reflect the organisation of
the teams, first locally with the 12 communities par-
tition and then nationally with the partitions in 3 and
2 communities (other partitions in between may also
reflect smaller geographical divisions). Another anal-
ysis could consider the large and stable communities
(e.g. of size 2 or 3) and sub-partition them to analyse
the games distribution at a smaller geographical scale.
Analysing the network of the characters from Les
Misérables, several divisions appear. Considering sta-
bility optimisation 2 main partitions appear, as shown
on Figure 2(f), while the other methods detect more
partitions on short intervals. The first one consistently
identified by our method contains 5 communities and
the second one contains 3 communities. In the parti-
tion into 5 communities the first is a central commu-
nity containing most of the main plot characters such
as Valjean, Javert, Cosette, Marius or the Thénardiers.
The second community relates to the story of Fantine,
the third one relates to Mgr Myriel, the fourth one
relates to Valjean’s story as a prisoner and contains
other convicts. The fifth one relates to Gavroche, an-
other main character. Considering the partition in 3
communities, the central community is merged with
the fourth community (convicts, judge, etc.) and the
third community. This merged community mainly represents
characters connected to Valjean at some moment of his
story. The second community remains as well as the
fifth one with Marius now part of it.
These results highlight the fact that different meth-
ods provide different approaches and solutions to the
problem of community partitions. Consistent or sta-
ble partitions are used to identify relevant divisions
in a network. Considering the three tuneable meth-
ods the results show that stability optimisation tends
to stabilise on fewer partitions and often more con-
sistently than the other two methods. This is useful
to identify the most relevant partitions and hence inform
about network structure in the absence of a pri-
ori knowledge. To better detect stable partitions the
normalised mutual information (NMI) (Fred and Jain,
2003) between successive partitions (e.g. found at
times t and t + dt) was also sometimes used. A per-
sistent high NMI value confirms that successive parti-
tions of same size are indeed the same or similar, thus
ruling out the possibility of having different partitions
that happen to have the same number of communities.
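As an illustration of this check, the NMI between two partitions encoded as label vectors can be computed as in the sketch below (a generic implementation of the measure of (Fred and Jain, 2003), not the code used for the experiments); identical partitions, up to a relabelling of the communities, give a value of 1.

import numpy as np

def nmi(x, y):
    """Normalised mutual information between two partitions given as label vectors."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    cx, cy = np.unique(x), np.unique(y)
    # contingency table: joint counts of community memberships
    cont = np.array([[np.sum((x == a) & (y == b)) for b in cy] for a in cx], dtype=float)
    pxy = cont / n
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    mi = np.sum(pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz]))
    hx = -np.sum(px[px > 0] * np.log(px[px > 0]))
    hy = -np.sum(py[py > 0] * np.log(py[py > 0]))
    return 2 * mi / (hx + hy)            # normalisation by H(X) + H(Y)

print(nmi([0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0]))  # 1.0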
5 CONCLUSIONS
This work investigated stability as an optimisation
criterion for a greedy approach similar to the one
used in (Newman, 2004; Danon et al., 2006; Re-
ichardt and Bornholdt, 2006; Arenas et al., 2008).
The results showed that our method enables accurate
multi-scale analysis and tackles the problem differ-
ently than other methods by finding stable partitions
over a Markov time which is part of the definition of
stability. The method was tested against various net-
works and compared to five relevant community de-
tection algorithms. Stability optimisation can be seen
as an extension of modularity optimisation that does
not alter the graph (like Arenas et al’s) or the impor-
tance of the null factor (like Reichardt et al’s) but in-
stead exploits the graph by interpreting it as a Markov
process. By analysing it over different time scales it
explores the graph thoroughly by considering paths
of various lengths between nodes where each consid-
ered Markov time defines a path length. This analysis
can be performed without complexity increase com-
pared to modularity optimisation methods such as fast
Newman’s algorithm. Two Markov processes have
been tested for our method, one based on a Markov
chain model of a network and the other one extending
this model to a continuous time Markov process. Ex-
periments showed that both models behave similarly
considering the cases studied in this work. They also
showed that stability can be optimised with almost no
loss of accuracy by only using the upper bound of a
Markov time interval, as opposed to the whole inter-
val suggested by the mathematical definition of stabil-
ity. This heuristic provides a significant gain in speed.
The results showed that multiple levels of organisation
are clearly identified when optimising sta-
bility over time. Stability optimisation tends to set-
tle for longer on fewer partitions than other related
approaches considered here, thus better highlighting
the partitions of relevance. Stability optimisation also
converges towards large and stable clusters. This be-
haviour also differs from those of other approaches
and in the absence of a priori knowledge our method
has therefore the advantage of leading to stable and
relevant communities from where a deeper analysis
could be performed in each community subgraph.
The complexity of our method with the time-optimised heuristic is in O(n(m + ln²(n))), which compares to similar approaches using modularity that scale to large networks. Therefore our method should scale up very well to large networks.
Further work will consider a randomised algorithm,
similar to the randomised modularity optimisation
algorithm presented in (Ovelgönne et al., 2010),
that would reduce the complexity to O(n · ln²(n)).
Another possible optimisation is a multi-step ap-
proach as presented in (Schuetz and Caflisch, 2008)
for modularity optimisation. This work can also be
applied to detecting overlapping communities by us-
ing the line graph of the initial graph as in (Pereira-
Leal et al., 2004), thus working on link communities
(Ahn et al., 2010). Further work could also consider
a self-tuneable algorithm that returns the most stable
partition(s). Another algorithm could then provide a
stable hierarchy by repeatedly subdividing the stable
partitions found at each hierarchical level.
REFERENCES
Ahn, Y.-Y., Bagrow, J. P., and Lehmann, S. (2010). Link
communities reveal multiscale complexity in net-
works. Nature, 466:761–764.
Arenas, A., Díaz-Guilera, A., and Pérez-Vicente, C. J.
(2006). Synchronization reveals topological scales in
complex networks. Physical Review Letters.
Arenas, A., Fernandez, A., and Gomez, S. (2008). Anal-
ysis of the structure of complex networks at different
resolution levels. New Journal of Physics, 10:053039.
Blondel, V. D., Guillaume, J.-L., Lambiotte, R., and Lefeb-
vre, E. (2008). Fast unfolding of communities in large
networks. Journal of Statistical Mechanics: Theory
and Experiment, 10:1742–5468.
Clauset, A., Newman, M. E. J., and Moore, C. (2004). Find-
ing community structure in very large networks. Phys-
ical Review E, 70:066111.
Danon, L., Díaz-Guilera, A., and Arenas, A. (2006). The ef-
fect of size heterogeneity on community identification
in complex networks. Journal of Statistical Mechan-
ics: Theory and Experiment, 2006(11):P11010.
Delvenne, J.-C., Yaliraki, S. N., and Barahona, M. (2010).
Stability of graph communities across time scales.
PNAS, 107(29):12755–12760.
Fortunato, S. (2010). Community detection in graphs.
Physics Reports, 486(3-5):75–174.
Fortunato, S. and Barthélemy, M. (2007). Resolution limit
in community detection. PNAS, 104(1):36–41.
Fred, A. L. N. and Jain, A. K. (2003). Robust data cluster-
ing. IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, 2:128–133.
Girvan, M. and Newman, M. E. J. (2002). Community
structure in social and biological networks. PNAS,
99:7821–7826.
Good, B. H., de Montjoye, Y.-A., and Clauset, A. (2010).
Performance of modularity maximization in practical
contexts. Physical Review E, 81(4):046106.
Guimerà, R., Sales-Pardo, M., and Amaral, L. A. N. (2004).
Modularity from fluctuations in random graphs and
complex networks. Physical Review E, 70(2):025101.
Khadivi, A., Ajdari Rad, A., and Hasler, M. (2011). Net-
work community-detection enhancement by proper
weighting. Physical Review E, 83(4):046104.
Knuth, D. E. (1993). The Stanford GraphBase. A Platform
for Combinatorial Computing. Addison-Wesley.
Lambiotte, R. (2010). Multi-scale Modularity in Complex
Networks. ArXiv e-prints.
Lambiotte, R., Delvenne, J.-C., and Barahona, M. (2008).
Laplacian Dynamics and Multiscale Modular Struc-
ture in Networks. ArXiv e-prints.
Lancichinetti, A. and Fortunato, S. (2009). Community de-
tection algorithms: A comparative analysis. Physical
Review E, 80(5):056117.
Lusseau, D. and Newman, M. E. J. (2004). Identifying the
role that individual animals play in their social net-
work. Proceedings of the Royal Society London B,
271:S477–S481.
Lusseau, D., Schneider, K., Boisseau, O. J., Haase, P.,
Slooten, E., and Dawson, S. M. (2003). The bot-
tlenose dolphin community of Doubtful Sound features
a large proportion of long-lasting associations. Can ge-
ographic isolation explain this unique trait? Behav-
ioral Ecology and Sociobiology, 54(4):396–405.
Medus, A., Acuna, G., and Dorso, C. (2005). Detection
of community structures in networks via global op-
timization. Physica A: Statistical Mechanics and its
Applications, 358(2-4):593–604.
Newman, M. E. J. (2004). Fast algorithm for detecting
community structure in networks. Physical Review E,
69(6):066133.
Newman, M. E. J. (2006). Finding community structure in
networks using the eigenvectors of matrices. Physical
Review E, 74:036104.
Newman, M. E. J. and Girvan, M. (2004). Finding and eval-
uating community structure in networks. Physical Re-
view E, 69:026113.
Ovelgönne, M., Geyer-Schulz, A., and Stein, M. (2010).
Randomized greedy modularity optimization for
group detection in huge social networks. In Pro-
ceedings of the 4th SNA-KDD Workshop ’10 (SNA-
KDD’10), pages 1–9, Washington, DC, USA.
Palla, G., Derenyi, I., Farkas, I., and Vicsek, T. (2005).
Uncovering the overlapping community structure of
complex networks in nature and society. Nature,
435(7043):814–818.
Pereira-Leal, J. B., Enright, A. J., and Ouzounis, C. A.
(2004). Detection of functional modules from protein
interaction networks. Proteins, 54:49–57.
Ravasz, E. and Barabási, A. L. (2003). Hierarchical or-
ganization in complex networks. Physical Review E,
67(2):026112.
Reichardt, J. and Bornholdt, S. (2006). Statistical mechan-
ics of community detection. Phys Rev E, 74(1 Pt
2):016110.
Schuetz, P. and Caflisch, A. (2008). Efficient modularity
optimization by multistep greedy algorithm and vertex
mover refinement. Physical Review E, 77(4):046112.
Simon, H. A. (1962). The architecture of complexity. In
Proceedings of the American Philosophical Society,
pages 467–482.
Zachary, W. W. (1977). An information flow model for con-
flict and fission in small groups. Journal of Anthropo-
logical Research, 33(4):452–473.