Extracting Navigation Hierarchies from Networks

with Genetic Algorithms

Stefan John, Michael Granitzer and Denis Helic

Chair of Media Computer Science, University of Passau, Passau, Germany

Knowledge Technologies Institute, Graz University of Technology, Graz, Austria

Keywords:

Decentralized Search, Genetic Algorithm, Optimization, Heuristic, Network Hierarchy.

Abstract:

Information networks are nowadays an important source of knowledge, indispensable for our daily tasks.

Because of their size, however, efﬁcient navigation can be a challenge. Following the idea to use network

hierarchies as guidance in human as well as algorithmic search processes, this work focuses on the creation of

optimized navigation hierarchies. Based on an established model of human navigation, decentralized search,

we deﬁned two quality criteria for network hierarchies and propose a genetic algorithm applying them. We

conducted experiments on an information as well as a social network and analyzed the optimization effectivity

of our approach. Furthermore, we investigated the structure of the resulting navigation hierarchies. We found

our algorithm to be well-suited for the task of hierarchy optimization and found distinct structural properties

inﬂuencing the quality of navigational hierarchies.

1 INTRODUCTION

Although the Internet has become the primary plat-

form of information retrieval and social interaction

only over the last years, the analysis of human nav-

igation in networks in general and the Internet in par-

ticular has been part of scientiﬁc research for several

decades already. One of the most famous publications

initiating the research on social networks is Milgram’s

small-world experiment (Milgram, 1967), which led

to two assumptions. First, social networks are struc-

tured in a way that arbitrarily chosen people are con-

nected by short chains of relationships, a property re-

ferred to as the “small world phenomenon”. And sec-

ond, humans are able to ﬁnd these short paths among

the nearly endless number of possible routes.

Incited by these results, a lot of effort was put into

analyzing human navigation in networks. Describing

the navigational concept applied in Milgram’s exper-

iment, Kleinberg formed the notion of decentralized

search and investigated the human ability to ﬁnd short

paths to arbitrary people in social networks (Klein-

berg, 2000a; Kleinberg, 2000b; Kleinberg, 2002).

Along with analyzing “the structure of networks in

which this phenomenon emerges” (Kleinberg, 2002),

Kleinberg showed that using additional hierarchical

knowledge, decentralized search can efﬁciently be

used to navigate in such networks, suggesting the

existence of human hierarchical background knowl-

edge.

Based on this insight, Adamic and Adar imple-

mented a decentralized search algorithm using back-

ground knowledge for its greedy navigation (Adamic

and Adar, 2005). Feeding their implementation with a

naturally existent hierarchy (the hierarchical structure

of an organization), they were able to conﬁrm Klein-

berg’s theoretical ﬁndings. Additionally, they found

their algorithm to be less efﬁcient using background

knowledge which could not be represented as a hierarchy, further pointing out the importance of hierarchical structures for both algorithmic and human

navigation processes. Since then, various studies have

shown humans to be efﬁcient navigators in social as

well as information networks. Unfortunately, our lim-

itation of knowledge sometimes hinders us from tak-

ing shortcuts, slowing down search processes.

Problem. Previous research has shown that background knowledge, and hierarchical background knowledge in particular, is of primary importance for human navigation in networks. Traditionally, category hierarchies are often used in information systems to support user navigation. Although the goal of these hierarchies is to support navigation, they are typically optimized for semantics. Previous work has shown that semantically optimal categories often possess desirable navigational properties, at least in the domain of social tagging systems (Strohmaier et al., 2012). However, the question of whether and how these results can be generalized to other information networks remains an open one in our community.

John, S., Granitzer, M. and Helic, D.
Extracting Navigation Hierarchies from Networks with Genetic Algorithms.
In Proceedings of the 12th International Conference on Web Information Systems and Technologies (WEBIST 2016) - Volume 2, pages 63-74
ISBN: 978-989-758-186-1

Objective. Therefore, in this work, we take another

approach for obtaining navigationally sound hierar-

chies. We aim for automatic extraction of hierarchies

while optimizing their navigational properties. The

ﬁnal goal of our work is to enhance navigational sup-

port for users by embedding such navigationally opti-

mal hierarchies in a typical user interface of an infor-

mation system.

Approach. To that end, we develop an approach for

evaluating navigational properties of hierarchies. We

base our approach on optimization of shortest paths

between arbitrary pairs of nodes in extracted hierar-

chies. To tackle the complexity of the optimization

problem we resort to a genetic algorithm for which

we develop a novel crossover rule. Finally, we illus-

trate our approach by performing experiments on real

datasets.

Contributions. With our work we make the follow-

ing three contributions:

1. A methodology for assessing the quality of navi-

gational network hierarchies.

2. An algorithmic approach for the creation of opti-

mized navigation hierarchies.

3. Novel insights about the structure of navigation-

ally good hierarchies.

An implementation of the algorithmic approach is publicly available on GitHub (https://github.com/Leative/STOA, Feb. 2016).

2 RELATED WORK

In recent years, several studies have investigated the relationship of decentralized search and human navigation in networks. For our work, especially those dealing with information networks were of interest.

2.1 Decentralized Search

With regard to navigation in networks, this work focuses on the concept of decentralized search first formalized by Kleinberg (Kleinberg, 2000b). Accordingly, a search starting at node s of a network has to be performed without global knowledge of the network. In order to reach a target node t, only local information may be used. This comprises knowledge of the direct neighbors of the currently visited node as well as some intuition about their distance to the target. Limited to local decisions, decentralized search is an iterative process. In each step, the search progresses by greedily selecting the neighbor closest to the target.

In the hierarchical network model introduced by

Kleinberg (Kleinberg, 2002), a hierarchy of the stud-

ied network is used to measure distances. The length

of a shortest path connecting two nodes in that hierar-

chy determines their distance. Therefore, each greedy

step of a search depends on the nodes “proposed” by

the hierarchy. Assuming the existence of a hierarchi-

cally structured background knowledge, this model

can be applied to human search processes.
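The greedy step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in this work: the adjacency-dictionary representation, the function names `bfs_distances` and `greedy_search`, the `max_steps` guard and the toy data are all hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph {node: set_of_neighbors}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def greedy_search(network, hierarchy, start, target, max_steps=100):
    """Decentralized search: always move to the network neighbor that is
    closest to the target, with closeness measured in the hierarchy."""
    dist_to_target = bfs_distances(hierarchy, target)
    path = [start]
    while path[-1] != target and len(path) <= max_steps:
        neighbors = sorted(network[path[-1]])  # local information only
        path.append(min(neighbors, key=lambda v: dist_to_target[v]))
    return path if path[-1] == target else None


# Hypothetical toy data: a 4-cycle and one of its spanning trees (a path).
network = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
hierarchy = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(greedy_search(network, hierarchy, 0, 3))  # the search may use network edge 0-3
```

If the hierarchy is a spanning tree of the network, the tree neighbor towards the target is always a network neighbor as well, so the hierarchy distance decreases in every step and the sketch terminates.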

2.2 Model of Human Navigation

Gamifying the navigational concept of decentralized search, West and Leskovec studied human navigation in an online game (http://www.wikispeedia.net, Oct. 2015) based on Wikipedia (West and Leskovec, 2012). Their work confirms that the efficiency of human navigation is not limited to social networks, but that we are also able to find short paths in information networks. Furthermore, it shows that people tend to visit network nodes of high degree, called hubs, in early stages of a search before closing in on the target.

Going one step further, Trattner et al. tried to model the observed human behavior using a decentralized search algorithm (Trattner et al., 2012). Based on the ideas of Adamic and Adar (Adamic and Adar, 2005), their approach relies on hierarchical background knowledge to determine the distance of nodes. To that end, they used two different types of hierarchies. Their results show that, paired with their algorithm, hierarchies built directly from the link structure of the underlying network are better suited to simulate human navigation than hierarchies built from external knowledge (like an existing semantic categorization). However, using such hierarchies, their algorithm slightly outperforms the human counterpart.

Helic et al. continued with this study by extending the greedy mechanism of pure decentralized search with stochastic decision processes (Helic et al., 2013). They invented a new selection mechanism for decentralized search, called decaying ε-greedy action selection. Applying this selection mechanism instead of simple greedy navigation, hierarchical decentralized search proves to be well suited for modeling human navigation. Picking up on the results of West and Leskovec (West and Leskovec, 2012), they also investigated the development of human search paths and conclude that human navigation can roughly be divided into two stages. Starting with an orientation phase, referred to as exploration, people try to reach familiar nodes. In this stage, while trying to reach hubs, humans are more likely to perform random decisions on where to go next. In later steps, however, as people become more confident in their background knowledge, they switch to exploitation, efficiently following their intuitions more often.

The above insights suggest that the efficiency of

human navigation may be increased by, for example,

shortening the exploration phase. Intuitively, this can

be achieved by offering additional hierarchical infor-

mation extending the human background knowledge.

In current information networks, a semantic catego-

rization often serves as an additional layer of infor-

mation. However, category hierarchies do not con-

sider navigational properties of the underlying net-

work. Therefore, this work focuses on the extraction

of navigationally optimized hierarchies to further en-

hance the support of human navigation.

3 METHODOLOGY

In the following, a network is represented by an un-

weighted graph G(V, E) with V being the set of nodes

and E the set of edges of the network.

3.1 Hierarchy Creation

To our knowledge, the methods available to create

hierarchies for information networks can roughly be

split into two groups: those following a probabilistic approach (Clauset et al., 2008) and those trying to uncover hierarchical structures inherently existing in the underlying network (Heymann and Garcia-Molina, 2006; Muchnik et al., 2007). The latter usually rely on some notion of structural or semantic generality in order to relate the hierarchies' nodes to each other. Although these methods have been shown to be viable for structural analysis or the generation of semantic overviews, our work follows a different path.

Rather than retaining semantic or structural re-

lations, our approach seeks to optimize the quality

of the produced hierarchies with respect to deﬁnable

quality criteria. Relying on genetic algorithms, we

perform a global search on the set of possible hierar-

chies of a network. This approach allows for an easy

adaption to different use cases. By deﬁning custom

quality criteria, the created hierarchies can be opti-

mized for the task at hand.

Focusing on the enhancement of human naviga-

tion, the quality criteria proposed in this work are

tailored towards the navigational optimality of hierarchies. While most likely losing structural information of the original network, this optimization facilitates the application to human search scenarios as

well as algorithmic search processes based on hierar-

chical decentralized search.

3.2 Quality Criteria

In the context of decentralized search, efﬁciency can

be deﬁned considering the length of the search path

used to succeed in a search. At the same time, an agent's success rate determines her effectivity. Ideally, hierarchies created by our approach should increase both the efficiency and the effectivity of a navigating agent.

In order to gain maximum effectivity, our created

hierarchies have to meet a simple constraint. To fa-

cilitate success regardless of the search scenario, each

pair of network nodes needs to be connected by at

least a single path in the hierarchy. For this reason,

we decided to restrict ourselves to spanning trees, for

which this property naturally holds. Besides being

optimal regarding search effectivity, this restriction

also vastly reduces the search space for our optimiza-

tion algorithm. As a drawback, however, to create

spanning trees on directed networks we had to treat

them as undirected.

As per above deﬁnition, improving the efﬁciency

of a search requires an agent, either man or machine,

to ﬁnd a short path to the target. Intuitively, propos-

ing a node lying on a shortest path to the target in each

step of the search would yield unbeatable results. In

general, however, many edges of the underlying net-

work have to be sacriﬁced to create a meaningful hi-

erarchy. Thereby, a lot of shortest paths become un-

available. To counter this problem, our work proposes

two quality criteria for navigation hierarchies aimed

at preserving as many of the shortest paths of a net-

work as possible.

3.2.1 Global Stretch

For our ﬁrst criterion, we resort to a measure used in

a similar context by Helic et al. (Helic et al., 2013)

and refer to it as global stretch. Given a spanning tree $T$ of a network $N(V, E)$, for two nodes s and t we compare their distance $d_T(s,t)$ in the tree with their distance $d_N(s,t)$ in the network. In each case, the distance is measured by the length of a shortest path between s and t. We call this measure $\tau(s,t)$ the local stretch (eq. 1). By averaging over all combinations of network nodes, where n is the number of nodes in the network, we get the global stretch (eq. 2).

$$\tau(s,t) = \frac{d_T(s,t)}{d_N(s,t)}, \quad s \neq t \tag{1}$$

$$\tau(T) = \frac{1}{n(n-1)} \sum_{\substack{s,t \in V \\ s \neq t}} \frac{d_T(s,t)}{d_N(s,t)} \tag{2}$$

Although it does not make a general statement about

the quality of a solution, since we do not know the

possible optimum, global stretch enables us to com-

pare hierarchies.
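To make the criterion concrete, the following Python sketch computes the global stretch for a small example. It assumes that the local stretch is the ratio of tree distance to network distance, averaged over all ordered node pairs; the function names, the adjacency-dictionary representation and the toy data are hypothetical and not taken from the STOA implementation.

```python
from collections import deque
from itertools import permutations


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph {node: set_of_neighbors}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_stretch(network, tree):
    """Average ratio of tree distance to network distance over all ordered pairs."""
    nodes = list(network)
    d_net = {s: bfs_distances(network, s) for s in nodes}
    d_tree = {s: bfs_distances(tree, s) for s in nodes}
    total = sum(d_tree[s][t] / d_net[s][t] for s, t in permutations(nodes, 2))
    n = len(nodes)
    return total / (n * (n - 1))


# Hypothetical toy data: a 4-cycle and the spanning tree that drops edge 0-3.
network = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
tree = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(global_stretch(network, tree))
```

Only the pair (0, 3) is stretched in this example (tree distance 3 against network distance 1), so the average over the six unordered pairs is 8/6. A value of 1 would mean that the tree preserves every shortest-path length.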

3.2.2 Local Tree Fitness

Local tree fitness is a bit more complex. Instead of comparing complete paths, we consider the neighborhood and the possible navigation steps proposed by the hierarchy at each node. With N(s) being the set of neighbors of a node s in our network, we first have to define two subsets of neighbors: those lying on a shortest network path from s to a target node t (eq. 3) and those lying on a shortest path in the evaluated tree (eq. 4).

$$M_N(s,t) = \operatorname*{argmin}_{n \in N(s)} d_N(n,t) \tag{3}$$

$$M_T(s,t) = \operatorname*{argmin}_{n \in N(s)} d_T(n,t) \tag{4}$$

Considering a human navigator stuck at node s, with these sets we can calculate the probability $f_T(s,t)$ (eq. 5) that the spanning tree proposes a node as the next navigation step which lies on a shortest path to target t in the original network. As we did for stretch, we average over all source-target pairs to obtain the overall local tree fitness $F(T)$, defined in eq. 6.

$$f_T(s,t) = \frac{|M_N(s,t) \cap M_T(s,t)|}{|M_T(s,t)|} \tag{5}$$

$$F(T) = \frac{1}{n(n-1)} \sum_{\substack{s,t \in V \\ s \neq t}} f_T(s,t) \tag{6}$$
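A possible reading of this criterion is sketched below in Python. The sketch assumes that the probability for a pair (s, t) is the share of tree-proposed neighbors of s that also lie on a shortest network path to t, and that the tree proposes its best neighbors with equal probability. All names and the toy data are hypothetical.

```python
from collections import deque
from itertools import permutations


def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph {node: set_of_neighbors}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def local_tree_fitness(network, tree):
    """Average probability that a tree-proposed neighbor is network-optimal."""
    nodes = list(network)
    d_net = {t: bfs_distances(network, t) for t in nodes}
    d_tree = {t: bfs_distances(tree, t) for t in nodes}
    total = 0.0
    for s, t in permutations(nodes, 2):
        neighbors = network[s]  # candidates are the network neighbors of s
        best_net = min(d_net[t][v] for v in neighbors)
        best_tree = min(d_tree[t][v] for v in neighbors)
        m_net = {v for v in neighbors if d_net[t][v] == best_net}
        m_tree = {v for v in neighbors if d_tree[t][v] == best_tree}
        total += len(m_net & m_tree) / len(m_tree)
    n = len(nodes)
    return total / (n * (n - 1))


# Hypothetical toy data: a 5-cycle and the spanning tree that drops edge 4-0.
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
path5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(local_tree_fitness(cycle5, path5))
```

In the 5-cycle, a search from node 1 to node 4 is sent towards node 2 by the tree although node 0 lies on the shortest network path, so the fitness stays below 1. In contrast to stretch, higher values are better here.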

4 OPTIMIZATION ALGORITHM

Both of the presented quality criteria consider shortest paths, in general requiring computationally intensive all-pairs shortest path calculations to be conducted on the studied network as well as on evaluated hierarchies. However, since we restrict the hierarchies to spanning trees (thereby treating all networks as undirected), simple breadth-first search can be used for this purpose. Although this restriction also decreases the search space of our global search, the upper bound for the number of possible spanning trees of a network with $n$ nodes is still $n^{n-2}$, according to Cayley's formula. Therefore, optimizing the quality of navigation hierarchies still constitutes a hard optimization problem.

Figure 1: Overview of the spanning tree optimization algorithm (STOA). Components shown: starting population (created by DFS), encoding (parent relations), fitness (quality criteria), selection (roulette wheel + elite), crossover (exchange random path), mutation (alter parent relation), create offspring, substitute population.

Because of their versatility and modularity we de-

cided to tackle this problem by applying genetic al-

gorithms, a heuristic optimization approach. Genetic

algorithms mimic natural evolution processes. In par-

ticular, they follow Darwin’s theory of natural selec-

tion to optimize a population of solutions. Genera-

tion by generation, good individuals, hierarchies in

our case, are chosen to “breed” offspring, while bad

individuals are condemned to extinction.

Genetic algorithms are highly extensible and offer

several ways to allow this natural selection to hap-

pen. However, our spanning tree optimization algo-

rithm (STOA) mainly relies on the core components

of simple genetic algorithms as described by Gold-

berg (Goldberg, 1989). Figure 1 shows an overview

of our algorithm. The optimization process starts by

providing a starting population of solutions (hierar-

chies). In each iteration (generation) of the algorithm,

the ﬁtness (quality) of all individuals of the current

population is evaluated. A selection mechanism cre-

ates an intermediate population from which random

matings are drawn. During the following reproduc-

tion step, the mated individuals are subject to two ge-

netic operators. Crossover, where parts of the genetic

information of the mates are interchanged and muta-

tion, randomly altering single “genes” of solutions. In

the following, the components of STOA are explained

in more detail.
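The generational loop outlined above can be illustrated by the following Python skeleton. It is a generic sketch, not the STOA implementation: the operators are passed in as functions, the toy operators work on bit lists instead of spanning trees, and details such as leaving the elite unmutated are assumptions made for the example.

```python
import random


def evolve(population, fitness, select, crossover, mutate, generations, rng):
    """Generic generational loop; all operators are supplied by the caller."""
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        elite, pool = select(population, scores, rng)  # intermediate population
        rng.shuffle(pool)                              # random matings
        children = []
        for a, b in zip(pool[0::2], pool[1::2]):
            children.extend(crossover(a, b, rng))
        if len(pool) % 2:                              # odd one out passes unchanged
            children.append(pool[-1])
        population = elite + [mutate(c, rng) for c in children]
    return max(population, key=fitness)


# Toy operators on bit lists, only to make the loop runnable.
def toy_select(population, scores, rng, k=2):
    ranked = sorted(zip(scores, population), key=lambda pair: pair[0], reverse=True)
    elite = [ind for _, ind in ranked[:k]]
    weights = [s + 1 for s in scores]                  # +1 avoids an all-zero wheel
    pool = rng.choices(population, weights=weights, k=len(population) - k)
    return elite, pool


def toy_crossover(a, b, rng):
    cut = rng.randrange(1, len(a))
    return [a[:cut] + b[cut:], b[:cut] + a[cut:]]


def toy_mutate(ind, rng, rate=0.05):
    return [bit ^ 1 if rng.random() < rate else bit for bit in ind]


rng = random.Random(42)
start = [[rng.randint(0, 1) for _ in range(16)] for _ in range(24)]
best = evolve(start, sum, toy_select, toy_crossover, toy_mutate, 50, rng)
print(sum(best), "of 16 bits set")
```

Because the elite is carried over unchanged in this sketch, the best fitness in the population can never decrease from one generation to the next.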

4.1 Starting Population

The creation of a starting population is the ﬁrst task

which has to be accomplished in order to run a genetic algorithm. As proved by Lovász (Lovász, 1993),

random spanning trees could be generated with equal

probability by conducting random walks on the un-

derlying network. However, in large networks, ran-


dom walks can take a considerably long time to visit

every node. For this reason, we chose a less ran-

dom, yet simple and fast approach. After selecting

a random node as root, a depth ﬁrst search is con-

ducted, naturally generating a spanning tree. This

approach is biased towards linear spanning trees, i.e.

trees which are deep but not broad. Although leav-

ing room for improvement, we accepted this approach

to be sufﬁcient for our study. Ultimately, the experi-

ments showed that our recombination and mutation

mechanisms can overcome the linearity in the ﬁrst

few generations.
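A depth-first construction of a starting tree could look like the following Python sketch. It assumes that neighbors are visited in random order and stores the result as a child-to-parent map. Both choices, as well as the function name, are assumptions for illustration and not necessarily those of the STOA implementation.

```python
import random


def dfs_spanning_tree(network, rng):
    """Return (root, parent) where parent maps every node to its tree parent."""
    root = rng.choice(sorted(network))
    parent = {root: None}
    stack = [root]
    while stack:
        u = stack[-1]
        unvisited = [v for v in sorted(network[u]) if v not in parent]
        if unvisited:
            v = rng.choice(unvisited)  # go deeper before looking at siblings
            parent[v] = u
            stack.append(v)
        else:
            stack.pop()                # dead end: backtrack
    return root, parent


# Hypothetical toy data: the complete network on five nodes.
complete5 = {i: {j for j in range(5) if j != i} for i in range(5)}
root, parent = dfs_spanning_tree(complete5, random.Random(7))
print(root, parent)
```

On a complete network the search never has to backtrack before all nodes are visited, so the resulting tree is a single path. This illustrates the bias towards deep, narrow trees mentioned above.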

4.2 Fitness

In genetic algorithms a ﬁtness function is used to de-

termine the ﬁtness of individuals. Since our goal of

optimization was to produce hierarchies of high qual-

ity with regard to the deﬁned quality criteria, these

quality criteria were also used as ﬁtness functions for

the algorithm.

4.3 Encoding of Spanning Trees

In our use case, the hierarchies we want to optimize

constitute the solutions (individuals) the genetic al-

gorithm is working with. Similar to nature, solu-

tions in a genetic algorithm need to be deﬁned by a

set of “genes” on which changes can be performed.

This is usually done by choosing a proper encoding

rather than working directly on the parameters rel-

evant to the ﬁtness function. We based the encod-

ing on the work of Carvalho et al. (Carvalho et al.,

2001). Considering a spanning tree to be rooted and

directed, we encode it by the set of parent relations of

its nodes. Each parent relation P(a, b), representing

an edge from a to b in the tree, can be considered a

gene, constituting the smallest alterable unit of a tree.

4.4 Roulette Wheel Selection with

Elitism

Being responsible for the selection of parents for the

reproduction, the selection mechanism is crucial for

every genetic algorithm. To this end, we rely on a

commonly used strategy, known as roulette wheel se-

lection. Enforcing survival of the ﬁttest, individu-

als are chosen for an intermediate population with a

probability proportional to their ﬁtness value. How-

ever, while being an intriguingly simple concept, this

purely probabilistic approach does not guarantee that
the best individuals are selected. For this reason,

we extended the simple roulette wheel selection by

the concept of elitism.

With elitism, the best k individuals of a genera-

tion (where k has to be speciﬁed as a parameter) are

directly copied to the offspring, guaranteeing their ge-

netic information to be available in the next iteration.

Additionally, these individuals are still eligible for se-

lection.
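The selection step can be sketched in Python as follows. The sketch assumes positive fitness values where higher is better (a criterion such as global stretch, where lower is better, would first have to be transformed), and it assumes that the intermediate population fills the slots not taken by the elite. The function name and both assumptions are illustrative only.

```python
import random


def select_with_elitism(population, fitness, k, rng):
    """Return (elite, intermediate): the k fittest individuals and a
    fitness-proportionally sampled intermediate population."""
    ranked = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    elite = [population[i] for i in ranked[:k]]
    # Elite members stay on the wheel, so they remain eligible for selection.
    intermediate = rng.choices(population, weights=fitness, k=len(population) - k)
    return elite, intermediate


population = ["a", "b", "c", "d"]
fitness = [1.0, 4.0, 2.0, 3.0]
elite, intermediate = select_with_elitism(population, fitness, 2, random.Random(0))
print(elite, intermediate)
```

`random.choices` draws with replacement, which corresponds to spinning the wheel once per free slot.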

4.5 Random Path Crossover

Working with spanning trees as individuals im-

poses two requirements on viable recombination al-

gorithms:

i) Spanning tree properties have to be preserved.

ii) Valuable partial structures of good trees should

not be destroyed.

Built upon their proposed encoding, Carvalho et al.

also presented a recombination algorithm, which sat-

isﬁes both of these requirements (Carvalho et al.,

2001). In the scenario of planning electrical dis-

tribution networks, which can also be formulated

as a spanning tree problem, their approach outper-

formed the combination of binary encoding and one-

point crossover (a standard approach for genetic algo-

rithms) as well as an approach of randomly generating

new spanning trees from the set of edges of the parent trees. It has also been shown to be effective in moving towards better solutions in both theoretical and

real world problems. Based on their work, we im-

plemented a crossover operator which we refer to as

random path crossover.

4.5.1 Mating

In random path crossover, pairs of trees are drawn

at random from the intermediate population created

by the selection mechanism. In case of an odd num-

ber of individuals in the population, the remainder of

this mating passes the crossover process unchanged.

Some approaches to genetic algorithms make use of

a crossover probability, i.e. there is a ﬁxed chance

that a couple stays unaffected by crossover. As will

be shown in the next section, our process of inter-

changing genetic information is not guaranteed to

succeed. Inherent to the interchange algorithm, there is already a slight chance that a pair of trees is not changed by random path crossover. Therefore, we

decided to always apply crossover to all of the chosen

pairs, as a side effect reducing the effort needed for

parameterizing STOA.

4.5.2 Recombination

To recombine genetic information between the mated individuals, random path crossover interchanges randomly chosen paths of the trees. Originally, our encoding represents a spanning tree as a rooted tree and therefore imposes a direction on its edges. To choose paths, we have to abstract from this representation for a moment and consider the parent relations to be undirected. A path can then be defined by the parent relations connecting two nodes.

Figure 2: Concept of crossover, with panels (a) valid crossover and (b) invalid crossover. In the first section of each subfigure, a path between two nodes (blue filled nodes) is chosen for both trees. Afterwards, crossover tries to inject the path of the first tree into the second one (gray dotted lines indicate the original parent relation which is about to be substituted). While in subfigure (a) the injection succeeds for both path elements (injected parent relations are depicted by green dashed lines), subfigure (b) shows an invalid try. Injecting the red dashed parent relation would create a cycle.

The algorithm randomly selects two nodes and

identiﬁes the paths connecting them in both of the par-

ent trees. To exchange genetic information, the parent

relations deﬁning the path of the ﬁrst tree are injected

into the second tree (ﬁgure 2a) and vice versa. To pre-

serve the spanning tree properties, and thereby fulﬁll

requirement (i), each injection is tested and rejected if

substituting the parent relation would result in a cycle

(figure 2b). The order in which the crossover tries to substitute the single relations of a path is important here. To avoid the rejection of valid paths, the parent

relations have to be substituted in ascending order of

their distance to the root, i.e. edges closer to the root

of a path’s tree have to be chosen ﬁrst.

If a path is rejected during a recombination at-

tempt, the process is canceled and a new pair of ran-

dom nodes is chosen. This procedure is repeated until

either a couple of interchangeable paths is found or

a threshold of attempts is reached. In the latter case,

both trees are passed unchanged to the next step of the

genetic algorithm. Depending on the threshold, this is

usually a rarely occurring event.

By exchanging paths, random path crossover only

considers coherent structural units for the recombina-

tion of trees. In doing so, in accordance with the sec-

ond requirement (ii), structures responsible for a high

ﬁtness are likely to either remain unchanged or to be

transferred to the partner tree.
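One direction of the path injection can be sketched in Python as follows. The sketch assumes that both trees share the same root and are stored as child-to-parent maps. It omits the retry loop and the attempt threshold, and all function names and example trees are hypothetical rather than taken from the STOA implementation.

```python
def path_to_root(parent, node):
    """Nodes from node up to the root, following the parent relations."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def path_relations(parent, u, v):
    """Parent relations (parent, child) on the tree path between u and v,
    sorted so that relations closer to the root come first."""
    up_u, up_v = path_to_root(parent, u), path_to_root(parent, v)
    on_v = set(up_v)
    lca = next(x for x in up_u if x in on_v)  # lowest common ancestor
    children = up_u[:up_u.index(lca)] + up_v[:up_v.index(lca)]
    relations = [(parent[c], c) for c in children]
    return sorted(relations, key=lambda rel: len(path_to_root(parent, rel[1])))


def inject_path(relations, target):
    """Inject the relations into a copy of target; return None if any
    substitution would close a cycle (the whole path is rejected)."""
    result = dict(target)
    for new_parent, child in relations:
        if child in path_to_root(result, new_parent):  # child is above new_parent
            return None
        result[child] = new_parent
    return result


# Hypothetical example trees over six nodes, both rooted at node 0.
tree_a = {0: None, 1: 0, 2: 1, 3: 2, 4: 0, 5: 4}
tree_b = {0: None, 1: 0, 2: 0, 3: 1, 4: 3, 5: 0}
relations = path_relations(tree_a, 3, 4)
print(relations)
print(inject_path(relations, tree_b))
```

Re-attaching a child below a node that is not one of its descendants moves a whole subtree and keeps every node connected to the root, which is why the cycle test suffices to preserve the tree property in this sketch.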

4.5.3 Random Root Switching

Performing the above recombination, a path can never be injected if it contains a parent relation where the root node of the mated tree is the child. This can be shown considering the following scenario. Let r be the root of tree $T_1$. Furthermore, let P(a, r) be the parent relation found in the path chosen for tree $T_2$. During recombination, a will be chosen to substitute the original parent of r in $T_1$. Since r is the root, it is an ancestor of all other nodes in $T_1$. Therefore, selecting any of the other nodes as parent of r results in a cycle, which in turn is followed by the rejection of the path.

If the root nodes chosen for the initial population

are never changed during an evolution, this behav-

ior biases the exploration of the search space. Cer-

tain solution variants will be harder to generate than

others, due to the limited ways of combining mated

trees. Hence, before applying recombination, a new

common root node is chosen randomly for the mated

trees.
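With the parent-relation encoding, switching the root only requires reversing the relations on the path from the new root to the old one. The following Python sketch illustrates this. The child-to-parent map and the function name `reroot` are assumptions made for the example.

```python
def reroot(parent, new_root):
    """Return a copy of the child-to-parent map re-rooted at new_root.
    Only relations between new_root and the old root are reversed."""
    result = dict(parent)
    previous, node = None, new_root
    while node is not None:
        following = parent[node]  # old parent, read before overwriting
        result[node] = previous
        previous, node = node, following
    return result


tree = {0: None, 1: 0, 2: 1, 3: 1}
print(reroot(tree, 2))
```

The undirected edges of the tree stay the same; only their direction changes. Applying `reroot` with the same randomly drawn node to both mated trees gives them a common root.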

4.6 Random Edge Mutation

To prevent an early stagnation of the search, we im-

plemented a mutation operator suitable for spanning

trees. The genes of a tree, i.e. its parent relations, are

independently mutated with a ﬁxed mutation proba-

bility, which can be set as parameter for the genetic

algorithm. Excluding the root of the tree, mutation of

a parent relation is done by randomly selecting a new

parent for the respective child node from the set of

its neighbors in the original network. Depending on

the actual structure of the spanning tree, not all of the

neighbors are valid choices. Mutation attempts which

would introduce cycles are ignored and, as long as

other neighbors are available, another parent is tried.

A relation remains unchanged if all neighbors were

tested without success. Similar to crossover, the order

in which parent relations are mutated inﬂuences the

outcome. To prevent any unwanted bias, the list of

parent relations is shufﬂed before applying mutation.
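The mutation operator described above can be sketched in Python as follows. The sketch assumes a child-to-parent map, excludes the current parent from the candidate list, and uses hypothetical function names. It is meant as an illustration rather than the STOA implementation.

```python
import random


def creates_cycle(parent, child, new_parent):
    """True if child lies on the path from new_parent up to the root."""
    node = new_parent
    while node is not None:
        if node == child:
            return True
        node = parent[node]
    return False


def mutate(parent, network, rate, rng):
    """Mutate each parent relation independently with probability rate."""
    result = dict(parent)
    children = [c for c, p in result.items() if p is not None]  # root excluded
    rng.shuffle(children)  # avoid a bias from a fixed mutation order
    for child in children:
        if rng.random() >= rate:
            continue
        candidates = [v for v in sorted(network[child]) if v != result[child]]
        rng.shuffle(candidates)
        for candidate in candidates:
            if not creates_cycle(result, child, candidate):
                result[child] = candidate
                break  # otherwise the relation stays unchanged
    return result


# Hypothetical toy data: a 4-cycle and a spanning tree rooted at node 0.
network = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
parent = {0: None, 1: 0, 2: 1, 3: 2}
print(mutate(parent, network, 0.5, random.Random(3)))
```

Since new parents are drawn from the network neighbors of the child and cycle-closing choices are skipped, the result is again a spanning tree of the network.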


5 EXPERIMENTAL SETUP

To analyze the effectivity of our approach and gain in-

sights on the structural properties of navigational hier-

archies, we conducted experiments with two medium

sized test networks. First, a condensed version of

Wikipedia, which was already part of the studies of

West et al. (West et al., 2009; West and Leskovec,

2012). Second, a sample of the Facebook network,

collected by McAuley and Leskovec in the course of

an analysis of social circles (McAuley and Leskovec,

2012). Both networks are offered by Stanford University as part of the Stanford Network Analysis Project (http://snap.stanford.edu, Oct. 2015). In the following we will refer to them just as Wikipedia and Facebook network, respectively.

5.1 Parametrization Experiments

In order to gather the best possible hierarchies, we conducted a series of parametrization experiments to determine the settings needed to optimize the performance of our genetic algorithm. Both of the test networks are known to be scale-free and to reveal a similar structure. Hence, we based the parametrization solely on the Wikipedia network. However, due to the runtime of STOA being quadratically dependent on the size of the input network, we used a network sample to facilitate a meaningful number of optimization runs within an acceptable period of time.

5.1.1 Sampling

To guarantee the representativeness of the sample, we focused on retaining two measures of the original network which we assumed to be highly influential for the outcome of the fitness functions: its degree distribution and its density.

Changing the distribution of node degrees influences the availability of shortest paths in the network. Biasing the distribution towards lower degrees, for example, weakens nodes of relatively high degree (known as hubs), which are important connectors and might lie on many shortest paths. This bias is likely to lead to a higher average length of shortest paths which in turn are the basis of the fitness evaluation.

The density for an undirected network N with n nodes and m edges, in accordance with Wasserman (Wasserman, 1994), is defined by eq. 7.

$$\mathrm{density}(N) = \frac{2m}{n(n-1)} \tag{7}$$

It reflects how many of the possible edges are actually present in the network. In relative terms, a lower density means fewer available edges. This in turn can

also negatively affect the average length of shortest

paths. An extreme example to demonstrate this effect

is that of a spanning tree of a complete network, i.e. a

network where all pairs of nodes are connected by an

edge. Containing only n −1 edges, almost all shortest

paths of the original network (having a length of one)

are lost and substituted by longer paths.
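The extreme example can be checked with a few lines of Python. The sketch assumes the usual density definition for undirected networks, 2m / (n(n − 1)); the function name is hypothetical.

```python
def density(n, m):
    """Share of the n(n-1)/2 possible undirected edges that are present."""
    return 2 * m / (n * (n - 1))


n = 5
complete_edges = n * (n - 1) // 2  # every pair of nodes is connected
tree_edges = n - 1                 # any spanning tree of the same network
print(density(n, complete_edges), density(n, tree_edges))
```

For five nodes the density drops from 1 for the complete network to 0.4 for any of its spanning trees, and this gap widens as n grows because the tree density equals 2/n.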

Furthermore, working with spanning trees, con-

nectedness of the underlying network is a hard con-

straint. Accordingly, the generated sample had to

be connected to be useable with STOA. Consider-

ing this restriction and our measures, we implemented

and compared three sampling concepts. A bare ran-

dom walk as discussed by Leskovec and Falout-

sos (Leskovec and Faloutsos, 2006). An induced ran-

dom walk, inspired by the TIES algorithm proposed

by Ahmed et al. (Ahmed et al., 2011), adding addi-

tional edges to the sample in an induction step. And a

random deletion method similar to the one described

by Krishnamurthy et al. (Krishnamurthy et al., 2005).

For each algorithm we considered ﬁve different

sample sizes and conducted ten sampling runs for

each combination. Without going into detail here,

we found the random deletion implementation to be

most effective in retaining the original degree distri-

bution. For sample sizes of at least 500 nodes (about

9% of the network size), it also showed reasonably

good results with regard to the density. Hence, for our

parametrization we used the best sample generated by

this method.

5.1.2 Parametrization

We first had to decide on the two fundamental parameters of genetic algorithms: the size of the population and the number of generations each optimization run should evolve. For optimal parallel processing, we let 24 individuals form our population. Conducting several test runs with different parameter settings, we found 50 generations to be enough for the genetic algorithm to converge, i.e. new offspring solutions were only marginally better than those of the previous generation. To provide meaningful and comparable results, we performed 40 independent optimization runs with the same starting population for all of the parametrization experiments.

The parametrization experiments comprised the evaluation of two parameters: the size of the elite being copied directly to the offspring and the probability by which mutation should be applied to the evolved individuals. Assuming that the prevention of degradation introduced by elitism might highly affect the usefulness of mutation, we chose to investigate and determine the optimal size for the elite first. Furthermore, after settling both parameters, we investigated how changes to the population size and generation count affect the outcome in this setting.

Figure 3: Global stretch results for different elite sizes (x-axis: size of elite, 0 to 10; y-axis: global stretch). Lower values are better.

The results of the optimization runs are presented for all tested parameter values as boxplots (Field et al., 2014) summarizing the fitness values of the best individuals of each run (e.g. figure 3). The horizontal line dividing each box denotes the median of the sample. The lower end of the box shows the first quartile (25% of the results lie below that value); likewise, the upper end of the box represents the third quartile (25% of the results lie above that value). Consequently, the box itself covers the middle 50%. The whiskers indicate the total range of values, with outliers exceeding a certain threshold plotted as separate dots.
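The quantities shown in such a boxplot can be computed as in the following Python sketch. The sample values are made up, the function name is hypothetical, and the outlier threshold of 1.5 times the interquartile range is an assumed (though common) convention, since the text does not state the threshold used.

```python
from statistics import median, quantiles


def boxplot_summary(values, whisker_factor=1.5):
    """Median, quartiles, whisker ends and outliers of a sample."""
    ordered = sorted(values)
    q1, _, q3 = quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker_factor * iqr, q3 + whisker_factor * iqr
    inside = [v for v in ordered if low <= v <= high]
    return {
        "median": median(ordered),
        "q1": q1,
        "q3": q3,
        "whiskers": (inside[0], inside[-1]),
        "outliers": [v for v in ordered if v < low or v > high],
    }


# Made-up fitness values of the best individuals of seven runs.
summary = boxplot_summary([2.4, 2.5, 2.5, 2.6, 2.7, 2.8, 3.9])
print(summary)
```

For this sample the box spans 2.5 to 2.75 around a median of 2.6, the whiskers end at 2.4 and 2.8, and 3.9 is reported as an outlier. Exact quartile values depend on the interpolation method chosen.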

Size of Elite. For the elite size we tried 0–10 individuals. With a population of 24, this corresponds to steps of approximately 4%, up to a value of about 40% of the population. We did not find recommendations on how to use elitism in the literature, but we assumed any values greater than 40% of the population size to be counterproductive. Figure 3 depicts the results for the first fitness function, the global stretch. Completely forgoing elitism clearly delivers the worst optimization results. This finding confirms the negative impact of potentially replacing the best individual when evolving a new generation. According to the median, preserving the best 5 individuals for the next generation slightly outperforms other elite sizes. For higher values a small degradation can be observed, supporting our decision to limit the test range. The situation for the local tree fitness was similar. However, the best results were achieved for a slightly larger elite of 6 individuals.
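The role of the elite can be illustrated with a generic generational genetic algorithm on a toy bit-string problem. This is a sketch under stated assumptions, not STOA: tournament selection, one-point crossover, single bit-flip mutation, and the fitness function (number of ones, lower being better) are placeholders chosen for the example. Only the population size of 24, the 50 generations, the elite size of 5, and a mutation probability of 10% are taken from the parametrization experiments.

```python
import random


def evolve(population, fitness, elite_size, mutation_probability,
           generations, rng):
    """Generational GA minimizing `fitness`; returns population and history."""
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness)
        history.append(fitness(ranked[0]))
        # Elitism: the best individuals are copied unchanged.
        next_generation = [individual[:] for individual in ranked[:elite_size]]
        while len(next_generation) < len(population):
            # Binary tournament selection of two parents.
            parent_a = min(rng.sample(population, 2), key=fitness)
            parent_b = min(rng.sample(population, 2), key=fitness)
            # One-point crossover.
            cut = rng.randrange(1, len(parent_a))
            child = parent_a[:cut] + parent_b[cut:]
            # Mutation is applied to an offspring with the given probability.
            if rng.random() < mutation_probability:
                position = rng.randrange(len(child))
                child[position] = 1 - child[position]
            next_generation.append(child)
        population = next_generation
    history.append(min(fitness(individual) for individual in population))
    return population, history


rng = random.Random(42)
start = [[rng.randint(0, 1) for _ in range(30)] for _ in range(24)]
final_population, history = evolve(
    start, fitness=sum, elite_size=5, mutation_probability=0.10,
    generations=50, rng=rng)
print(history[0], history[-1])
```

With an elite of at least one individual, the best fitness recorded in `history` can never get worse from one generation to the next. With `elite_size=0` this guarantee is lost, which matches the degradation discussed above.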

Mutation Probability. Applying the determined elite sizes, we started with a mutation probability of 0%, increasing it stepwise by 5%. As low probabilities (smaller than 20%) have been shown to perform well (Haupt, 2000), we limited our experiments for that parameter to a maximum value of 25%. The results, shown in figure 4, are similar to those obtained for elitism. Without mutation, the expected loss in the variety of genetic information is reflected by the creation of less optimal hierarchies. As soon as mutation is applied, the chosen probability plays only a secondary role. For both fitness functions, a probability of 10% produces the best individuals with regard to their median fitness values.

Figure 4: Global stretch results for different mutation probabilities (x-axis: mutation probability in %, 0 to 25; y-axis: global stretch). Lower values are better.

Population Size vs. Generations. Although the genetic algorithm starts to converge with the above parameters, we wanted to go beyond that point for the experiments on the original networks. To enable the algorithm to produce even better optimized hierarchies, we had two options: either to increase the size of the population, allowing the algorithm to choose from a larger gene pool and to create more offspring in each generation, or to increase the number of generations, spending more time on refining already good solutions. To decide between those options, we performed another experiment. First, we doubled the size of our population using the parameters determined before. However, to check whether the size of the elite needs to be kept at a certain percentage, we included another test set in which the elite size was doubled as well. Finally, we tested the algorithm’s performance for 100 generations instead of 50 without changing the other parameters. Again, we observed similar results for both fitness functions (those for global stretch are shown in figure 5). While all changes further increased the fitness of the best individuals, doubling the number of generations had the most notable effect. Compared to the values gathered in the mutation probability experiments, the global stretch could be improved by approximately 10%. Aside from that, the size of the elite proves to be a parameter that should be changed proportionally to the population size.

Figure 5: Global stretch results for different improvements of the optimization process (x-axis: doubled value, i.e. population size with elite 5, population size with elite 10, generation count; y-axis: global stretch). Lower values are better.

Table 1: Fitness values (global stretch) and standard deviations of the best individuals of the starting populations and the last generations.

          Wikipedia                   Facebook
          Starting pop.   Last gen.   Starting pop.   Last gen.
Best      497.47          8.54        192.06          6.16
Median    500.18          10.68       198.65          7.04
SD        1.21            1.05        11.68           0.48

5.2 Final Setup

Based on the findings of the parametrization experiments, we used mostly the same settings for both fitness functions when generating hierarchies for the original networks. For a population of 24 individuals we relied on a mutation probability of 10% and used 100 generations as a hard termination condition. The size of the elite was set to 5 individuals for the global stretch and 6 individuals for the local tree fitness, respectively. With respect to the time needed for evolving one generation on the original networks, we additionally used a stagnation soft limit, terminating the optimization when no improvement could be observed for 20 generations.
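The interplay of the two termination conditions can be sketched as follows. The function name and the input format (one best fitness value per generation, lower being better as for the global stretch) are assumptions for illustration. "No improvement" is interpreted here as 20 consecutive generations without a new best value.

```python
def generations_until_stop(best_per_generation, max_generations=100,
                           stagnation_limit=20):
    """Return (generation at which the run stops, best fitness seen)."""
    best = float("inf")
    stagnant = 0
    generation = 0
    for generation, fitness in enumerate(best_per_generation, start=1):
        if fitness < best:
            best, stagnant = fitness, 0
        else:
            stagnant += 1
        # Hard limit on generations, soft limit on stagnation.
        if generation >= max_generations or stagnant >= stagnation_limit:
            break
    return generation, best


# Made-up run: ten improving generations, then no further progress.
stalling_run = [10 - i for i in range(10)] + [1] * 200
# Made-up run that keeps improving in every generation.
improving_run = [1000 - i for i in range(500)]

print(generations_until_stop(stalling_run))   # (30, 1)
print(generations_until_stop(improving_run))  # (100, 901)
```

The first run is ended by the stagnation limit 20 generations after its last improvement, the second one by the hard limit of 100 generations.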

6 RESULTS

Applying global stretch as the fitness function, we performed 50 experimental runs on both original networks. The evaluation of the local tree fitness, however, proved to be about 20 times slower than evaluating a hierarchy’s global stretch. For this reason, we had to rigorously limit the extent of the experiments using this quality criterion. In fact, we only conducted 5 runs on the Wikipedia network. Consequently, the following analysis of the quality and structure of optimized hierarchies is based on the much larger data set collected for the global stretch. To compensate for this shortcoming, we analyzed the progression of the evolution processes for both fitness functions and compared the generated hierarchies.

6.1 Optimality of Hierarchies

In the course of the parametrization, increasing the number of generations proved to be an effective means of extending the optimization process. However, in the final experiments, 74% of the evolutionary processes ended before the limit of 100 generations was reached, i.e. they were stopped by the stagnation limit. On average, the best hierarchy was already found after 51 generations. Without further experiments, we can only speculate whether an increase of the population size might yield better results for these networks.

Despite this discrepancy with the parametrization experiments, the optimization results are promising (table 1). Comparing the best hierarchy of all starting populations to the overall best hierarchy obtained by our experiments, a decrease in the global stretch of over 96% can be observed for both networks. Although we do not know the possible optimum values, this outcome at least indicates a good optimization performance. Since the Facebook network is about 10% smaller than the Wikipedia network, the better quality of the Facebook hierarchies is not a surprise. Interestingly, however, the relative improvement is nearly the same for both networks. This suggests that our algorithm performs well independently of the size and structure of the network. The best Wikipedia hierarchy has a fitness of 8.54, meaning its shortest paths are on average about 8.5 times longer than those of the original network. For the smaller Facebook network, a value of 6.16 was reached. Taking into account that the hierarchies contain less than 5% of the network edges, these results can still be considered good. Additionally, the standard deviations for the best hierarchies of the last generation are pleasantly low. Although we have seen some outliers, our algorithm delivers hierarchies of rather constant quality.
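The relative decrease quoted above follows directly from the "Best" row of table 1, as this short Python calculation shows (the variable names are chosen for illustration only).

```python
# (best of all starting populations, overall best hierarchy), from table 1
table_1 = {"Wikipedia": (497.47, 8.54), "Facebook": (192.06, 6.16)}

reductions = {
    network: (start - final) / start
    for network, (start, final) in table_1.items()
}
for network, reduction in reductions.items():
    print(f"{network}: {reduction:.1%}")
# prints "Wikipedia: 98.3%" and "Facebook: 96.8%"
```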

6.2 Hierarchy Structure

To get an idea of what valuable navigation hierarchies look like, we examined several measures revealing information about the structure of a network. We considered two centrality measures, which can be used as indicators for the importance of nodes. For a node v of a graph G = (V, E) containing n > 1 nodes, we define the degree centrality to be the fraction of nodes v is directly connected to (eq. 8).

$$c_D(v) = \frac{\deg(v)}{n - 1} \tag{8}$$

Like the node degree itself, this measure reveals network hubs (high degree nodes), which have been shown to be important for human navigation in information networks. In the following, such nodes will be called degree hubs. Furthermore, we investigated betweenness centrality, a measure indicating the importance of a node for the shortest paths of a network (Brandes, 2008). More precisely, in a directed network the betweenness centrality of a node v sums up the fractions of all shortest paths in the graph that pass through v (eq. 9).

$$c_B(v) = \sum_{\substack{s,t \in V \\ s \neq v \neq t}} \frac{sp(s, t \mid v)}{sp(s, t)} \tag{9}$$

Figure 6: Average degree distribution functions of the best hierarchies for the Wikipedia and Facebook networks (x-axis: degree; y-axis: P(X ≤ x)). Confidence intervals are too small to be shown.

Here, sp(s, t) is the number of shortest paths between the nodes s and t, and sp(s, t | v) denotes the number of those paths passing through v. In an undirected network the score is typically halved to account for all pairs of nodes being considered twice. We additionally divided that value by the total number of node combinations relevant for node v, i.e. the (n − 1)(n − 2)/2 pairs of nodes other than v, to get a more expressive and comparable score (eq. 10). We refer to this measure by the term average betweenness centrality. Analogous to degree hubs, in the following we refer to nodes with a high average betweenness centrality as navigational hubs.

$$\mathrm{avg}(c_B(v)) = \frac{2}{(n - 1)(n - 2)} \cdot c_B(v) \tag{10}$$
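For hierarchies, which are trees, both centrality measures are easy to compute by hand, because every pair of nodes is connected by exactly one path and each fraction sp(s, t | v)/sp(s, t) is therefore either 0 or 1. The following Python sketch relies on this property and is only valid for connected trees with more than two nodes. The function names and the five-node chain are assumptions for illustration, and the betweenness count is averaged over the (n − 1)(n − 2)/2 unordered pairs of nodes other than v.

```python
from collections import deque
from itertools import combinations


def tree_path(adjacency, source, target):
    """Unique path between two nodes of a tree, found by BFS."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor in adjacency[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    path = [target]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]


def degree_centrality(adjacency):
    """Fraction of the other n - 1 nodes each node is connected to."""
    n = len(adjacency)
    return {v: len(adjacency[v]) / (n - 1) for v in adjacency}


def average_betweenness(adjacency):
    """Share of node pairs (excluding v) whose tree path passes through v."""
    n = len(adjacency)
    counts = {v: 0 for v in adjacency}
    for s, t in combinations(adjacency, 2):
        for inner in tree_path(adjacency, s, t)[1:-1]:
            counts[inner] += 1
    pairs = (n - 1) * (n - 2) / 2
    return {v: counts[v] / pairs for v in adjacency}


# Hypothetical hierarchy: the linear chain 0-1-2-3-4.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(degree_centrality(chain))
print(average_betweenness(chain))
```

In the chain, the middle node lies on 4 of the 6 relevant paths (about 0.67), its neighbors on 3 of 6 (0.5), and the two end nodes on none.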

Along with the centrality measures, we analyzed

the hierarchies’ degree distribution. Additionally, we

compared their diameter, i.e. the length of the longest

shortest path, to that of the networks. Naturally, we

also considered the hierarchies’ ﬁtness values.

6.2.1 Good Hierarchies

First, we analyzed the best hierarchies of each run of our algorithm. We noticed that there is nearly no difference in the degree distribution between the hierarchies of one network. Furthermore, averaged over all of the hierarchies of one network, the distributions of both networks prove to be similar (figure 6). While the major fraction of nodes are leaves, i.e. nodes connected by only a single edge, the optimized hierarchies also contain a small number of degree hubs. Since these have been shown to be important first steps in human search paths (West and Leskovec, 2012), we investigated them further. To this end, for a centrality measure c, a hierarchy H and a node m having the highest centrality value in H, we defined hubs of H as shown in equation 11.

$$v \text{ is a hub of } H \iff c(v) \geq 0.75 \cdot c(m) \tag{11}$$

Although this approach might be problematic in networks without distinctive hubs, for our use case it proved to be a viable definition. Considering both centrality measures, we observed an interesting difference between the hierarchies of both networks. In all of the Wikipedia hierarchies there exists exactly 1 degree hub. At the same time, the number of navigational hubs varies between 1 and 10, with a median of 2 nodes. The Facebook hierarchies show a contrary structure. While 75% of them have exactly 1 navigational hub, on average 4 hubs of high degree can be found. We found that in general the degree centrality of navigational hubs strongly correlates with their average betweenness centrality (r > 0.61). For degree hubs, however, this correlation is far less distinct (r < 0.36).
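The hub definition of equation 11 translates into a few lines of Python. The function name `hubs` and the centrality values are made up for illustration. Only the threshold of 0.75 times the maximum centrality comes from the definition.

```python
def hubs(centrality, threshold=0.75):
    """Nodes whose centrality reaches the given share of the maximum."""
    top = max(centrality.values())
    return {v for v, value in centrality.items() if value >= threshold * top}


# Made-up centrality values of a small hierarchy.
example_centrality = {"a": 0.90, "b": 0.70, "c": 0.30, "d": 0.68}
print(sorted(hubs(example_centrality)))  # ['a', 'b', 'd']
```

The same function can be applied to degree centrality to obtain degree hubs and to average betweenness centrality to obtain navigational hubs.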

Analyzing the compactness of the hierarchies in terms of their diameter, we found nearly identical but high median values (≈ 75) for both networks. Compared to the diameters of the underlying networks (Wikipedia: 5, Facebook: 8), this constitutes a drastic increase. To estimate how the diameter is reflected in the structure of the hierarchies, we analyzed the distribution of nodes around the most central navigational hub. We found a notable number of nodes situated far away from the central nodes. With this in mind, the high diameter is unlikely to be caused by single linear branches leading away from the core of the hierarchy. Instead, an overall widespread structure can be assumed.

6.2.2 The Good, the Bad and the Ugly

Recalling table 1, the linear structure of the starting population is obviously a degrading factor for the quality of hierarchies. Apart from this extreme, knowledge about the subtle differences distinguishing the best solutions from the second best might be of use, either for manually improving hierarchies or for estimating their optimality. To this end, we also examined the worst hierarchies of the last generation of our experimental runs.

Figure 7: Three Wikipedia hierarchies of different quality: (a) the good, (b) the bad, (c) the ugly (layouts are based on a spring model). Subfigure (a) shows the best hierarchy created, with a global stretch of 8.54. In subfigure (b) the global stretch is 11.97. A subtree splitting off from the navigational core can be observed. The last hierarchy’s global stretch is 22.35. A linear structure is clearly noticeable.

Comparing the distribution of node degrees first, no differences between good and bad hierarchies can be detected. A major difference, however, can be observed for the number of navigational hubs. While the navigational core of good hierarchies (figure 7a) is composed of very few dominant navigational hubs as well as some nodes of lower betweenness centrality, the core of bad hierarchies is distributed amongst a much larger set of nodes, skewed towards lower centrality. As discussed in the last section, this distribution and the lack of a specific central node are likely to introduce a separation of parts of the hierarchy (figures 7b and 7c). Consequently, for bad hierarchies a much higher spread, reflected by large average diameters, can be recognized. These observations suggest that the distribution of navigational hubs can be used as an indicator for the quality of navigational hierarchies. Condensed navigational cores of only a few nodes with very high average betweenness centrality are beneficial. On the other hand, numerous nodes of mediocre centrality introduce linearity, which has been shown to be detrimental.

6.3 Comparing the Fitness Functions

As the basis of comparison of our fitness functions, we chose one of the experimental runs in which the local tree fitness had been applied. In particular, we considered the run which created the best of all hierarchies with regard to local tree fitness. Throughout the evolutionary process of this run 3588 hierarchies were created. To get a comparable data set, we additionally calculated the global stretch for all of them. Analyzing this data, we first found the values of both fitness functions to correlate highly and significantly ($r = -0.52$, $p < 2.2 \cdot 10^{-16}$). Care has to be taken when interpreting this correlation: both fitness functions indicate improvement in opposite directions. Therefore, the negative correlation of the fitness values corresponds to a positive correlation of the quality of the generated hierarchies. This result confirms that our quality criteria are likely to be interchangeable and, in the long run, should lead to the same hierarchies.
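The sign argument can be made concrete with a small Python sketch. The numbers below are made up and merely mimic the situation described: a stretch-like measure where lower is better and a second measure where higher is better. The helper `pearson` implements the standard Pearson correlation coefficient and is not taken from the authors' analysis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


# Made-up values for five hierarchies of increasing quality.
stretch_like = [20.0, 15.0, 12.0, 10.0, 9.0]      # lower is better
higher_is_better = [0.20, 0.35, 0.40, 0.55, 0.60]

r_raw = pearson(stretch_like, higher_is_better)
# Flipping one measure so that both point in the same direction
# flips the sign of the correlation coefficient.
r_aligned = pearson(stretch_like, [-value for value in higher_is_better])
print(r_raw < 0, r_aligned > 0)
```

A negative coefficient between the raw fitness values thus expresses agreement between the two quality criteria, not disagreement.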

7 CONCLUSION

With the size and importance of information networks growing constantly, the need to retain efficient navigation arises. Inspired by the human example, our work focuses on network hierarchies used as a means for efficient navigation. We offer a method for the creation of optimized navigation hierarchies based on genetic algorithms. Although ultimate optimality is unlikely to be reached by a heuristic approach, the hierarchies created for two sample networks are promising. Good hierarchies share structural properties. They possess a large number of leaf nodes, accompanied by a few degree hubs. Additionally, their navigational core consists of a few extraordinary navigational hubs. From there, a rather homogeneous network of nodes emerges, leading to hierarchies more widespread than their underlying networks.

By integration in user interfaces or back-ends

these optimized navigation hierarchies might be valu-

able for guiding human navigation. Especially in

early stages of a search, they can offer structural in-

formation about the network, which can be used to

overcome human knowledge gaps in foreign domains.

Focusing on the preservation of shortest paths, they

might also be of interest for improving the perfor-

mance of decentralized search algorithms.

Apart from applying optimized navigational hierarchies in practice, there are other interesting aspects which should be considered in future work. Due to the modularity of genetic algorithms, STOA is changeable in various ways. As a side effect of an early bug in the implementation of random edge mutation, for example, we conducted a few experiments with an unintentionally and randomly decaying mutation probability. Surprisingly, the optimization performed slightly better in these cases, suggesting the purposeful application of a decaying mutation probability. Additionally, we did not rigorously analyze the performance of our optimization approach. Applying our algorithm to networks with known optima in future analysis might clarify how much room for improvement is left.

REFERENCES

Adamic, L. and Adar, E. (2005). How to search a social

network. Social Networks, 27(3):187–203.

Ahmed, N., Neville, J., and Kompella, R. R. (2011). Net-

work sampling via edge-based node selection with

graph induction. Technical report, Purdue University.

Brandes, U. (2008). On variants of shortest-path between-

ness centrality and their generic computation. Social

Networks, 30(2):136–145.

Carvalho, P. M. S., Ferreira, L. A. F. M., and Barruncho, L.

M. F. (2001). On spanning-tree recombination in evo-

lutionary large-scale network problems-application to

electrical distribution planning. Evolutionary Compu-

tation, IEEE Transactions on, 5(6):623–630.

Clauset, A., Moore, C., and Newman, M. E. (2008). Hier-

archical structure and the prediction of missing links

in networks. Nature, 453(7191):98–101.

Field, A., Miles, J., and Field, Z. (2014). Discovering statis-

tics using R. SAGE Publications.

Goldberg, D. E. (1989). Genetic Algorithms in Search, Op-

timization, and Machine Learning. Addison-Wesley

Professional.

Haupt, R. L. (2000). Optimum population size and muta-

tion rate for a simple real genetic algorithm that op-

timizes array factors. In Antennas and Propagation

Society International Symposium, 2000. IEEE, vol-

ume 2, pages 1034–1037. IEEE.

Helic, D., Strohmaier, M., Granitzer, M., and Scherer, R.

(2013). Models of human navigation in information

networks based on decentralized search. In Proceed-

ings of the 24th ACM Conference on Hypertext and

Social Media, pages 89–98. ACM.

Heymann, P. and Garcia-Molina, H. (2006). Collaborative

creation of communal hierarchical taxonomies in so-

cial tagging systems. Technical report, Stanford Info-

Lab.

Kleinberg, J. (2000a). Navigation in a small world. Nature,

406(6798):845–845.

Kleinberg, J. (2000b). The small-world phenomenon: An

algorithmic perspective. In Proceedings of the thirty-

second annual ACM symposium on Theory of comput-

ing, pages 163–170. ACM.

Kleinberg, J. (2002). Small-world phenomena and the dy-

namics of information. Advances in neural informa-

tion processing systems, 1:431–438.

Krishnamurthy, V., Faloutsos, M., Chrobak, M., Lao, L.,

Cui, J.-H., and Percus, A. G. (2005). Reducing large

internet topologies for faster simulations. In NET-

WORKING 2005. Networking Technologies, Services,

and Protocols; Performance of Computer and Com-

munication Networks; Mobile and Wireless Commu-

nications Systems, pages 328–341. Springer.

Leskovec, J. and Faloutsos, C. (2006). Sampling from large

graphs. In Proceedings of the 12th ACM SIGKDD in-

ternational conference on Knowledge discovery and

data mining, pages 631–636. ACM.

Lovász, L. (1993). Random walks on graphs: A survey. Combinatorics, Paul Erdős is Eighty, 2(1):1–46.

McAuley, J. J. and Leskovec, J. (2012). Learning to dis-

cover social circles in ego networks. In NIPS, volume

272, pages 548–556.

Milgram, S. (1967). The small world problem. Psychology

today, 2(1):60–67.

Muchnik, L., Itzhack, R., Solomon, S., and Louzoun, Y.

(2007). Self-emergence of knowledge trees: Extrac-

tion of the wikipedia hierarchies. Physical Review E,

76(1):016106.

Strohmaier, M., Helic, D., Benz, D., Körner, C., and Kern, R. (2012). Evaluation of folksonomy induction algorithms. ACM Trans. Intell. Syst. Technol., 3(4):74:1–74:22.

Trattner, C., Singer, P., Helic, D., and Strohmaier, M.

(2012). Exploring the differences and similarities be-

tween hierarchical decentralized search and human

navigation in information networks. In Proceedings

of the 12th International Conference on Knowledge

Management and Knowledge Technologies, page 14.

ACM.

Wasserman, S. (1994). Social network analysis: Methods

and applications, volume 8. Cambridge university

press.

West, R. and Leskovec, J. (2012). Human wayﬁnding in in-

formation networks. In Proceedings of the 21st inter-

national conference on World Wide Web, pages 619–

628. ACM.

West, R., Pineau, J., and Precup, D. (2009). Wikispeedia:

An online game for inferring semantic distances be-

tween concepts. In IJCAI, pages 1598–1603.

WEBIST 2016 - 12th International Conference on Web Information Systems and Technologies