Modelling and Analysing Social Networks through Formal Methods and Heuristic Searches

Antonella Santone

and Gigliola Vaglini

Dipartimento di Ingegneria, University of Sannio, Benevento, Italy

Dipartimento di Ingegneria della Informazione, University of Pisa, Pisa, Italy

Keywords:

Social Networks, Formal Methods, Heuristic Searches.

Abstract:

The paper presents a process-algebraic approach to the formal specification and verification of social networks. Networks are described using the Calculus of Communicating Systems (CCS), and the resulting formal systems are analysed and verified by directed model checking, which uses AI-inspired heuristic search strategies to improve model checking techniques.

1 INTRODUCTION

Recently, great interest has arisen in methods for analysing complex social networks, from communication networks, to friendship networks, to professional and organisational networks. A social network is a set of

people (or organizations or other social entities) con-

nected by a set of social relationships, such as friend-

ship, co-working or information exchange. Social

network analysis focuses on the analysis of patterns of

relationships among people, organizations, states and

such social entities. Social network analysis provides

both a visual and a mathematical analysis of human

relationships. The Web can also be considered as a

social network. Social networks are formed between

Web pages by hyperlinking to other Web pages.

Heuristic search (Pearl, 1984) is one of the classi-

cal techniques in Artiﬁcial Intelligence and has been

applied to a wide range of problem-solving tasks in-

cluding puzzles, two player games, and path ﬁnding

problems. A key assumption of heuristic search is that

we can assign a utility or cost to each state. This cost

guides the search suggesting the next state to expand;

in this way the most promising paths are considered

ﬁrst.

Model checking (Clarke et al., 2001) is a method

to formally and automatically verify the correctness

of ﬁnite-state concurrent and distributed systems. As

model checking can be seen as a search in a state

space, heuristics can be exploited to explore state

spaces and verify properties. This approach is known

as directed model checking (Edelkamp et al., 2001;

Santone, 2003).

In this paper an application of directed model

checking to social networks is presented. First social

networks are described using the Calculus of Com-

municating Systems (CCS) of Milner (Milner, 1989).

Then, we consider some interesting properties of social networks and we show how these properties can

be expressed in a temporal logic and veriﬁed using

either model checking or heuristic search. As an ex-

ample of heuristic search, we analyse the social dis-

tance: “how far (in terms of social distance) an actor

is from others”. In fact, the connections of an actor’s

social neighbours can be very important, even if the

actor is not directly connected to them. We define an admissible heuristic function that is syntactic, i.e., based on the CCS specification only, and can be computed automatically. With an admissible

heuristic function, the A* algorithm is guaranteed to

ﬁnd the minimal social distance. This is a prelimi-

nary work to formalise a social network in a simple

way. Our interest is in showing how formal methods

can be applied in this ﬁeld. As future work we want

to enrich this model to capture more aspects.

2 THE PROPOSED METHOD

The term network has different meanings in different

disciplines. In the social sciences, a network is usu-

ally deﬁned as a set of actors (or agents, or nodes,

or vertices) that may have relationships (or links, or

edges, or ties) with one another; see Fig. 1, which is the running example of this paper.


Santone A. and Vaglini G.

Modelling and Analysing Social Networks through Formal Methods and Heuristic Searches.

DOI: 10.5220/0004068903360339

In Proceedings of the 7th International Conference on Software Paradigm Trends (ICSOFT-2012), pages 336-339

ISBN: 978-989-8565-19-8

 2012 SCITEPRESS (Science and Technology Publications, Lda.)

Figure 1: A simple example of network.

The two most common ways of representing social

network data are by drawing the network and by us-

ing matrices. This section presents a representation of

social networks through CCS and interesting proper-

ties are formalised by using the selective mu-calculus

logic (Barbuti et al., 1999). The purpose is to make formal verification environments usable for the analysis of social networks as well. In this preliminary work social networks are formalised in a simple way, as discussed below; as future work, the model can be enriched to capture more aspects. Each node N, with k outgoing arcs, is represented as the following CCS process:

$$N \stackrel{\mathrm{def}}{=} \sum_{i=1}^{k} n.M_i$$

where $M_i$, for $i \in [1..k]$, are the immediate successors

of N. Note that all arcs of a node N are labelled with

the corresponding lower-case letter n.

For example, the CCS process representing the network in Fig. 1 is:

$$A \stackrel{\mathrm{def}}{=} a.B + a.C \qquad B \stackrel{\mathrm{def}}{=} b.C \qquad C \stackrel{\mathrm{def}}{=} c.B$$

Clearly, many networks can be composed using

the CCS parallel operator “|”.
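The encoding of nodes as CCS constants can be sketched in a few lines of Python. This is only an illustration: the function name `to_ccs`, the dictionary representation of the network, and the choice of printing `nil` for a node without outgoing arcs (the empty sum, k = 0) are assumptions made here, not part of the paper.

```python
def to_ccs(network):
    """Render each node N with successors M_1..M_k as 'N = n.M_1 + ... + n.M_k'."""
    definitions = []
    for node, succs in network.items():
        label = node.lower()  # all arcs leaving N carry the lower-case letter n
        # Assumption: a node with no outgoing arcs is rendered as the empty sum, nil.
        body = " + ".join(f"{label}.{m}" for m in succs) if succs else "nil"
        definitions.append(f"{node} = {body}")
    return definitions


# Network of Fig. 1 as an adjacency dictionary (node -> immediate successors).
fig1 = {"A": ["B", "C"], "B": ["C"], "C": ["B"]}
for line in to_ccs(fig1):
    print(line)
```

Each printed line corresponds to one constant definition of the running example.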

2.1 Basic Properties of Networks

In this section we ﬁrst recall some important proper-

ties of networks and we formalise them using selec-

tive mu-calculus logic.

Size of a network. The size of a network can be

determined in terms of the number of nodes of the

network (as stated in (Hanneman and Riddle, 2005))

or, alternatively, as the number of edges in the net-

work. In the network of our running example (shown

in Fig. 1) the number of nodes is 3, while the number

of edges is 4. Any verification environment offers commands that report the number of both states and edges of a process, so the size is obtained directly.

Social distance. The connections of an actor’s so-

cial neighbours can be very important, even if the ac-

tor is not directly connected to them. In other words,

sometimes being a friend of a friend may be quite consequential. To capture this aspect of how individuals

are embedded in networks, one approach is to exam-

ine how far (in terms of social distance) an actor is

from others. The distance between two actors is the minimum number of edges that it takes to go from one to the other.
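As a baseline for the definition of distance, an uninformed breadth-first search already computes it on a plain graph. The sketch below is not the paper's method; `social_distance` and the adjacency dictionary are hypothetical names used only for illustration.

```python
from collections import deque


def social_distance(network, source, target):
    """Minimum number of edges from source to target, or None if unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nxt in network.get(node, []):
            if nxt not in dist:  # first visit is along a shortest path
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None


# Network of Fig. 1 (node -> immediate successors).
fig1 = {"A": ["B", "C"], "B": ["C"], "C": ["B"]}
print(social_distance(fig1, "A", "C"))
```

Breadth-first search visits every state up to the target's depth; the heuristic search of Section 3 aims at expanding fewer states.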

Reachability. Reachability between nodes is es-

tablished by the existence of a path between the

nodes. In simpler words, an actor is reachable by an-

other if there exists a set of connections by which we

can go from the source to the target actor, regardless

of how many others fall between them. In general, the

property: “node Y is reachable from node X” can be

expressed with the following logic formula:

$$\langle x \rangle_{\emptyset}\, \langle y \rangle_{\emptyset}\, \mathtt{tt} \qquad (1)$$

On the other hand, the property: “node Y is reachable

from node X, without crossing nodes in the set S” can

be expressed with the following logic formula:

$$\langle x \rangle_{\emptyset}\, \langle y \rangle_{S}\, \mathtt{tt} \qquad (2)$$

Recall the network shown in Fig. 1:

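• “C is reachable from A”. The logic formula is: $\psi_1 = \langle a \rangle_{\emptyset}\, \langle c \rangle_{\emptyset}\, \mathtt{tt}$. It holds that $A \models \psi_1$ (instantiation of (1)).

• “it is possible that C is reachable from A not crossing node B”. The logic formula is: $\psi_2 = \langle a \rangle_{\emptyset}\, \langle c \rangle_{\{b\}}\, \mathtt{tt}$. It holds that $A \models \psi_2$ (instantiation of (2)).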

Cycle. A cycle is a walk where the beginning and end

point of the walk are the same actor. In general, the

property: “a cycle on actor X” can be expressed with

the following logic formula:

$$\nu Z.\, \langle x \rangle_{\emptyset}\, Z \qquad (3)$$

We formally check, on the CCS process A, the follow-

ing property, which is an instantiation of the previous

formula (3):

• “there exists a cycle on actor B”: $\varphi = \nu Z.\, \langle b \rangle_{\emptyset}\, Z$. It holds that $A \models \varphi$.
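The reachability and cycle formulas can be checked on the running example with a small explicit-state evaluator. The sketch assumes the usual reading of the selective operator ⟨K⟩_R φ (a sequence of actions outside K and R, followed by one action in K, leading to a state satisfying φ) and computes the greatest fixpoint by plain iteration. The names `diamond` and `nu` and the transition-table encoding are hypothetical, and this is not the model checker used by the authors.

```python
def diamond(lts, K, R, phi):
    """States satisfying <K>_R phi: a run of actions outside K and R,
    followed by one action in K, ends in a state of phi."""
    sat = {s for s, moves in lts.items()
           if any(a in K and t in phi for a, t in moves)}
    skip = K | R  # actions that may not occur before the final K-action
    changed = True
    while changed:
        changed = False
        for s, moves in lts.items():
            if s not in sat and any(a not in skip and t in sat for a, t in moves):
                sat.add(s)
                changed = True
    return sat


def nu(lts, step):
    """Greatest fixpoint of a monotone step function over sets of states."""
    current = set(lts)
    while True:
        nxt = step(current)
        if nxt == current:
            return current
        current = nxt


# Transition table of Fig. 1: state -> list of (action, successor).
fig1 = {"A": [("a", "B"), ("a", "C")], "B": [("b", "C")], "C": [("c", "B")]}
everything = set(fig1)
empty = frozenset()

psi1 = diamond(fig1, {"a"}, empty, diamond(fig1, {"c"}, empty, everything))
psi2 = diamond(fig1, {"a"}, empty, diamond(fig1, {"c"}, {"b"}, everything))
phi = nu(fig1, lambda Z: diamond(fig1, {"b"}, empty, Z))
print("A" in psi1, "A" in psi2, "A" in phi)
```

Each formula is evaluated to the set of states that satisfy it, so checking a property on A reduces to a membership test.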

3 HEURISTIC BASED METHOD FOR CALCULATING SOCIAL DISTANCE

The problem of finding the minimal distance between two actors can be seen as a search problem: therefore, heuristic search (Pearl, 1984) taken from Artificial Intelligence can be used. We assume the reader is familiar with heuristic search algorithms, like A* and Greedy. A* returns a minimal-cost solution path if the heuristic estimate function $\hat{h}$ satisfies the so-called admissibility condition, i.e., $\hat{h}$ is optimistic: it never overestimates the actual cost of reaching the goal.

Now, we deﬁne an admissible heuristic function able

ModellingandAnalysingSocialNetworksthroughFormalMethodsandHeuristicSearches

337

to find the minimum social distance from p to the node N. The heuristic we present is based on counting the number of actions that a process can perform before reaching a node N. Recall that in the CCS formalisation a node N has outgoing arcs labelled with n. We suppose that each edge has cost equal to 1. We formally define the function $\hat{h}(p)$.

Definition 1 ($\hat{h}(p)$). Let p be a CCS process, $\mathcal{S}$, $\mathcal{S}'$ sets of visible actions, n the visible action of the node N to reach, and $\mathcal{C}$ a set of pairs $\{\langle x, \mathcal{S}' \rangle\}$, where x is a constant occurring in p. First, we define the auxiliary function $\hat{h}(p, \mathcal{S}, \mathcal{C})$, inductively on p, as follows. Then, $\hat{h}(p) = \hat{h}(p, \emptyset, \emptyset)$.

R1. $\hat{h}(nil, \mathcal{S}, \mathcal{C}) = \infty$.

For p = nil the function returns ∞, as this is not the node N.

R2. if $\alpha \in \mathcal{S} \cup \{n\}$ then $\hat{h}(\alpha.p, \mathcal{S}, \mathcal{C}) = 0$ else $\hat{h}(\alpha.p, \mathcal{S}, \mathcal{C}) = 1 + \hat{h}(p, \mathcal{S}, \mathcal{C})$.

When applied to α.p the function returns 0 if α is either a restricted action (i.e., $\alpha \in \mathcal{S}$) or the desired action n; otherwise we recursively apply the function to find, if any, the node N. Roughly speaking, if α is restricted by $\mathcal{S}$ then α.p may not be able to move; thus, we optimistically return 0.

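R3. $\hat{h}(p + q, \mathcal{S}, \mathcal{C}) = \min(\hat{h}(p, \mathcal{S}, \mathcal{C}), \hat{h}(q, \mathcal{S}, \mathcal{C}))$.

R4. $\hat{h}(p \mid q, \mathcal{S}, \mathcal{C}) = \min(\hat{h}(p, \mathcal{S}, \mathcal{C}), \hat{h}(q, \mathcal{S}, \mathcal{C}))$.

When either the choice or the parallel composition of two processes is encountered, the minimum number of actions between the two components is returned.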

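R5. $\hat{h}(p \backslash L, \mathcal{S}, \mathcal{C}) = \hat{h}(p, \mathcal{S} \cup L \cup \overline{L}, \mathcal{C})$.

The function is initially applied to a process with $\mathcal{S} = \emptyset$, which is modified when the function is applied to $p \backslash L$ by adding the actions in $L \cup \overline{L}$ to $\mathcal{S}$.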

R6. $\hat{h}(p[f], \mathcal{S}, \mathcal{C}) = \hat{h}(p, f^{-1}(\mathcal{S} \cup \{n\}), \mathcal{C})$.

When considering a relabelled process we must take as set of actions the set $f^{-1}(\rho) = \{\alpha \mid f(\alpha) \in \rho\}$, since now the interesting actions are also those relabelled by f into actions in $\mathcal{S}$ or into n.

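R7. Let $x \stackrel{\mathrm{def}}{=} p$. If $\langle x, \mathcal{S} \rangle \in \mathcal{C}$ then $\hat{h}(x, \mathcal{S}, \mathcal{C}) = \infty$ else $\hat{h}(x, \mathcal{S}, \mathcal{C}) = \hat{h}(p, \mathcal{S}, \mathcal{C} \cup \{\langle x, \mathcal{S} \rangle\})$.

We expand the body p of each constant x only once, since each constant already expanded is stored in $\mathcal{C}$. Initially, $\mathcal{C}$ is equal to the empty set. We return ∞ when we encounter a constant already expanded.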

The following theorem ensures the admissibility of the heuristic function.

Theorem 1. Let p be a CCS process and s be a state. It holds that $\hat{h}(s) \le h^{*}(s)$, where $h^{*}(s)$ is the actual cost of a preferred path from s to a goal node.
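Why this inequality is what A* needs can be recalled with the standard argument from the heuristic search literature (Pearl, 1984). The symbols $g$, $g^{*}$, $f$, $c^{*}$ and $t$ below are notation assumed for this sketch; they do not appear in the definition above.

```latex
% Assumed notation: g(s) is the cost of the path found so far from the start to s,
% g^{*}(s) the cost of a minimal path from the start to s,
% f(s) = g(s) + \hat{h}(s), and c^{*} the cost of a minimal path to a goal.
\begin{align*}
  &\text{Suppose A* selects a goal } t \text{ with } f(t) = g(t) > c^{*}. \\
  &\text{Until termination, some state } s \text{ on a minimal path is open with } g(s) = g^{*}(s). \\
  &f(s) = g^{*}(s) + \hat{h}(s) \;\le\; g^{*}(s) + h^{*}(s) \;=\; c^{*} \;<\; g(t) = f(t). \\
  &\text{Hence } s \text{ would have been selected before } t \text{, a contradiction.}
\end{align*}
```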

4 AN EXAMPLE

Let us apply our approach to a social network to ob-

tain the minimal social distance between two actors.

The node expansion terminates when a goal node is

reached. The CCS deﬁnition of the social network p

is the parallel composition of two sub-nets A and X

with synchronisation on the action c.

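$$A \stackrel{\mathrm{def}}{=} a.B \qquad B \stackrel{\mathrm{def}}{=} b.F + b.C \qquad C \stackrel{\mathrm{def}}{=} c.D + c.H + c.E$$

$$H \stackrel{\mathrm{def}}{=} h.I \qquad F \stackrel{\mathrm{def}}{=} f.M \qquad M \stackrel{\mathrm{def}}{=} m.N$$

$$N \stackrel{\mathrm{def}}{=} n.Q \qquad Q \stackrel{\mathrm{def}}{=} q.E \qquad D \stackrel{\mathrm{def}}{=} d.C$$

$$I \stackrel{\mathrm{def}}{=} i.E \qquad E \stackrel{\mathrm{def}}{=} e.I \qquad X \stackrel{\mathrm{def}}{=} x.Z$$

$$Z \stackrel{\mathrm{def}}{=} z.E \qquad p \stackrel{\mathrm{def}}{=} (A \mid X[\bar{c}/z]) \backslash \{c\}$$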

(A | X [c/z ])\{c}

a x

(B | X [c/z ])\{c} (A | Z [c/z ])\{c}

(B | Z [c/z ])\{c}

(F | Z [c/z ])\{c}(C | Z [c/z ])\{c}

(D | E [c/z ])\{c}

(H | E [c/z ])\{c} (E | E [c/z ])\{c}

(D | I [c/z ])\{c}

τ τ

Figure 2: A simple example with Greedy and

Suppose that we want to obtain the minimal social distance between A and E. We apply the Greedy strategy. The application of A* is immediate: it is sufficient to use the evaluation function $f(s) = g(s) + \hat{h}(s)$. In Fig. 2 the $\hat{h}$-value of each node is reported and the shaded nodes represent expanded states. Actor E can be reached by a minimal path of length 5 (i.e., x − a − b − τ − e). There exists another path reaching actor E with length equal to 7. We have obtained this result generating only 10 states, while the standard transition system of p has 24 states and 39 transitions.
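The search for the path to E can be reproduced with a compact A* over pairs of component states. This sketch makes several assumptions: the table `DEFS`, the pair encoding of the parallel composition, the synthetic `GOAL` state and the function names are all hypothetical, and the estimate used is a simple one-step lookahead standing in for the heuristic of Definition 1 (it is admissible because a state that cannot perform e immediately needs at least two actions).

```python
import heapq
from itertools import count

# Section 4 network: constant -> list of (action, successor).
DEFS = {
    "A": [("a", "B")], "B": [("b", "F"), ("b", "C")],
    "C": [("c", "D"), ("c", "H"), ("c", "E")],
    "H": [("h", "I")], "F": [("f", "M")], "M": [("m", "N")],
    "N": [("n", "Q")], "Q": [("q", "E")], "D": [("d", "C")],
    "I": [("i", "E")], "E": [("e", "I")],
    "X": [("x", "Z")], "Z": [("z", "E")],
}
GOAL = "goal"  # synthetic state entered by performing the target action


def successors(state):
    """Moves of the composition for a pair (left, right): z is relabelled to
    the complement of c in the right component, and c is restricted."""
    left, right = state
    for act, nxt in DEFS[left]:
        if act != "c":                     # c alone is blocked by the restriction
            yield act, (nxt, right)
    for act, nxt in DEFS[right]:
        if act not in ("c", "z"):          # relabelled z is blocked as well
            yield act, (left, nxt)
    for la, ln in DEFS[left]:              # synchronisation yields a tau move
        for ra, rn in DEFS[right]:
            if la == "c" and ra == "z":
                yield "tau", (ln, rn)


def estimate(state, target):
    """Stand-in admissible estimate: 1 if target is enabled now, otherwise 2."""
    if state == GOAL:
        return 0
    return 1 if any(act == target for act, _ in successors(state)) else 2


def a_star(start, target):
    tie = count()                          # tie-breaker so the heap never compares states
    frontier = [(estimate(start, target), next(tie), 0, start, [])]
    best = {start: 0}
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if state == GOAL:
            return path
        for act, nxt in successors(state):
            if act == target:
                nxt = GOAL
            if g + 1 < best.get(nxt, float("inf")):
                best[nxt] = g + 1
                f = g + 1 + estimate(nxt, target)
                heapq.heappush(frontier, (f, next(tie), g + 1, nxt, path + [act]))
    return None


path = a_star(("A", "X"), "e")
print(len(path), path)
```

The returned sequence of actions has the minimal length reported in the text; which of the equally short interleavings of x, a and b is found depends on the tie-breaking order.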

The method has been successfully applied to a real

case study where several functionalities of social net-

works have been modelled such as: registration, add

as friend, news feed.

ICSOFT2012-7thInternationalConferenceonSoftwareParadigmTrends

338

5 RELATED WORK AND CONCLUSIONS

The paper proposes several contributions: (i) an algebraic description of social networks through CCS; (ii) the definition of several properties of social networks (minimum distance, reachability, cycles) using a temporal logic, so that they can be verified through model checking; (iii) an admissible heuristic function for social distance that is easy to compute and is calculated automatically. A consequence of point (i) is that all the

efﬁcient techniques developed in CCS veriﬁcation en-

vironments can be used, as for example compositional

analysis (Clarke et al., 1989; Santone, 2002). More-

over, the result of point (ii) is that we offer a query

language based on temporal logic instead of just a set

of ﬁxed properties; thus the approach allows the spec-

iﬁcation and veriﬁcation of any interesting property

(also not built-in in traditional tools).

At present, the most popular tool for network analysis is UCINET, which is based on mathematical operations on matrices. For example, to prove the property “X is reachable from Y not crossing node Z”, UCINET must find all paths from Y to X and check that Z does not occur in any path. In the presented approach the property is defined in the temporal logic and then automatically verified in the model checking environment.

To evaluate the effectiveness of the proposed

method, we have to consider both complexity and

scalability. From the complexity point of view, it is

easy to show that the complexity of the calculation of $\hat{h}(n)$ is linear in the length of the CCS specification.

The heuristic function is simple to calculate since it

is syntactically deﬁned, i.e., based on the CCS syntax

only; moreover, there is no need for user intervention

or manual efforts to compute it. From the scalabil-

ity point of view, in the work (Gradara et al., 2005)

an experimental study has been carried out to prove

the better scalability of the heuristic-based method

with respect to other techniques generally used (BFS,

DFS, etc.). Even if in that work a different property

(deadlock-freeness in concurrent systems) has been

considered, the results can be useful for a valid comparison. In some case studies the state space was reduced by 55%, with a consequent reduction in time with respect to BFS, for instance.

Moreover, it is well known that most current formal methods apply successfully to small-scale systems but do not scale up well; in this paper, the directed model checking proposed in (Santone, 2003) has been shown to allow formal methods to scale up.

UCINET: http://www.analytictech.com/ucinet/

Recently, the authors in (He et al., 2007) also pro-

pose a process algebraic approach to modeling and

verifying the collective behaviors in social networks

using MWB (Victor and Moller, 1994) with no con-

sideration of the scalability. In (Jamali and Abolhas-

sani, 2006) a state of the art survey of the works done

on social network analysis, ranging from pure mathe-

matical analyses in graphs to analyzing the social net-

works in Semantic Web, is given. The main goal is to

provide a road map for researchers working on differ-

ent aspects of Social Network Analysis.

REFERENCES

Barbuti, R., De Francesco, N., Santone, A., and Vaglini, G. (1999). Selective mu-calculus and formula-based equivalence of transition systems. J. Comput. Syst. Sci., 59(3):537–556.

Clarke, E. M., Grumberg, O., and Peled, D. (2001). Model

checking. MIT Press.

Clarke, E. M., Long, D. E., and McMillan, K. L. (1989).

Compositional model checking. In LICS, pages 353–

362.

Edelkamp, S., Lluch-Lafuente, A., and Leue, S. (2001). Directed explicit model checking with HSF-SPIN. In SPIN, pages 57–79.

Gradara, S., Santone, A., and Villani, M. L. (2005). Us-

ing heuristic search for ﬁnding deadlocks in concur-

rent systems. Inf. Comput., 202(2):191–226.

Hanneman, R. and Riddle, M. (2005). Introduction to social network methods. Riverside, CA: University of California, Riverside (published in digital form at http://www.faculty.ucr.edu/~hanneman/nettext).

He, Z., Yuan, L., and Zeng, G. (2007). A process algebraic

approach to modeling collective behaviors in social

networks. In Proceedings of the Third International

Conference on Semantics, Knowledge and Grid (SKG

2007).

Jamali, M. and Abolhassani, H. (2006). Different aspects

of social network analysis. In Web Intelligence, pages

66–72.

Milner, R. (1989). Communication and concurrency. PHI

Series in computer science. Prentice Hall.

Pearl, J. (1984). Heuristics - intelligent search strategies for

computer problem solving. Addison-Wesley series in

artiﬁcial intelligence. Addison-Wesley.

Santone, A. (2002). Automatic veriﬁcation of concur-

rent systems using a formula-based compositional ap-

proach. Acta Inf., 38(8):531–564.

Santone, A. (2003). Heuristic search + local model check-

ing in selective mu-calculus. IEEE Trans. Software

Eng., 29(6):510–523.

Victor, B. and Moller, F. (1994). The mobility workbench -

a tool for the pi-calculus. In CAV, pages 428–440.

ModellingandAnalysingSocialNetworksthroughFormalMethodsandHeuristicSearches

339