INEXACT GRAPH MATCHING THROUGH GRAPH COVERAGE
Lorenzo Livi, Guido Del Vescovo and Antonello Rizzi
Department of Information Engineering, Electronics and Telecommunications, SAPIENZA University,
Via Eudossiana 18, 00184 - Rome, Italy
Keywords:
Inexact graph matching, Graph kernels, Tensor product, Graph classification system.
Abstract:
In this paper we propose a novel inexact graph matching procedure called graph coverage, to be used in supervised and unsupervised data-driven modeling systems. It relies on the tensor product between graphs, since the resulting product graph is known to encode the similarity of the two input graphs. The graph coverage is defined on the basis of the concept of graph weight, computed on the weighted adjacency matrix of the tensor product graph. We report experimental results concerning two distinct performance evaluations. Since for practical applications the computing time of any inexact graph matching procedure must be feasible, the first tests were conceived to measure the average computing time as the average size of a random sample of fully-labeled graphs increases. The second evaluation measures the accuracy of the proposed dissimilarity measure when used as the core of a classification system based on the k-NN rule. Overall, the graph coverage shows encouraging results as a dissimilarity measure.
1 INTRODUCTION
A great number of Pattern Recognition problems coming from real-world applications must cope with structured patterns, such as digital images, audio signals and chemical compounds. Each such pattern can be represented as a labeled graph, where vertices and edges are equipped with complex labels able to encode different kinds of information. Developing classification systems able to cope with labeled graphs is therefore a fundamental step. Consequently, the field of Graph-based Pattern Recognition is growing fast, aiming at the establishment of efficient and effective Pattern Recognition techniques on the domain of graphs. The generalization capability of any classification system (being a particular data-driven modeling problem) strictly depends on the way the inductive inference is defined and computed, which in turn is fixed by choosing a dissimilarity (or similarity) measure between input patterns. Such a measure is therefore the most important procedure in any inductive modeling system. When dealing with graphs as input patterns, defining a dissimilarity measure is not trivial, especially in the case of fully-labeled graphs, where, in general, vertices and edges can be labeled even with complex data structures (text excerpts, audio sequences, images, and so on). Therefore, any dissimilarity measure for graphs must be able to cope with both topological and label-related information.
A labeled graph is a tuple G = (V, E, µ, ν), where V is the (finite) set of vertices, E ⊆ V × V is the set of edges, µ : V → L_V is the vertex labeling function, with L_V the set of vertex labels, and ν : E → L_E is the edge labeling function, with L_E the set of edge labels. The cardinalities of V and E are called the order and the size of the graph, respectively. The adjacency matrix of the graph is denoted by A: if two vertices v_i, v_j are connected by an edge e_ij, then A_ij = 1, otherwise A_ij = 0. A (labeled) graph is said to be weighted if L_E ⊆ ℝ, with W_ij = ν(e_ij) known as the weighted adjacency matrix. In the current scientific literature it is
possible to distinguish three mainstream approaches
for the inexact graph matching problem: Graph Edit
Distance (Neuhaus et al., 2006; Riesen and Bunke,
2009), Graph Embedding (Del Vescovo and Rizzi,
2007; Riesen and Bunke, 2010) and Graph Kernels
(Borgwardt et al., 2005; Gärtner, 2008). In the fol-
lowing we propose a dissimilarity measure between
graphs named Graph Coverage by defining a particu-
lar way to perform an inexact graph matching proce-
dure belonging to the graph kernels based family.
The paper is organized as follows. We explain the proposed inexact graph matching procedure in Section 2. In Section 3 we describe and discuss the tests performed to evaluate its performance, considering both
qualitative and quantitative indexes. Conclusions are drawn in Section 4, where we also outline possible future directions for improving computational efficiency.
2 THE PROPOSED INEXACT
GRAPH MATCHING MEASURE
In this section we explain the proposed inexact graph matching method, which fits well in the domain of graph kernels. For this purpose, we briefly introduce the context of graph kernels and the tensor product of graphs.
A symmetric continuous function k : X × X → ℝ is called a positive definite kernel if

    Σ_{i,j=1}^{n} c_i c_j k(x_i, x_j) ≥ 0

holds, with c_i, c_j ∈ ℝ, for each finite n ≥ 2. Graph
kernels are functions defined over the domain of
graphs, X = G. They rely on the representation of the
graph in an implicitly defined high dimensional fea-
ture space. The reproducing property of valid kernel functions is of fundamental importance in the Machine Learning and Pattern Recognition domains, and the relation k(x, z) = ⟨Φ(x), Φ(z)⟩, x, z ∈ X, is referred to as the kernel trick (Schölkopf and Smola, 2002). So, the effort can be focused solely on the definition of a valid kernel function for the specific input domain X (e.g. graphs, X = G).
The tensor product (also called direct product) (Imrich and Klavžar, 2000) of two graphs G_1 and G_2, denoted by G_× = G_1 ⊗ G_2, is defined as

    V_× = { (v_i, u_r) : v_i ∈ V_1, u_r ∈ V_2 }                                   (1)
    E_× = { ((v_i, u_r), (v_j, u_s)) : (v_i, v_j) ∈ E_1 ∧ (u_r, u_s) ∈ E_2 }
In the following, we will call G_× the tensor product graph. In the context of Pattern Recognition, the key issue is to define a meaningful dissimilarity measure between patterns. Consequently, the standard tensor product formulation has been adapted to take into account also the similarities of the labels of both vertices and edges (Borgwardt et al., 2005). The resulting adjacency matrix of the tensor product graph G_× can be computed using a valid kernel function k(·, ·), defined as the product of three valid kernel functions for vertex and edge labels:

    k((v_i, u_r), (v_j, u_s)) = k_V(v_i, u_r) · k_E((v_i, v_j), (u_r, u_s)) · k_V(v_j, u_s)    (2)

For example, k_V(·, ·) and k_E(·, ·) can be evaluated as Gaussian Radial Basis Function kernels of the type

    k_V(v_i, u_r) = exp( −d(µ(v_i), µ(u_r))² / (2σ_V²) )                          (3)

where d(·, ·) is a suitable dissimilarity measure for the specific label domain.
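To make the construction concrete, the following minimal sketch (in Python with NumPy, for illustration only, not the SPARE C++ implementation) builds the weighted adjacency matrix of the tensor product graph from Equations 1-3. The names rbf and tensor_product_weights, and the assumed label layout (real vectors for vertex labels, edge labels stored in dictionaries keyed by vertex index pairs), are illustrative assumptions, not part of the original formulation.

import numpy as np

def rbf(a, b, sigma):
    # Gaussian RBF kernel of Equation 3 on real-vector labels.
    d = np.linalg.norm(np.asarray(a) - np.asarray(b))
    return np.exp(-d ** 2 / (2.0 * sigma ** 2))

def tensor_product_weights(A1, A2, mu1, mu2, nu1, nu2, sigma_v, sigma_e):
    # Weighted adjacency of G_x = G1 (x) G2 (Equations 1-2).
    # A1, A2: binary adjacency matrices; mu1, mu2: per-vertex label vectors;
    # nu1, nu2: edge labels as dicts keyed by vertex index pairs (assumed layout).
    n, m = A1.shape[0], A2.shape[0]
    W = np.zeros((n * m, n * m))
    for i in range(n):
        for j in range(n):
            if not A1[i, j]:
                continue
            for r in range(m):
                for s in range(m):
                    if not A2[r, s]:
                        continue  # a product edge needs an edge in both factors
                    W[i * m + r, j * m + s] = (
                        rbf(mu1[i], mu2[r], sigma_v)
                        * rbf(nu1[(i, j)], nu2[(r, s)], sigma_e)
                        * rbf(mu1[j], mu2[s], sigma_v))
    return W

Here a product-graph vertex (v_i, u_r) is mapped to the flat index i·m + r, so the resulting matrix has order n·m, consistent with Equation 1.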
2.1 Graph Coverage
Given two arbitrarily labeled input graphs, say G_1 and G_2, we compute the tensor product graph G_× as shown in Equation 2, using the product of three Gaussian kernel functions as the basic similarity scheme. We then analyze the characteristics of the resulting graph G_× with respect to the optimal tensor product graph achievable from the two input graphs. Given two input graphs G_1 and G_2, three tensor product graphs are possible, namely G_×^(1,2) = G_1 ⊗ G_2, G_×^(1,1) = G_1 ⊗ G_1 and G_×^(2,2) = G_2 ⊗ G_2. The graphs G_×^(1,1) and G_×^(2,2) are the best matching tensor product graphs, that is, they represent the cases where G_2 is exactly equal to G_1 and vice versa. We now introduce the notion of the weight of a graph.
Definition 1 (Weight of a Graph). Given a weighted graph G of order n, we define its weight, denoted by W(G), as

    W(G) = Σ_{i=1}^{n} Σ_{j=1}^{n} W_ij ≥ 0
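As a sketch, the weight of Definition 1 is simply the grand sum of the weighted adjacency matrix; assuming the matrix is held as a NumPy array, it reduces to a single reduction:

import numpy as np

def graph_weight(W):
    # Weight of a weighted graph (Definition 1): sum of all entries of its
    # weighted adjacency matrix; non-negative when entries lie in [0, 1].
    return float(np.sum(W))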
Our tensor product graph G_× is actually a weighted graph, with ν(e_ij) ∈ [0, 1] for every e_ij ∈ E(G_×). The maximum theoretically achievable weight of such a tensor product graph G_× is therefore upper bounded by its size, that is, W(G_×) ≤ |E_×|. Given two input labeled graphs G_1 and G_2, the maximum theoretical weight for G_× = G_1 ⊗ G_2 is achievable if and only if |L_V| = |L_E| = 1 holds. The aim of the graph coverage measure is to quantify how much the two given input graphs G_1 and G_2 overlap, considering both topology and labels. This is done by comparing the weight of the tensor product graph G_×^(1,2) against the actual maximum weight achievable from G_1 and G_2. Formally,
Definition 2 (Graph Coverage). Given two arbitrarily labeled graphs G_1 and G_2, let G_×^(1,2), G_×^(1,1) and G_×^(2,2) be their possible tensor product graphs. We define the coverage of G_1 and G_2 as

    κ(G_1, G_2) = W(G_×^(1,2)) / max{ W(G_×^(1,1)), W(G_×^(2,2)) }                (4)
It is easy to see that the function shown in Equation 4 is symmetric and normalized in [0, 1]. It is symmetric because the tensor product and max{·, ·} operators are both commutative. It is indeed a valid kernel function, because it basically consists of products and sums of valid kernel functions of the type shown in Equation 2. It can also be converted very easily into a dissimilarity by setting, for example,

    d(G_1, G_2) = 1 − κ(G_1, G_2)                                                 (5)
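Putting Definitions 1 and 2 together, a minimal sketch of the coverage and its derived dissimilarity could look as follows; it assumes the hypothetical tensor_product_weights helper sketched in Section 2 is used to produce the three weighted adjacency matrices as NumPy arrays.

def graph_coverage(W12, W11, W22):
    # Equation 4: weight of G_x^(1,2) normalized by the larger of the
    # two self-product weights.
    w_max = max(W11.sum(), W22.sum())
    return float(W12.sum()) / w_max if w_max > 0 else 0.0

def coverage_dissimilarity(W12, W11, W22):
    # Equation 5: turn the normalized similarity into a dissimilarity.
    return 1.0 - graph_coverage(W12, W11, W22)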
If two graphs are equal, their coverage is maximal, that is, 1. Conversely, if they are completely different, their coverage tends to 0. A dual behavior holds for the dissimilarity formulation shown in Equation 5. The number of parameters on which it depends derives from the nature of the basic similarity functions, that is, from the particular kernel functions used in Equation 2. The computational complexity of the graph coverage is dominated by the computation of the tensor product between the two input graphs, that is, between their adjacency matrices. If n and m are, respectively, the orders of G_1 and G_2, then the computational complexity of the tensor product between G_1 and G_2 is of order O(n²m²). Therefore, the computational complexity of the graph coverage is O(d(n⁴ + m⁴ + n²m²)), where d is the cost of each basic kernel computation, since the two self-products G_×^(1,1) and G_×^(2,2) cost O(n⁴) and O(m⁴), respectively. So, from the computational complexity point of view, the graph coverage formulation is faster than other graph kernels based on the convolution of (random) walks. Figure 1 shows the computing time achieved over a random sample of graphs with 10-dimensional and 20-dimensional real vectors as vertex and edge labels, respectively. As can be observed, up to an order of 45, with an average size of 256, the average computation time remains under 1 second.
Figure 1: Time Performance Plot.
3 K-NN CLASSIFICATION
In this section, we report the evaluations over the IAM datasets (Riesen and Bunke, 2008). The tests were executed on a machine with an Intel(R) Core(TM)2 Quad CPU Q6600 at 2.40 GHz and 4 GB of RAM, running a Linux OS. The graph coverage is implemented in C++, as are the other software components of the SPARE library (Del Vescovo et al., ). The execution time in each test is measured using the clock() function defined in ctime. To evaluate the pure performance of the graph coverage measure, we tested the classification accuracy using the k-NN classifier of the SPARE library over some IAM datasets. The k-NN classifier is a very simple tool that relies entirely on the dissimilarity evaluation of the input patterns, so the classification performance of the graph coverage can be assessed directly. Table 1 shows the results achieved over
some IAM datasets, namely Letter LOW, Letter MED, Letter HIGH, AIDS, Fingerprint and COIL-DEL. The Letter LOW, Letter MED and Letter HIGH datasets each consist of a training, a validation and a test set of 750 patterns. The first is composed of letters with a low level of distortion, while the patterns of the second and the third are affected by medium and high levels of distortion, respectively. Each dataset contains equally-distributed patterns from 15 different classes. The AIDS dataset is a two-class set of graphs, not equally distributed, with 250, 250 and 1500 samples for the training, validation and test set, respectively. The Fingerprint dataset contains graphs from four different classes with 500, 300 and 2000 samples for the training, validation and test set, respectively; the graphs are not equally distributed among the four classes. Finally, the COIL-DEL dataset is composed of 2400, 500 and 1000 graphs for the training, validation and test set, respectively, equally distributed among 100 classes. The last column of Table 1, labeled ms, contains the average computation time, expressed in milliseconds, of each single graph coverage computation. The results are shown for different values of the parameter k of the nearest neighbor classifier (k = 1, ..., 10). The results in Table 1 were obtained by learning the three σ parameters of the Gaussian RBF kernels of Equation 3 with a genetic algorithm-based optimization procedure carried out over the validation set. A generally stable behavior can be observed across the different values of k. In particular, we observe very good results on the three Letter datasets and on AIDS. Conversely, on COIL-DEL and Fingerprint we observe a low classification accuracy. This result is, in some sense, unexpected, considering the average results achieved elsewhere. A deeper analysis is therefore warranted, and we will proceed with further investigations aimed at better characterizing the effectiveness of the proposed graph matching procedure with respect to the properties of the classification problems.
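For reference, the k-NN rule used throughout this section needs nothing beyond the pairwise dissimilarity. A minimal sketch (illustrative Python, not the SPARE C++ classifier), assuming a callable dissim such as the coverage-based dissimilarity of Equation 5:

import numpy as np

def knn_classify(test_graph, train_graphs, train_labels, k, dissim):
    # Rank training patterns by dissimilarity to the test pattern
    # and take a majority vote among the k nearest ones.
    dists = np.array([dissim(test_graph, g) for g in train_graphs])
    nearest = np.argsort(dists)[:k]
    votes = {}
    for idx in nearest:
        votes[train_labels[idx]] = votes.get(train_labels[idx], 0) + 1
    return max(votes, key=votes.get)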
Table 1: Classification accuracy with k-NN.

Dataset       K=1    K=3    K=5    K=7    K=10   ms
Letter L      98.80  98.40  97.73  98.93  98.80  0.103
Letter M      80.53  79.60  79.86  81.86  82.93  0.105
Letter H      74.00  71.86  75.60  75.87  78.40  0.153
AIDS          96.06  94.40  93.53  93.00  89.94  2.338
Fingerprints  39.67  37.38  38.95  39.15  38.24  0.573
COIL-D        37.50  12.50  12.50  12.50  12.50  32.94

Table 2 shows the best results, taken from (Riesen and Bunke, 2009, Table 2), achieved by different algorithms over some of the IAM graph databases. The algorithm named Heuristic-A* (Neuhaus et al., 2006) computes the exact GED
using heuristic information. The algorithm named Beam(10) is an approximate variant of it proposed in (Neuhaus et al., 2006); 10 is the value of the parameter s chosen for the algorithm execution. The algorithm named BP is a fast bipartite graph matching procedure proposed in (Riesen and Bunke, 2009). Finally, GC stands for the Graph Coverage algorithm. Note that in Table 2 the Heuristic-A* algorithm is unable to produce a result on some datasets, due to its computational limits.

The results achieved using the k-NN rule over the IAM datasets clearly show the validity of the proposed method with respect to some of the state-of-the-art methodologies.
Table 2: Classification results over the Letter LOW (L-L), Letter MED (L-M), Letter HIGH (L-H), AIDS, Fingerprints (F) and COIL (C) datasets.

Algorithm      L-L   L-M   L-H   AIDS  F     C
Heuristic-A*   91.0  77.9  63.0  -     -     93.3
Beam(10)       91.1  78.5  63.9  96.2  84.6  93.3
BP             91.1  77.6  61.6  97.0  78.7  93.3
GC             98.9  83.2  78.4  96.0  39.6  37.5
4 CONCLUSIONS AND FUTURE
DIRECTIONS
In this paper we have proposed a novel inexact graph matching procedure. It is simple in its formulation and at the same time effective, relatively fast and flexible. This is, indeed, the really interesting contribution of the proposed method compared with the other available graph kernels. In fact, it is worth stressing that the graph coverage is able to deal also with fully-labeled graphs, where vertex and edge labels can even be complex data structures, once valid kernel functions defined on these domains, to be used as similarity measures, are provided. The proposed procedure shows interesting preliminary results, considering both classification accuracy and computational performance. We are planning to test this algorithm over more shared benchmarking graph-based datasets. Moreover, the proposed inexact graph matching procedure is based on the tensor product between graphs. This product is a mathematically solid and property-rich operation, basically founded on multiple products between matrices and scalars. Therefore, the procedure is well suited to implementation on relatively inexpensive parallel computing devices, such as Graphics Processing Units (GPUs) or Field Programmable Gate Arrays (FPGAs). Taking advantage of these technologies, our effort will be focused on the formulation of a more efficient graph coverage procedure, able to deal with graphs of order 200 and beyond in a reasonable computing time.
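As a final illustration of why the operation parallelizes well: for unlabeled graphs, the adjacency matrix of the tensor product graph is exactly the Kronecker product of the factor adjacency matrices, a pure matrix-algebra primitive. The snippet below (NumPy, for illustration only) shows the construction on two small graphs.

import numpy as np

# Adjacency matrices of two small unlabeled graphs.
A1 = np.array([[0, 1],
               [1, 0]])            # G1: a single edge
A2 = np.array([[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]])         # G2: a path on three vertices

# For unlabeled graphs, A(G1 (x) G2) = kron(A1, A2): dense matrix algebra,
# hence well suited to GPU/FPGA implementations.
A_prod = np.kron(A1, A2)
print(A_prod.shape)                # (6, 6): order n*m, as expected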
REFERENCES
Borgwardt, K. M., Ong, C. S., Schönauer, S., Vishwanathan, S. V. N., Smola, A. J., and Kriegel, H.-P. (2005). Protein function prediction via graph kernels. Bioinformatics, 21:47-56.
Del Vescovo, G., Livi, L., Rizzi, A., and Frattale Mascioli, F. M. SPARE: Something for Pattern Recognition. Submitted for publication to: Journal of Machine Learning Research, Microtome Publishing.
Del Vescovo, G. and Rizzi, A. (2007). Automatic classi-
fication of graphs by symbolic histograms. In Pro-
ceedings of the 2007 IEEE International Conference
on Granular Computing, GRC ’07, pages 410–416,
San Jose, CA, USA. IEEE Computer Society.
Gärtner, T. (2008). Kernels for Structured Data, volume 72 of Series in Machine Perception and Artificial Intelligence. World Scientific.
Imrich, W. and Klavžar, S. (2000). Product Graphs: Structure and Recognition. Wiley-Interscience Series in Discrete Mathematics and Optimization. Wiley.
Neuhaus, M., Riesen, K., and Bunke, H. (2006). Fast sub-
optimal algorithms for the computation of graph edit
distance. In Structural, Syntactic, and Statistical Pat-
tern Recognition. LNCS, pages 163–172. Springer.
Riesen, K. and Bunke, H. (2008). IAM graph database repository for graph based pattern recognition and machine learning. In Proceedings of the 2008 Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition, SSPR & SPR '08, pages 287-297, Berlin, Heidelberg. Springer-Verlag.
Riesen, K. and Bunke, H. (2009). Approximate graph
edit distance computation by means of bipartite graph
matching. Image Vision Comput., 27:950–959.
Riesen, K. and Bunke, H. (2010). Graph Classification and
Clustering Based on Vector Space Embedding. Se-
ries in Machine Perception and Artificial Intelligence.
World Scientific Pub Co Inc.
Schölkopf, B. and Smola, A. (2002). Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. Adaptive Computation and Machine Learning. MIT Press.