Shape-based Object Retrieval and Classiﬁcation with Supervised

Optimisation

Cong Yang

, Oliver Tiebe

, Pit Pietsch

, Christian Feinen

, Udo Kelter

and Marcin Grzegorzek

Research Group for Pattern Recognition, University of Siegen, Siegen, Germany

Software Engineering and Databases Group, University of Siegen, Siegen, Germany

Keywords:

Object Retrieval, Object Classiﬁcation, Optimisation, Shape Features.

Abstract:

In order to enhance the performance of shape retrieval and classiﬁcation, in this paper, we propose a novel

shape descriptor with low computation complexity that can be easily fused with other meaningful descriptors

like shape context, etc. This leads to a signiﬁcant increase in descriptive power of original descriptors without

adding to much computation complexity. To make the proposed shape descriptor more practical and general,

a supervised optimisation strategy is introduced. The most signiﬁcant scientiﬁc contributions of this paper

includes the introduction of a new and simple feature descriptor with supervised optimisation strategy leading

to the impressive improvement of the accuracy in object classiﬁcation and retrieval scenario.

1 INTRODUCTION

Shape retrieval and classiﬁcation are very important

topics in computer vision. In order to improve the ac-

curacy of shape matching a number of new shape de-

scriptors have been proposed in recent years (Zhang

and Lu, 2004) to effectively ﬁnd perceptually simi-

lar shapes from a database. In addition, some learn-

ing methods (Bai et al., 2010) have been employed on

the top of shape matching algorithms to involve more

shape context information.

However, it is a difﬁcult task to develop appro-

priate shape descriptors and matching algorithms.

Firstly, a desirable shape descriptor should be in-

variant to shape rotation, translation and scaling.

Though skeleton-based matching approaches (Goh,

2008; Hedrich et al., 2013; Bai and Latecki, 2008)

can effectively classify shapes, most of them require

heavy calculation for skeleton pruning (Bai et al.,

2007) and for determining correspondences (Bai and

Latecki, 2008). Secondly, for both shape descrip-

tors and matching algorithms, there are many uncer-

tain parameters that are involved in the feature gener-

ation and matching processes which are hard to be

optimised by experiences among different datasets.

Over the past decade or so, computer scientists have

proposed many meaningful shape descriptors like In-

ner Distance (Ling and Jacobs, 2007), Shape Con-

text (Belongie et al., 2002). However, in order to re-

duce the computing complexity, researchers usually

employ only a partial number of points which are ran-

domly selected or sampled with ﬁxed distances along

the shape contour. This strategy could easily lose crit-

ical features since some vertexes or partial deforma-

tions might be overlooked.

Motivated by the above mentioned problems, in

this paper, we propose a new shape descriptor which

involves both global and partial shape features with

low computation complexity. Moreover, the proposed

shape descriptor can be easily fused with other mean-

ingful descriptors (Belongie et al., 2002; Bai and

Latecki, 2008; Ling and Jacobs, 2007; Chang and

Kimia, 2009; Siddiqi et al., 1998) to enhance the

shape matching and classiﬁcation performance with-

out increasing too much the computation complexity.

The contribution of this paper addresses as well the

challenges mentioned above. Firstly, we introduce a

new shape descriptor in form of a 10-dimensionalfea-

ture vector. This shape descriptor integrates geomet-

rical and topological features with low computation

complexity and is robust to shape deformation. Sec-

ondly, we introduce a matching algorithm in which

feature weights are optimised by a supervised opti-

misation strategy. This strategy can efﬁciently adopt

our feature vector to diverse datasets. Experimental

results demonstrate that the optimised SVM classiﬁer

obtains much better accuracy than the non-optimised

one.

204

Yang C., Tiebe O., Pietsch P., Feinen C., Kelter U. and Grzegorzek M..

Shape-based Object Retrieval and Classiﬁcation with Supervised Optimisation.

DOI: 10.5220/0005186402040211

In Proceedings of the International Conference on Pattern Recognition Applications and Methods (ICPRAM-2015), pages 204-211

ISBN: 978-989-758-076-5

 2015 SCITEPRESS (Science and Technology Publications, Lda.)

2 RELATED WORK

In this section, some existing approaches relating to

shape descriptors for object retrieval and classiﬁca-

tion are discussed. We categorise them into two

groups: contour-based descriptors and skeleton-based

descriptors. The relevant matching algorithms are

also introduced in this section.

2.1 Contour-based Shape Descriptors

Nguyen et al. (Nguyen et al., 2013) propose a shape-

based local binary descriptor for object detection that

has been tested in the task of detecting humans from

static images. In (Cao et al., 2011), an algorithm for

partial shape matching with mildly non-rigid defor-

mations using Markov chains and the Monte Carlo

method is introduced. Both of these methods do

not involve context information for object matching.

Shotton et al. (Shotton et al., 2005) present a cat-

egorical object detection scheme that uses only lo-

cal contour-based features and is realised in a partly

supervised learning framework. However, in this

method many parameters need to be conﬁgured man-

ually. Yang et al. (Yang et al., 2012) formulate the

contour-based object detection as a matching problem

between model contour parts and image edge frag-

ments. They treat this problem as the task of ﬁnding

dominant sets in weighted graphs. Though insensi-

tive to noise and outliers, the approach is not rotation

invariant. Different from the previous contour-based

descriptors, our proposed descriptor combines both

global and local shape features. Moreover,all features

in our descriptor have corresponding weights which

can be easily adapted to different datasets. Lastly, pa-

rameters in our method are automatically optimised

by a supervised optimisation strategy.

2.2 Skeleton-based Shape Descriptors

Compared to contour-based descriptors, skeleton-

based shape descriptors feature lower sensitivity

to occlusion, limb growth, and articulation (Goh,

2008). However, they are computationally more

complex (Sebastian and Kimia, 2001) and still have

not been fully successfully applied to real images.

Baseski et al. (Baseski et al., 2009) present a tree-

edit-based shape matching method that uses a recent

coarse skeleton representation. Their dissimilarity

measure obtains a better result within groups than be-

tween group separation which mimics the asymmet-

ric nature of human similarity judgements. To the

best of our knowledge, the best performing skeleton-

based object matching algorithm has been proposed

by Bai et al. (Bai and Latecki, 2008). Their main

idea is to match skeleton graphs by comparing the

geodesic paths between skeleton endpoints. Unfor-

tunately, the performance of this method is limited

to the presence of large protrusions, since they re-

quire skipping a large number of skeleton endpoints.

Moreover, skeleton pruning and correspondence es-

timation increase the computation complexity of the

method. Based on skeleton, Shock Graphs (Siddiqi

et al., 1998) and Medial Scaffolds (Chang and Kimia,

2009) are proposed for shape matching. In this paper,

skeleton is also used for our proposed feature gener-

ation. However, we only use skeleton length as one

feature for global shape description and do not per-

form any pruning or correspondence estimation.

2.3 Matching Algorithms

Although Hausdorff distance (Mmoli, 2007) is one of

the classical shape matching method, we cannot di-

rectly use it since it is a correspondence-basedmethod

and has often been used to locate objects in an im-

age. Shape contexts (Belongie et al., 2002; Bai et al.,

2010) is an improvement to traditional Hausdorff dis-

tance based methods (Mmoli, 2007; Siddiqi et al.,

1998; Del Bimbo and Pala, 1997). The matching of

two shapes is done by matching two context maps of

the shapes, which is a matrix-based matching (Be-

longie et al., 2002). However, considering the trade-

off between accuracy and efﬁciency, involving ma-

trix operations is too expensive for our experiment.

Bimbo (Del Bimbo and Pala, 1997) proposed the use

of elastic matching. This approach is not practical for

on-line image retrieval, mainly because of the compu-

tation and matching complexity. In contrast to previ-

ous methods, in this paper,we design our matching al-

gorithm with low complexity and fully exert merits of

each feature in our feature space by individual weight-

ing and global optimisation. Moreover,by supervised

optimisation strategy on matching algorithm, our pro-

posed shape descriptor can be easily adapted to differ-

ent databases by selecting weights for each features.

3 OBJECT REPRESENTATION

Prior to feature extraction, we adjust the orientation of

each object by rotating it to the point, that the straight

line connecting its two maximally distant contour

points becomes vertical and the majority of contour

points lie on the left side of this line (See Figure 1). If

the number of contour points on both sides of the line

′

are the same, we will adjust the orientation to the

point, that the straight line connecting its two maxi-

Shape-basedObjectRetrievalandClassificationwithSupervisedOptimisation

205

′

Figure 1: Shape bounding box and equally high sub-boxes

= h

) used for feature extraction; A

, A

, and A

are the areas of the top, middle, and bottom sub-objects,

respectively.

mally distant contour points becomes vertical and the

majority of contour points lies on the upper h/2 side.

If the object is star-like or circle-like shape, we will

select one straight line connecting its two maximally

distant contour points and rotate the object so that the

straight line becomes vertical.

An object shape is described by a 10-dimensional

feature vector c

′

. For this, we use the bounding box

of the whole shape as well as the three equally high

sub-boxes shown in Figure 1. Here we subdivide the

bounding box into 3 equally high sub-boxes, this is

based on the trade-off between conﬁguration and ﬁne-

ness of subdivision. If we decompose the bounding

box into more sub-boxes, each shape sub-component

located in sub-boxes are tending to be similar. This

could give rise to miss-corresponding during object

matching process. Based on experiments,3 sub-boxes

selection achieves the best performance in terms of

accuracy and robustness.

The ﬁrst element c

′

and the last element c

′

of the

feature vector express the length of the object contour

and the length of object skeleton s, respectively. The

remaining elements are computed as follows:

′

, c

′

, c

′

, c

′

, c

′

, c

′

= A

+ A

, c

′

= l

. (1)

Subsequently, we perform two feature normalisation

steps. First, in order to ensure scale invariance, we

divide the non-ratio elements of the feature vector by

a half of the bounding box perimeter:

⋆

′

w+ h

= (c

⋆

)

. (2)

Second, we linearly scale the feature values to the

range (0,1]:

c =

⋆

− min{c

⋆

,... ,c

⋆

} + 1

max{c

⋆

,. .. ,c

⋆

} − min{c

⋆

,. .. ,c

⋆

} + 1

. (3)

In order to avoid the situation that c

c = 0 and zero de-

nominator, we add value 1 to both numerator and de-

nominator. The scaling is needed for the Support Vec-

tor Machines applied in the classiﬁcation step. The

main advantage of scaling is to avoid attributes in

greater numeric ranges dominating those in smaller

numeric ranges. Another advantage is to avoid nu-

merical difﬁculties during the calculation. Because

kernel values usually depend on the inner products of

feature vectors (e.g., the linear kernel and the poly-

nomial kernel), large attribute values might cause nu-

merical problems.

4 OBJECT RETRIEVAL AND

CLASSIFICATION

In this section, we propose a similarity function on

our feature vector for object retrieval. For object clas-

siﬁcation, we introduce the way for classiﬁer building

and kernel function selection. The supervised optimi-

sation method will be described in the last part of this

section.

4.1 Object Retrieval

In order to solve the object retrieval problem, we in-

troduce a similarity measure between contours. As-

sume C

⋆

and C

⋄

are two objects represented by our

proposed feature vectors:

⋆

= (c

⋆

,. .. ,c

⋆

,. .. ,c

⋆

)

⋄

= (c

⋄

,. .. ,c

⋄

,. .. ,c

⋄

)

. (4)

Now, we introduce a dissimilarity measure for feature

vectors belonging to different objects C

⋆

and C

⋄

d(C

⋆

⋄

) =

∑

m=1

⋆

− c

⋄

⋆

+ c

⋄

, (5)

where σ

is the weight for each feature achieved in

an optimisation process explained later in this sec-

tion. σ

can be optimised to adapt the proposed fea-

ture vector to different datasets. Moreover, it helps

the proposed feature to avoid the overﬁtting prob-

lem by applying a proper σ

to different features.

Our dissimilarity measure has been inspired by Chi-

Square kernel (Hazewinkel, 2001), which comes from

the Chi-Square distribution. Since our shape descrip-

tor contains a bag of features that are discretely dis-

tributed and Chi-Square kernel can effectively model

the overlap among them. The values of the dissimilar-

ity function (5) belong to the range d(C

⋆

⋄

) ∈ [0, 1]

which enables their easy conversion to similarity val-

ues:

s(C

⋆

⋄

) = 1− d(C

⋆

⋄

) . (6)

4.2 Object Classiﬁcation

In this experiment, we selected an SVM which ex-

tracts a decision boundary between shapes of differ-

ICPRAM2015-InternationalConferenceonPatternRecognitionApplicationsandMethods

206

ent classes based on the margin maximisation prin-

ciple. Due to this principle, the generalisation error

of the SVM is independent of the number of feature

dimensions. Furthermore, a complex (non-linear) de-

cision boundary can be extracted using a non-linear

SVM. In this process, images in a high-dimensional

feature space are mapped into a higher-dimensional

feature space using a kernel trick. In this experi-

ment, we choose Radial Basis Function (RBF) as ker-

nel function for three reasons. Firstly, RBF kernel

non-linearly maps samples into a higher dimensional

space so, unlike the linear kernel, it can handle the

case in which the relation between class labels and

attributes is nonlinear. Secondly, The RBF kernel

has less hyper parameters than the polynomial ker-

nel which reduces the complexity of model selection.

In our case, there are only two parameters (C,γ) that

need to be determined and we optimise them by our

proposed optimisation method. Thirdly, as the num-

ber of instances is much larger than the number of

features, the RBF kernel has fewer numerical difﬁcul-

ties and leads to shorter training time.

In our work, we apply a multi-class Support

Vector Machine (mSVM) using its one-against-one

(1vs1) version which works with a voting strategy. It

uses a two-class SVM for each pair from a set of all

considered classes {ω

,ω

,.. . ,ω

}. Thus, if there

are K classes in total, K(K − 1)/2 two-class classi-

ﬁers have to be used. First, a sample pattern (query

pattern) is classiﬁed using all these two-class SVMs.

The ﬁnal classiﬁcation result is determined by count-

ing to which class the sample pattern has been as-

signed most frequently.

4.3 Supervised Optimisation

The performance of matching and kernel functions

in retrieval and classiﬁcation systems is heavily de-

pendent on the choice of appropriate parameters.

These parameters are mutually dependent and there-

fore need to be optimised simultaneously. In prac-

tice, parameters are selected and optimised manually,

based on the knowledge of experts. Obviously, this is

an exhaustive and tedious process. In this section we

propose the use of an effective, supervised optimisa-

tion strategy to automatically improve the quality of

retrieval and classiﬁcation systems.

Traditional optimisation methods use iterative

strategies, which do not produce satisfactory results

when applied to high dimensional problems. How-

ever, heuristic methods are well suited for such opti-

misation problems where multiple parameters have to

be optimised simultaneously. In this paper we employ

a combination of two heuristic optimisation meth-

ods: Gradient Hill Climbing (Russell and Norvig,

2009) integrated with Simulated Annealing (Kirk-

patrick et al., 1983).

The Gradient Hill Climbing method starts with

randomly selected parameters. Then it changes sin-

gle parameters iteratively to ﬁnd a better set of pa-

rameters. A ﬁtness function then evaluates whether

the new set of parameters performs better or worse.

The Simulated Annealing strategy impacts the degree

of the changes. In later iterations, the changes to the

parameters are getting smaller. This strategy can ef-

ﬁciently reduce the computational complexity of our

optimisation method.

By in- and decreasing all parameters separately

with a speciﬁed magnitude that describes a conver-

gent zero series, the gradient for maximum enhance-

ment is computed. Adding this gradient to the previ-

ous parameters results in the parameters for the next

iteration.

To use this heuristic strategy we have to deﬁne ﬁt-

nessfunctions. A ﬁtnessfunction evaluates the quality

of a result for a set of given parameters. In case of

classiﬁcation systems the results are good when high

classiﬁcation rates are achieved. Hence, the ﬁtness-

function for our classiﬁcation system simply com-

putes the classiﬁcation rate. The case is a bit more

difﬁcult for retrieval systems.

In our work we evaluate a retrieval result in two

ways. One way is to calculate the bull’s eye retrieval

rate (see 5.1). The other way is to analyse the re-

sulting similarity table for all possible query objects,

and count how often the query class appears on ev-

ery position. The result of this computation is then

condensed to a single value s by application of the

following methods:

The function f transforms the position of an ob-

ject o in the similarity ranking of an object j to the in-

dex of object o. The indices are grouped by category,

objects of the same category have following indices.

f : N × N → N (7)

Using f we can evaluate single lines of our rank-

ing tables using the function p(i, j, n) by taking i as

index of query image, j as rank in ascending simi-

larity tables and n as number of images per category.

We count points for every object with the same class

as the query object. To put a higher emphasis on the

ﬁrst positions of the ranking in the resulting score we

give quadratic points for every right object.

p(i, j,n) =

(

⌊i/n⌋ = ⌊ f(i, j)/n⌋

0 ⌊i/n⌋ 6= ⌊ f(i, j)/n⌋

(8)

To calculate the ﬁnal score an addition of all points

in the last 2n lines is needed because the best 2n im-

Shape-basedObjectRetrievalandClassificationwithSupervisedOptimisation

207

Table 1: Retrieval results on MPEG-7 dataset. Results are summarised as the number of shapes from the same class among

the ﬁrst top 1-10 shapes. No Opt shows results from our matching algorithm without optimisation. CS Opt shows results

using our matching algorithm with Graph Transduction optimisation. Our Opt shows results using our matching algorithm

with supervised optimisation.

1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th

No Opt 700 647 600 567 521 488 447 426 405 342

GT Opt (Bai et al., 2010)

640 584 552 501 463 424 398 381 303 145

Our Method 700 657 615 591 553 518 475 467 420 363

ages are taken into consideration for bull’s eye re-

trieval rate. Following is the corresponding formula:

s =

∑

i=1

∑

j=nc−2n

p(i, j,n) (9)

where c denotes the number of classes.

5 EXPERIMENTS AND RESULTS

To evaluate the performance of the proposed method,

we have performed experiments in an object re-

trieval and a classiﬁcation scenario using four differ-

ent datasets. Moreover, we also conducted an experi-

ment with a fused descriptor to evaluate the improve-

ment of existing shape descriptors that are fused with

our proposed method.

5.1 Shape-based Object Retrieval

First, we illustrate our proposed algorithm for object

retrieval on the MPEG-7 dataset, which has 70×20=

1,400 shapes. In order to optimise the parameters in-

volved in our matching algorithm, we randomly se-

lect 10 objects from each category, there are in total

10 × 70 = 700 objects used for supervised optimisa-

tion. After optimising all parameters, we employ the

remaining 700 objects for testing. Table 1 shows the

retrieval results with and without supervised optimi-

sation process.

We employ Retrieval Rate for results comparison.

The retrieval rate is measured by the so-called bulls-

eye score. Every shape in the database is compared

to all other shapes, and the number of shapes from

the same class among the 20 most similar shapes is

reported. The bulls-eye scores from other references

are on all 1400 images, while here we are on 700 im-

ages. Therefore, in order to ensure the correctness of

our result, we iterativelydo the train/test split multiple

times and an average value is reported. The proposed

supervised optimisation with our matching algorithm

achieves 94.0% retrieval rate on MPEG-7 dataset.

Although Michael (Donoser and Bischof, 2013)

has already achieved a 100% bullseys score on

MPEG-7 dataset, the purpose of their method is dif-

ferent from ours. In this paper, we propose a simple

and effective shape descriptor that can be fused with

other shape descriptors. In order to make our descrip-

tor more ﬂexible and equip it with a higher adaptive-

ness, a supervised optimisation strategy is introduced.

We do not involve any manifold structure deﬁned by

pairwise afﬁnity matrices. On the contrary, Graph

Transduction(Bai et al., 2010) and diffusion (Donoser

and Bischof, 2013) can be used after our method to

improve subsequent applications like retrieval.

Our algorithm achieves 94.0% retrieval rate on

MPEG-7 dataset. This suggests that the MPEG-7

might not be sufﬁcient to judge the quality of our pro-

posed method. Therefore, we conducted another set

of evaluation on dataset MEPG-400 which is a subset

of the MPEG-7 collection, consisting of 400 objects

categorised in 20 classes (second row in Figure 2).

These shapes have much larger intra-class variations

and inter-class similarities than the MPEG-7 dataset.

We performed a comparison to the algorithm using

contour segments corresponding. Since this database

is the subset of MPEG-7, in this experiment, we em-

ploy the same parameter values from the ﬁrst exper-

iment, and implement the object retrieval process on

MPEG-400. To evaluate the behaviour of our pro-

posed feature space and matching algorithm, we ap-

ply the experiment in both with and without any opti-

misation. Experiment results show that our proposed

method performs better than contour segment. More-

over, from Table 2, we can clearly observe that our

proposed matching algorithm with supervised optimi-

sation made a signiﬁcant progress on MPEG-400.

In order to proof applicability of our approach to

enhance the performance of existing shape descrip-

tors, we conducted a set of evaluations on Kimia-216

and MPEG-400 dataset. We ﬁrst perform the experi-

ment only with Shape Context and then the fused de-

scriptor between Shape Context (fused weight: 0.7)

and proposed feature vector (fused weight: 0.3).

The fused weights are also learned from our super-

vised optimisation method. Table 4 shows that the

fused descriptor achieves signiﬁcant progress on both

databases.

ICPRAM2015-InternationalConferenceonPatternRecognitionApplicationsandMethods

208

Figure 2: Example shapes from the experimental datasets: MPEG-7 (ﬁrst row), MPEG-400 (second row), Kimia-216 (third

row) and EM-200 (fourth row with original, ﬁfth row with manually- and sixth row with semi-automatically segmented).

Table 2: Object retrieval on MPEG400 dataset. Our Method (No-Opt) represents the retrieval results using our matching

algorithm with default parameters; Our Method (Opt) shows the retrieval results using our matching algorithm with supervised

optimised parameters.

1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th

Contour Segments (Yang et al., 2014b) 375 348 333 325 317 311 300 295 276 275

Our Method (No-Opt)

381 355 341 320 322 316 304 295 269 260

Our Method (Opt) 381 370 365 354 337 342 328 315 300 301

5.2 Shape-based Object Classiﬁcation

In this part, we implement the experiment of classiﬁ-

cation on MPEG-7. As already discussed in Section

4, here we use Radial Basis Function (RBF) as the

kernel function for support vector machine. Like in

the previousexperiment, we randomly split the shapes

into half for training and half for testing. During the

training phase, we are not only able to build clas-

siﬁers with training data, but also seeking for best

parameters with optimisation strategy. All of these

tasks are done by 700 training images. The train-

ing set includes all the object classes. Another 700

testing objects are only used for testing. In this ex-

periment, there are two parameters that need to be

optimised, C for cost of SVM and γ in RBF kernel

function. The default values are C = 1 and γ = 1/n,

where n is the number of features. In our case, the

default values are n = 10 and γ = 0.1. As shown in

Table 3, in order to increase statistical relevance, we

repeated the selection process 10 times which led to

10 different training datasets and corresponding test-

ing datasets. Each column refers to an experiment

with different training datasets. Experiments are per-

formed for all these datasets and mean classiﬁcation

rates are reported. SSDP shows the results of scaled

datasets with default parameters. Results shows that

our method achieves signiﬁcant progress in this ex-

periment. Actually, SVM and RBF kernel are highly

related to parameters selection and our method can

improve its performance sufﬁciently.

We also compare our optimised classiﬁcation rate

to existing methods with the MPEG-7 dataset. As

shown in Table 5, our method yields a promising

score compared to existing methods. The Skeleton

Path achieves similar results as ours. However, the

computation complexity of our feature vector is much

lower than that of the skeleton path which needs to in-

volve skeleton pruning for feature generation. More-

over, in (Bai et al., 2009), fused contour segment and

skeleton path achieved 96.6% classiﬁcation rate, but

it is meaningless to compare it to our result since our

descriptor is isolated.

5.3 EM Classiﬁcation

To validate the idea of our proposed method for ex-

isting application, we proved the applicability of our

new method to a real-world problem, namely the au-

tomatic classiﬁcation of Environmental Microorgan-

isms (EMs). EMs and their species are very impor-

tant indicators to evaluate environmental quality, but

their manual classiﬁcation is very time-consuming (Li

et al., 2013). Thus, automatic analysis techniques

Shape-basedObjectRetrievalandClassificationwithSupervisedOptimisation

209

Table 3: Object classiﬁcation for the whole MPEG-7 dataset. SSDP represents the results of scaled datasets with default

parameters. Our Method achieves signiﬁcant progress in this experiment.

1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th Average

SSDP(%) 47.6 45.6 50.7 47.6 46 44.4 46.4 46.0 45.9 44.9 46.5

Our Method(%)

85.7 86.0 85.9 85.7 84.1 86.6 85.1 87.7 87.9 88.0 86.3

Table 4: Experimental comparison of Shape Context descriptor to its fused descriptor with our method using the Kimia-216

and the MPEG-400 datasets as well as the proof of applicability of our approach to enhance the performance of existing shape

descriptors. Results are summarised as the number of shapes from the same class among the ﬁrst top 1-10 shapes.

Retrieval Results for Kimia-216 1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th

Shape Context (Belongie et al., 2002) 216 212 201 187 186 175 171 164 162 146

Fused Method

216 214 207 204 201 204 191 188 192 185

Retrieval Results for MPEG-400 1st 2nd 3rd 4th 5th 6th 7th 8th 9th 10th

Shape Context (Belongie et al., 2002) 400 370 343 310 302 277 272 265 264 239

Fused Method

400 389 374 368 368 358 347 344 339 346

for microscopic images of EMs would be very ap-

preciated by environmental scientists. We have tested

our methodology for the application using the EM-

200 dataset. Since some EM-200 objects can hardly

be skeletonised (e.g., the ﬁrst two objects in the

third row of Figure 2), Chen (Li et al., 2013) and

Cong (Yang et al., 2014a) proposed some methods us-

ing the whole microorganism contours for object de-

scription. Here we employ the same feature vector

proposed by Cong (Yang et al., 2014a), but classiﬁers

will be built with our optimised C and γ. The impres-

sive results for the EM-200 dataset (see Table 6) con-

ﬁrm the power of our supervised optimisation method

to real-world applications.

5.4 Feature Analysis

In this part, we will employ Discriminate Analysis

(DA) to evaluate the discriminating properties of the

feature space. Especially, we applied Fisher Lin-

ear Discriminant Analysis (Viola and Jones, 2004)

on MPEG-7 dataset and achieved the following re-

sults: λ(c

′

) = 0.250, λ(c

′

) = 0.147, λ(c

′

) = 0.237,

λ(c

′

) = 0.173, λ(c

′

) = 0.232, λ(c

′

) = 0.280, λ(c

′

) =

0.360, λ(c

′

) = 0.282, λ(c

′

) = 0.102, λ(c

′

) = 0.413

for the different dimensions of the feature space, re-

spectively. The value λ(c

′

) expresses the overall dis-

crimination power for the feature space dimension c

′

and is calculated as the ratio of the determinant of

intra-class covariance matrix to the determinant of

the total covariance matrix. The value of λ(c

′

) be-

longs to the range [0,1], whereas 0 corresponds to

perfect discriminativeproperties and 1 denotes no dis-

crimination. According to the analysis of the feature

Table 5: Experimental comparison of our methodology to

the related algorithm for classiﬁcation on MPEG-7.

Method Score

Skeleton Path (Bai et al., 2009) 86.7%

CS (Sun and Super, 2005)

75.4%

IDSC (Ling and Jacobs, 2007) 76.5%

Our Method

86.3%

Table 6: Experimental comparison of our method to related

algorithms for classiﬁcation of Microorganisms as the proof

of applicability of our optimisation approach to real world

problems using the EM-200 dataset. (MS: Manually Seg-

mented, SAS: Semi-Automatically Segmented).

Method MS SAS

SFSVM (Li et al., 2013) 89.7% 66.0%

NSFSVM (Yang et al., 2014a) 92.5% 79.5%

Our Method

95.0% 83.5%

space described above, the last feature c

′

possesses

the strongest discriminative power.

6 CONCLUSION

In this paper, we propose a simple and effective shape

descriptor that can be easily fused with other descrip-

tors. In order to make our descriptor more ﬂexi-

ble and equip it with a higher adaptiveness, a super-

vised optimisation strategy is introduced. The pro-

posed method can easily adapt to a concrete appli-

ICPRAM2015-InternationalConferenceonPatternRecognitionApplicationsandMethods

210

cation domain by optimising parameters assigned to

different dimensions of the feature space and kernel

function. Its promising performance has been proven

in a meaningful experimental set-up. In the future, we

will investigate possibilities of fusing more meaning-

ful shape features into our feature space.

ACKNOWLEDGEMENTS

Research activities leading to this work have been

supported by the China Scholarship Council (CSC)

and the German Research Foundation within the Re-

search Training Group 1564 (GRK 1564). We greatly

thank M.Sc. Chen Li from University of Siegen and

Prof. Dr. Beihai Zhou and M.Sc. Fangshu Ma from

the University of Science and Technology Beijing for

providing us with Environmental Microorganism im-

age dataset for experiments.

REFERENCES

Bai, X. and Latecki, L. (2008). Path similarity skeleton

graph matching. PAMI, 30(7):1282–1292.

Bai, X., Latecki, L., and yu Liu, W. (2007). Skeleton prun-

ing by contour partitioning with discrete curve evolu-

tion. PAMI, 29(3):449–462.

Bai, X., Liu, W., and Tu, Z. (2009). Integrating contour and

skeleton for shape classiﬁcation. In ICCV Workshops,

pages 360–367.

Bai, X., Yang, X., Latecki, L., Liu, W., and Tu, Z. (2010).

Learning context-sensitive shape similarity by graph

transduction. PAMI, 32(5):861–874.

Baseski, E., Erdem, A., and Tari, S. (2009). Dissimilar-

ity between two skeletal trees in a context. Pattern

Recognition, 42(3):370–385.

Belongie, S., Malik, J., and Puzicha, J. (2002). Shape

matching and object recognition using shape contexts.

PAMI, 24(4):509–522.

Cao, Y., Zhang, Z., Czogiel, I., Dryden, I., and Wang,

S. (2011). 2d nonrigid partial shape matching us-

ing mcmc and contour subdivision. In CVPR, pages

2345–2352.

Chang, M.-C. and Kimia, B. (2009). Measuring 3d shape

similarity by matching the medial scaffolds. In ICCV,

pages 1473–1480.

Del Bimbo, A. and Pala, P. (1997). Visual image retrieval by

elastic matching of user sketches. PAMI, 19(2):121–

132.

Donoser, M. and Bischof, H. (2013). Diffusion processes

for retrieval revisited. In CVPR, pages 1320–1327.

Goh, W.-B. (2008). Strategies for shape matching using

skeletons. CVIU, 110(3):326–345.

Hazewinkel, M. (2001). Chi-squared Distribution. Ency-

clopedia of Mathematics, Springer.

Hedrich, J., Yang, C., Feinen, C., Schaefer, S., Paulus, D.,

and Grzegorzek, M. (2013). Extended investigations

on skeleton graph matching for object recognition. In

ICCRS, pages 371–381. Springer LNCS.

Kirkpatrick, S., Gelatt, C. D., and Vecchi, M. P. (1983).

Optimization by simulated annealing. Science, pages

671–680.

Li, C., Shirahama, K., Grzegorzek, M., Ma, F., and Zhou,

B. (2013). Classiﬁcation of environmental microor-

ganisms in microscopic images using shape features

and support vector machines. In ICIP, pages 2435–

2439. IEEE Computer Society.

Ling, H. and Jacobs, D. (2007). Shape classiﬁcation using

the inner-distance. PAMI, 29(2):286–299.

Mmoli, F. (2007). On the use of gromov-hausdorff distances

for shape comparison. In SPBG, pages 81–90.

Nguyen, D. T., Ogunbona, P. O., and Li, W. (2013). A

novel shape-based non-redundant local binary pattern

descriptor for object detection. Pattern Recognition,

46(5):1485–1500.

Russell, S. and Norvig, P. (2009). Artiﬁcial Intelligence: A

Modern Approach. Prentice Hall Press, 3rd edition.

Sebastian, T. and Kimia, B. (2001). Curves vs skeletons in

object recognition. In ICIP, volume 3, pages 22–25.

Shotton, J., Blake, A., and Cipolla, R. (2005). Contour-

based learning for object detection. In ICCV, vol-

ume 1, pages 503–510.

Siddiqi, K., Shokoufandeh, A., Dickenson, S., and Zucker,

S. (1998). Shock graphs and shape matching. In

ICCV, pages 222–229.

Sun, K. and Super, B. (2005). Classiﬁcation of contour

shapes using class segment sets. In CVPR 2005, vol-

ume 2, pages 727–733.

Viola, P. and Jones, M. J. (2004). Robust real-time face

detection. Int. J. Comput. Vision, 57(2):137–154.

Yang, C., Li, C., Tiebe, O., Shirahama, K., and Grzegorzek,

M. (2014a). Shape-based classiﬁcation of environ-

mental microorganisms. In ICPR, pages 3374–3379.

Yang, C., Tiebe, O., Pietsch, P., Feinen, C., Kelter, U., and

Grzegorzek, M. (2014b). Shape-based object retrieval

by contour segment matching. In ICIP, pages 2202–

2206.

Yang, X., Liu, H., and Latecki, L. J. (2012). Contour-based

object detection as dominant set computation. Pattern

Recognition, 45(5):1927–1936.

Zhang, D. and Lu, G. (2004). Review of shape representa-

tion and description techniques. Pattern Recognition,

37:1–19.

Shape-basedObjectRetrievalandClassificationwithSupervisedOptimisation

211