A Step Towards Learning Contraction Kernels for Irregular Image Pyramid

Darshan Batavia (1, a), Rocio Gonzalez-Diaz (2, b) and Walter G. Kropatsch (1, c)

(1) TU Wien, Pattern Recognition and Image Processing Group 193/03, Vienna, Austria

(2) University of Seville, Department of Applied Math I, Seville, Spain

Keywords:

Cost of Contraction Kernels, Dictionary for Contraction Kernel, Irregular Image Pyramid, Slope Region.

Abstract:

A structure preserving irregular image pyramid can be computed by applying basic graph operations (contraction and removal of edges) on the 4-adjacent neighbourhood graph of an image. In this paper, we derive an objective function that classifies the edges as contractible or removable for building an irregular graph pyramid. The objective function is based on the cost of the edges in the contraction kernel (the sub-graph selected for contraction) together with the size of the contraction kernel. Based on the objective function, we also provide an algorithm that decomposes a 2D image into monotonically connected regions of the image surface, called slope regions. We prove that the proposed algorithm results in a graph-based irregular image pyramid that preserves the structure and the topology of the critical points (the local maxima, the local minima, and the saddles). We then introduce the concept of the dictionary for the connected components of the contraction kernel, consisting of sub-graphs that can be combined to form a set of contraction kernels. From this set, a favorable contraction kernel can be selected that best satisfies the objective function. Lastly, we show the experimental verification of the claims related to the objective function and the cost of the contraction kernel. The outcome of this paper can be envisioned as a step towards learning the contraction kernel for the construction of an irregular image pyramid.

1 INTRODUCTION

Graph-based representations are a primary object of study in many pattern recognition and computer vision application areas. Graphs are capable of representing both structured and unstructured data associated with applications ranging from processing 2D images to social network analysis and climate data analysis. Among the several approaches to graph-based processing, this paper focuses on the construction of an irregular image pyramid computed from a 2D image, such that the topology and the structure of the image are preserved in a concise manner at the higher levels of the pyramid. The proposed objective function is used to predict an optimal contraction kernel that governs the construction of an irregular image pyramid. In the broader context, the proposed algorithm decomposes a 2D Euclidean space into cells that are represented by critical points (the local maximum, the local minimum, and the saddle) and the connections between them. As mentioned in (Helman and Hesselink, 1991), a compact representation of a 2D image can be achieved by using a surface topology-based data structure.

a https://orcid.org/0000-0003-0062-3411

b https://orcid.org/0000-0001-9937-0033

c https://orcid.org/0000-0003-4915-4118

There is an extensive literature on similar approaches for the decomposition of 2D spaces, especially for the decomposition of the Morse-Smale complex (MS-complex). As mentioned in (Stein et al., 1963), the Morse-Smale complex is a collection of the Morse cells that follow a smooth function on a manifold, $h : M \to \mathbb{R}$, such that all the critical points are non-degenerated. Nackman Lee in (Lee, 1984) represents a surface in the form of graphs of critical points, subdividing the surface into slope districts. Other approaches include the Reeb graph (Shinagawa et al., 1991) and the hierarchical decomposition of the MS-complex into piece-wise linear 2-manifolds (Edelsbrunner et al., 2003). Many efficient algorithms can be found in the literature to compute consistent MS-complexes. For example, in (Gyulassy et al., 2014), the authors presented an efficient algorithm to compute consistent MS-complexes using a divide-and-conquer strategy for dealing with large data. In (Comic et al., 2010), the authors computed maximal cells of the ascending and the descending Morse complexes through a watershed approach.

Batavia, D., Gonzalez-Diaz, R. and Kropatsch, W.

A Step Towards Learning Contraction Kernels for Irregular Image Pyramid.

DOI: 10.5220/0010840900003122

In Proceedings of the 11th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2022), pages 60-70

ISBN: 978-989-758-549-4; ISSN: 2184-4313

In reality, 2D images may contain degenerated

critical points and thus they cannot be treated as MS-

complexes before any pre-processing. Therefore, in

this paper, we focus on a more generic framework

that allows the presence of degenerated critical points.

In (Gonzalez-Diaz et al., 2021), the authors explained

a hierarchical approach for the decomposition of any

surface into cells called slope regions and proved the

necessary and sufﬁcient conditions for the existence

of the slope regions. The authors in (Batavia et al.,

2019) described the partitioning of an image into

slope regions. Utilizing these theories, in (Batavia

et al., 2020), the authors used the combinatorial pyra-

mid framework to build irregular image pyramids and

displayed results for over-segmented images and im-

age reconstruction from the top level of the graph

pyramid while preserving the texture information. A similar framework was used by Morales-González and García-Reyes in (Morales-González and García-Reyes, 2013) for object recognition and object matching.

The main goal of this paper is to express the algorithm mentioned in (Batavia et al., 2020) for the irregular image pyramid in the form of an objective function and to introduce the dictionary for the connected components of the contraction kernel. The paper is organized as follows: Section 2 introduces and defines the basic terminology and the motivation behind this work. In Section 3, we derive the objective function and explain the algorithm for building the irregular image pyramid. In Section 4, we present a novel concept of a dictionary for the connected components of the contraction kernel, consisting of a generic set of sub-graphs that can be used in combination to construct a favorable contraction kernel. Experimental verification and statistical evidence are shown in Section 5 for the claims made in Section 3. Finally, we conclude the paper and outline the future work.

2 DEFINITIONS AND MOTIVATION

A 2D image P can be visually perceived as a sampled version of a geographical terrain model in 2.5D with a continuous height (intensity) map $h : \mathbb{R}^2 \to [0, 1]$. A pixel p ∈ P is a discrete sample obtained by sampling the 2.5D continuous surface. The digital image P can be effectively represented by a planar 4-connected neighbourhood graph $G_0 = (V_0, E_0)$, where every pixel p in the image P corresponds to a vertex $v \in V_0$ with gray value $g(v) := h(p) \in [0, 1]$.

With the help of basic graph operations, namely edge contraction and edge removal, we can reduce the graph size. Stacking up the reduced graphs results in an irregular graph pyramid. The k-th level of an irregular graph pyramid is denoted by $G_k = (V_k, E_k)$, where $V_k$ and $E_k$ are the sets of vertices and edges, respectively, and $k \in \{0, 1, \ldots, n\}$ indexes the level of the graph pyramid. $|E_k|$ and $|V_k|$ denote the cardinality of the edge set $E_k$ and of the vertex set $V_k$, respectively.

For each edge $e = (v, w) \in E_k$, where $v, w \in V_k$, there is an attribute contrast $c(e) := |g(v) - g(w)|$. The edges are oriented from vertex v to vertex w if g(v) > g(w); if g(v) = g(w) they are non-oriented. A connected sub-graph of $G_k$ having the same gray value for all the vertices is referred to as a plateau region (connected sub-graph of degenerated vertices). A path $\pi(v_1, v_2, \ldots, v_r) = (V_\pi, E_\pi)$ is defined as a nonempty sub-graph of $G_k = (V_k, E_k)$, where $V_\pi = \{v_1, v_2, \ldots, v_r\} \subseteq V_k$ and $E_\pi = \{(v_1, v_2), (v_2, v_3), \ldots, (v_{r-1}, v_r)\} \subseteq E_k$. The path π is monotonic if all the oriented edges of $E_\pi$ have the same orientation, i.e. from $v_i$ to $v_{i+1}$ or from $v_{i+1}$ to $v_i$. The path π is a level curve if $g(v_i) = g(v_{i+1})$, $\forall i \in \{1, 2, \ldots, r - 1\}$. A level curve can be a part of a monotonic path.
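The definitions above can be illustrated with a minimal Python sketch. It assumes the image is given as a list of rows with gray values in [0, 1]; the function names `grid_graph`, `contrast` and `orient` are hypothetical helpers introduced here for illustration only and are not part of the original implementation.

```python
def grid_graph(image):
    """Edges of the 4-connected neighbourhood graph of a 2D image.

    Vertices are (row, col) tuples; every edge is a pair of adjacent vertices.
    """
    rows, cols = len(image), len(image[0])
    edges = []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # horizontal neighbour
                edges.append(((r, c), (r, c + 1)))
            if r + 1 < rows:  # vertical neighbour
                edges.append(((r, c), (r + 1, c)))
    return edges


def contrast(image, edge):
    """Contrast c(e) = |g(v) - g(w)| of an edge e = (v, w)."""
    (r1, c1), (r2, c2) = edge
    return abs(image[r1][c1] - image[r2][c2])


def orient(image, edge):
    """Orient the edge from the higher to the lower gray value.

    Returns None for a non-oriented edge, i.e. when g(v) == g(w).
    """
    v, w = edge
    g_v, g_w = image[v[0]][v[1]], image[w[0]][w[1]]
    if g_v == g_w:
        return None
    return (v, w) if g_v > g_w else (w, v)


image = [[0.0, 0.5, 0.5],
         [0.25, 1.0, 0.5]]
edges = grid_graph(image)
```

For an image with R rows and C columns the sketch produces R(C − 1) + C(R − 1) edges.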

We utilize the orientation of the incident edges to categorize the vertices into critical vertices (local maxima, local minima, and saddles) and non-critical (slope) vertices. A vertex $v \in V_k$ is a local minimum if all the edges incident to v are oriented inward. A vertex $v \in V_k$ is a local maximum if all the edges incident to v are oriented outward. A vertex $v \in V_k$ is a slope vertex (non-critical vertex) if there are exactly two changes in the orientation of the edges incident to v when traversed circularly (in clockwise or anti-clockwise direction). A vertex $v \in V_k$ is a saddle if it is neither a local maximum, nor a local minimum, nor a slope vertex.
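The categorisation can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function name `classify_vertex` is hypothetical, the neighbours are assumed to be listed in circular order, and a vertex with at least one non-oriented incident edge is simply reported as "degenerated" (an assumption made here to keep the example short).

```python
def classify_vertex(g_v, neighbours):
    """Classify a vertex from the gray values of its neighbours.

    g_v        -- gray value of the vertex
    neighbours -- gray values of the adjacent vertices in circular order
    """
    if any(g_n == g_v for g_n in neighbours):
        # Assumption: vertices with non-oriented edges are handled separately.
        return "degenerated"
    # True where the incident edge is oriented outward (away from the vertex).
    outward = [g_v > g_n for g_n in neighbours]
    if all(outward):
        return "maximum"
    if not any(outward):
        return "minimum"
    n = len(outward)
    # Count orientation changes along one full circular traversal.
    changes = sum(outward[i] != outward[(i + 1) % n] for i in range(n))
    return "slope" if changes == 2 else "saddle"
```

With four neighbours, the pattern outward, outward, inward, inward yields two changes (a slope vertex), whereas an alternating pattern yields four changes (a saddle).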

A 2D image is called well-composed (Latecki

et al., 1995) if it does not contain the following non-

well-composed conﬁguration of pixels as shown in the

left image of Fig.1. The non-well-composed conﬁgu-

ration follows: g(a) < g(b), g(a) < g(d), g(c) < g(b)

and g(c) < g(d) visualized with the help of oriented

edges in Fig.1.
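A hedged sketch of how such configurations could be detected is given below. It assumes that a and c lie on one diagonal of a 2 × 2 pixel block and b and d on the other; the concrete placement of the labels and the function name `non_well_composed_blocks` are assumptions of this example. The mirrored case, in which b and d are both lower than a and c, is treated as the same configuration up to relabelling.

```python
def non_well_composed_blocks(image):
    """Return the top-left (row, col) of every non-well-composed 2x2 block."""
    found = []
    for r in range(len(image) - 1):
        for col in range(len(image[0]) - 1):
            # Assumed labelling: a, c on one diagonal and b, d on the other.
            a, b = image[r][col], image[r][col + 1]
            d, c = image[r + 1][col], image[r + 1][col + 1]
            # g(a) < g(b), g(a) < g(d), g(c) < g(b), g(c) < g(d)
            # is equivalent to max(a, c) < min(b, d); also test the mirror.
            if max(a, c) < min(b, d) or max(b, d) < min(a, c):
                found.append((r, col))
    return found
```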

In order to remove all the non-well-composed configurations, we insert the hidden saddle as a vertex r adjacent to vertices a, b, c and d such that max(g(a), g(c)) < g(r) < min(g(b), g(d)), as shown in the right picture of Fig. 1.

Figure 1: Inserting a hidden saddle.

A face in a surface embedded plane graph $G_k$ is a slope region S if all the pairs of points in the face can be connected by a continuous monotonic curve inside the face. In (Batavia et al., 2019), the authors showed that a surface sampled by a 2D image can be partitioned into slope regions while preserving the structural and topological properties of the surface. The basic properties and advantages of slope regions are as follows:

1. Slope regions have an easy graph-based representation comprising one local maximum, one local minimum, and the saddle vertices along the slope's boundary, connected by monotonic paths or level curves only.

2. The holes that are geometrically inside the slope region can be modeled with the help of folded boundaries, such that the holes are topologically excluded and the slope region remains homeomorphic to a disc. Please refer to (Batavia et al., 2019) for further details.

3. There is no unique solution for partitioning an

image into a combination of slope regions. This

provides ﬂexibility and scope for optimization to

achieve better application-speciﬁc results.

4. Unlike most machine learning approaches, the

slope regions can be computed without the com-

putation of convolutions.

Motivation: The authors in (Batavia et al., 2020)

implemented the TIIP (Topology preserving Irregu-

lar Image Pyramid) algorithm for the construction of

an irregular image pyramid that partitions the image

into slope regions while preserving the structure of

an image. The main contribution of the TIIP algo-

rithm is the selection criteria for the contraction kernel

that dominates the construction of an irregular pyra-

mid. The TIIP algorithm follows a well-deﬁned set of

rules that can be fulﬁlled by several implementation

approaches. Therefore the modiﬁcations related to

the application-speciﬁc optimizations require a deep

understanding of the implementation. To avoid these

drawbacks, in this paper, we derive an objective func-

tion to replicate the TIIP algorithm and to optimize

the selection of contractible edges that satisfy the ba-

sic rules of selection of contraction kernel mentioned

in (Batavia et al., 2020). Furthermore, we design a

dictionary of sub-graphs whose elements can be com-

bined to form a favourable contraction kernel. Alter-

natively, a favorable kernel can be selected from a set

of contraction kernels by computing the cost of the

kernels using the cost curve deﬁned later in this pa-

per. The outcome of this paper can be envisioned as

a step towards learning the contraction kernel for an

irregular graph pyramid.

3 DERIVING THE OBJECTIVE FUNCTION

As mentioned previously, the construction of an irregular pyramid is controlled by a contraction kernel K. A contraction kernel $K = \{e_i \in E_k : i \in \{1, 2, \ldots, n\}\}$ is a set of edges selected for contraction. A contraction kernel K is accompanied by a set of surviving vertices $S = \{v_i \in V_k : i \in \{1, 2, \ldots, n\}\}$, where vertex $v_i$ corresponds to edge $e_i \in K$ for $i \in \{1, 2, \ldots, n\}$. The set of surviving vertices removes any ambiguity between the surviving and the non-surviving vertex during the contraction of an edge. Every contraction of an edge removes one edge and one vertex. The contraction kernel K is composed of several connected components denoted by $C_i$, $i \in \{1, 2, \ldots, c\}$. Note that each connected component $C_i$ has a single surviving vertex. The cardinality $|C_i|$ of a connected component $C_i$ is defined as the number of edges in the connected component. Then, the cardinality of the contraction kernel K is

$$|K| := n = \sum_{i=1}^{c} |C_i|$$

In this section, we aim to derive the objective

function that replicates the rules for selecting the con-

traction kernels mentioned in (Batavia et al., 2020;

Gonzalez-Diaz et al., 2021). Following are the ba-

sic rules mentioned in (Batavia et al., 2020) for deter-

mining a contraction kernel K to build a topology-

preserving irregular image pyramid partitioning an

image into slope regions:

1. The edges with lower contrast will be given prior-

ity over the edges with higher contrast.

2. To allow parallel processing, the connected com-

ponents of the contraction kernel should be inde-

pendent of each other, considering the data struc-

ture in effect.

3. The critical vertices should always survive and the

edges connecting two critical vertices should be

excluded from the contraction.

Further rules can be added depending on the application and the requirements on the output. Since these rules are not directly interpretable by machines, they are manually programmed, which restricts further modification and optimization. The following objective function tries to capture the above-mentioned rules, making them understandable to machines.

The objective function stated in this paper operates on the cost ξ(e) associated with an edge e. Hence, before turning to the objective function, we define the cost $\xi : E_k \to [0, 1]$ associated with an edge $e \in E_k$ as follows:

$$\xi(e) = 1 - \exp\bigl(m \cdot \ln(1 - c(e))\bigr) \tag{1}$$

where $c(e) \in [0, 1]$ corresponds to the contrast of an edge $e \in E_k$ and the multiplier m satisfies m > 0. The multiplier m controls the skewness of the curve ξ(e) vs. c(e), as shown in Fig. 2 for different values of m. The cost ξ(e) associated with an edge e increases monotonically with the contrast c(e). The cost ξ(e) is bounded in [0, 1] for the different values of c(e) as follows:

$$\begin{aligned} \xi(e) &= 0 && \text{for } c(e) = 0 \\ \xi(e) &= 1 && \text{for } c(e) = 1 \\ \xi(e) &\in (0, 1) && \text{for } 0 < c(e) < 1 \end{aligned} \tag{2}$$

The value of the multiplier m adds a non-linear

factor to the cost of the edges. As displayed in Fig.2,

the cost of edges with contrast ranging between (0, 1)

is signiﬁcantly lower for m = 0.25 as compared to the

cost with multiplier m = 1. The maximum and the

minimum value of the cost ξ(e) remains unaffected

by different values of the multiplier m.
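The behaviour of the cost in Eq. 1 can be checked numerically with a small sketch. The function name `edge_cost` is hypothetical. The sketch uses the algebraically equivalent form 1 − (1 − c(e))^m, which avoids evaluating ln(0) at c(e) = 1; this rewriting is a choice of this example, not something stated in the paper.

```python
def edge_cost(c, m):
    """Cost xi(e) of an edge with contrast c in [0, 1] and multiplier m > 0.

    Equivalent to 1 - exp(m * ln(1 - c)) for c < 1, and equal to 1 at c = 1.
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("contrast must lie in [0, 1]")
    if m <= 0:
        raise ValueError("multiplier m must be positive")
    return 1.0 - (1.0 - c) ** m


# For m = 1 the cost equals the contrast; a smaller m lowers the cost
# of every contrast strictly between 0 and 1.
costs_at_half = {m: edge_cost(0.5, m) for m in (0.25, 1.0, 3.0)}
```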

Figure 2: ξ(e) vs. c(e) = [0, 1] for different values of m.

The cost curve $\beta(K) \mapsto \mathbb{R}$ provides the cost associated with a contraction kernel $K = \{e_j \in E_k : j \in \{1, 2, \ldots, |K|\}\}$ for the k-th level of the graph pyramid:

$$\beta(K) = \sum_{j=1}^{|K|} \xi(e_j) + \lambda \bigl(|E_k| - |K|\bigr) \tag{3}$$

The first term in Equation (3) computes the sum of the costs associated with all the edges $e_j \in K$. From Equation (2), the first term $\sum_{j=1}^{|K|} \xi(e_j)$ is bounded in $[0, |K|]$. The second term of Equation (3) assists in controlling the kernel size. Its further significance is made evident in the explanation of the following objective function.
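As a minimal sketch of Eq. 3, the cost of a single kernel can be computed from the costs of its edges. The function name `kernel_cost` and its argument names are hypothetical and chosen only for this illustration.

```python
def kernel_cost(kernel_costs, num_edges, lam):
    """beta(K) = sum of xi(e_j) over K + lam * (|E_k| - |K|).

    kernel_costs -- costs xi(e_j) of the edges in the kernel K
    num_edges    -- |E_k|, number of edges at the current pyramid level
    lam          -- Lagrange multiplier lambda
    """
    if len(kernel_costs) > num_edges:
        raise ValueError("a kernel cannot contain more edges than the graph")
    return sum(kernel_costs) + lam * (num_edges - len(kernel_costs))
```

For example, a kernel of two edges with costs 0.1 and 0.2 in a graph of 10 edges with λ = 0.5 gives 0.3 + 0.5 · 8 = 4.3.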

Given the set of all possible contraction kernels $S = \{K_1, K_2, \ldots\}$, the contraction kernel $\hat{K} \in S$ that selects the edges with lower contrast and a favorable kernel size is given by:

$$\hat{K} = \arg\min_{K_i \in S} \beta(K_i) \tag{4}$$

Equation (4) has two parameters: the multiplier m and the Lagrange multiplier λ, as shown in Equations (1) and (3). From Fig. 2, we can observe that the edges with a lower contrast have a smaller cost than the edges with a higher contrast. Hence, by minimizing the objective function in Equation (4), we give priority to the edges with the lower contrast, satisfying the first rule mentioned in (Batavia et al., 2020). The Lagrange multiplier λ penalizes small contraction kernels: the term $\lambda(|E_k| - |K|)$ shrinks as |K| grows, so the size $|\hat{K}|$ of the selected kernel increases with λ. Ideally, a larger contraction kernel is desired as it lowers the height of the pyramid. In the absence of the second term in Equation (3), the objective function in Equation (4) might collapse to a trivial solution with $|\hat{K}| = 1$, consisting of a single edge $e \in E_k$ with the lowest contrast. In the worst case, for a graph $G_0 = (V_0, E_0)$ at the base level where c(e) = 0 for all edges $e \in E_0$, this results in a pyramid with height $|V_0|$ and linear complexity for the construction of the pyramid (ignoring the pyramid levels required for the removal of the redundant edges), as in (Cerman et al., 2016). Conversely, a larger value of λ results in a larger contraction kernel, which eventually reduces the height of the pyramid (ignoring the pyramid levels required for the removal of the redundant edges).

Proposition 1. If the edges are selected in the ascending order of ξ(e), $\forall e \in E_k$, to construct the contraction kernel K for $0 < |K| \le |E_k|$, then the curve |K| vs. β(K) will not contain any local maximum.

Proof. The first term $\sum_{j=1}^{|K|} \xi(e_j)$ of Eq. 3 is a summation over the edges in ascending order of ξ(e), and ξ(e) is bounded in [0, 1] as stated in Eq. 2. Because the edges are added in ascending order of cost, the increments of the first term are non-decreasing, so the first term of Eq. 3 results in a convex curve with no local maximum. The second term $\lambda(|E_k| - |K|)$ of Eq. 3 decreases linearly with the size of the contraction kernel for $0 < |K| \le |E_k|$. The sum of a convex curve and a linear curve is convex, hence the sum of both terms does not form a local maximum. □

As a consequence of Prop. 1, Eq. 4 is eligible for convex optimization.
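Under the assumption of Prop. 1 (edges taken in ascending order of cost), adding the n-th cheapest edge changes the cost by ξ(e_n) − λ, which follows directly from Eq. 3. The following sketch computes the resulting curve and its minimum; the names `cost_curve` and `best_kernel_size` are hypothetical, and the input costs are made-up values for illustration only.

```python
def cost_curve(edge_costs, lam):
    """beta(K) for kernels made of the n cheapest edges, n = 1 .. |E_k|."""
    ordered = sorted(edge_costs)
    total = len(ordered)
    curve, running = [], 0.0
    for n, xi in enumerate(ordered, start=1):
        running += xi  # first term of Eq. 3
        curve.append(running + lam * (total - n))  # plus the size term
    return curve


def best_kernel_size(edge_costs, lam):
    """Kernel size |K| at which the curve attains its (first) minimum."""
    curve = cost_curve(edge_costs, lam)
    return min(range(len(curve)), key=curve.__getitem__) + 1


example_costs = [0.0, 0.1, 0.2, 0.6, 0.9]  # illustrative values only
example_curve = cost_curve(example_costs, lam=0.5)
```

In this toy example the curve decreases while ξ(e_n) < λ and increases afterwards, so the minimum is reached after the three cheapest edges.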

Algorithm 1 builds an irregular image pyramid

that represents the structure of an image with a graph

of critical vertices on its top level.

Algorithm 1: Objective function based selection of the contraction kernel.

 1: Input: A 2D image P.
 2: Initialize: Generate the 4-connected neighbourhood graph $G_0$.
 3: Insertion of hidden saddle vertices.
 4: LBP categorisation of the vertices into the critical, non-critical and degenerated vertices.
 5: while #(degenerated vertices) > 0 do
 6:     Search for the contraction kernel $\hat{K}$ that optimizes Equation 4.
 7:     Set the respective critical vertices as the surviving vertices and eliminate the edges connecting two critical vertices from $\hat{K}$.
 8:     Contraction of edges $e \in \hat{K}$.
 9:     Update the changes in the LBP category of the degenerated vertices.
10:     Simplification of the graph by removal of redundant multiple edges.
11: end while
12: while #(non-critical vertices) > 0 do
13:     Search for the contraction kernel $\hat{K}$ that optimizes Equation 4.
14:     Set the respective critical vertices as the surviving vertices and eliminate the edges connecting two critical vertices.
15:     Contraction of edges $e \in \hat{K}$.
16:     Simplification of the graph by removal of redundant multiple edges.
17: end while
18: end
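The contraction and simplification steps (steps 8 and 10, or 15 and 16) can be sketched on a plain edge list as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function name `contract` is hypothetical, every kernel edge is given as a pair (non-surviving vertex, surviving vertex), vertices are comparable labels such as integers, and all self-loops and all parallel edges are dropped. In the paper only the redundant ones are removed, which requires the embedding information that an edge list does not carry.

```python
def contract(edges, kernel):
    """Contract the kernel edges and simplify the resulting graph.

    edges  -- list of undirected edges (v, w)
    kernel -- list of pairs (non_survivor, survivor), one per contracted edge
    """
    survivor_of = {}
    for non_survivor, survivor in kernel:
        if non_survivor in survivor_of:
            raise ValueError("a non-surviving vertex has a single survivor")
        survivor_of[non_survivor] = survivor
    if any(s in survivor_of for s in survivor_of.values()):
        raise ValueError("a surviving vertex must not be contracted itself")

    reduced = set()
    for v, w in edges:
        # Replace every non-surviving endpoint by its survivor.
        v, w = survivor_of.get(v, v), survivor_of.get(w, w)
        if v != w:  # simplification: drop self-loops
            reduced.add((min(v, w), max(v, w)))  # merge parallel edges
    return sorted(reduced)
```

Contracting one edge of a triangle, for instance, leaves a single edge after the self-loop and the duplicate edge are removed.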

Theorem 2. All the faces at the top level of the pyra-

mid built by Algorithm 1 are slope regions.

Proof. Insertion of the hidden saddles in step 3 of

the Algo. 1 converts all the non-well-composed con-

ﬁgurations into well-composed conﬁgurations. After

this step, all the faces in the graph are already slope

regions as proven in (Kropatsch et al., 2019)[Proposi-

tion 2]. Now the proof boils down to preserving the

slope regions without changing the connection of the

critical vertices.

For a surface, the topology of its contours changes at

the function value of the critical points. For exam-

ple, the surface contours will collapse to a point at

a non-degenerated extremum and multiple contours

will intersect at a saddle point. In steps 7 and 14

of Algo. 1, we preserve the critical vertices by ﬁx-

ing them as the surviving vertices and preserve the

connection between them by eliminating the edges

connecting two critical vertices. Thus all the edges

selected for contraction belong to a monotonic path

connecting two critical vertices that are not adjacent

to each other. Since the monotonic connections be-

tween the critical vertices are intact, the topology of

the contours will remain the same. Consequently, the

slope regions and the topology of the critical vertices

are preserved. 

4 DICTIONARY FOR THE CONNECTED COMPONENTS

In this section, we introduce the concept of the dictio-

nary D for the connected components of the contrac-

tion kernel, which comprises sub-graphs conceived as

an element of the contraction kernel. The dictionary is

particularly designed for implementations with combinatorial maps as the data structure. The elements of the dictionary highly depend on the input data, on the application, and on the data structure used for the implementation. Considering Algorithm 1, the

outcome of the algorithm is focused on obtaining the

structure of an image, represented as a graph of criti-

cal vertices. Since the algorithm operates on an irreg-

ular image pyramid, the geometry of the vertices and

edges especially in presence of multiple edges and

self-loops cannot be captured by the adjacency matrix

or adjacency lists. The edge contraction process may

subsequently generate a vertex with a higher degree

and a complex structure bounded by the complexity

of the input data. Therefore, we use combinatorial

maps as the data structure, that implicitly encodes and

characterizes the inclusion relationships.

A combinatorial pyramid (Brun and Kropatsch, 2001), introduced by Brun et al., is a stack of successively reduced combinatorial maps. A combinatorial map may be understood as an explicit encoding of the orientation of the edges (either in clockwise or anti-clockwise direction) around each vertex. The combinatorial map M = (D, σ, α) consists of three components: (a) a set of darts D, (b) a permutation σ and (c) an involution α. An edge e connecting two vertices v, w is composed of two darts $d_1, d_2$. Darts $d_1$ and $d_2$, belonging to the same edge e, are related to each other by the involution α such that $\alpha(d_1) = d_2$ and $\alpha(d_2) = d_1$. The permutation σ relates each dart to the following dart around the same vertex in clockwise or counter-clockwise direction. The direction of encoding is implementation specific. Fig. 3 displays an example of a simple graph encoded as a combinatorial map.

Figure 3: An example of a simple graph encoded as a com-

binatorial map.
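A minimal sketch of such an encoding is given below for a path graph with three vertices and two edges. It is not the graph of Fig. 3. The conventions used here are assumptions of the example: darts are non-zero integers, the involution is α(d) = −d, and σ is stored as a dictionary. The helper names `alpha`, `orbit` and `vertices` are hypothetical.

```python
def alpha(dart):
    """Involution: maps a dart to the other dart of the same edge."""
    return -dart


def orbit(perm, start):
    """All darts reached from `start` by repeatedly applying `perm`."""
    result, dart = [start], perm[start]
    while dart != start:
        result.append(dart)
        dart = perm[dart]
    return result


def vertices(sigma):
    """Each vertex is one orbit of sigma, i.e. the darts around it."""
    seen, result = set(), []
    for dart in sorted(sigma):
        if dart not in seen:
            cycle = orbit(sigma, dart)
            seen.update(cycle)
            result.append(cycle)
    return result


# Path a - b - c: edge 1 has darts 1 (at a) and -1 (at b),
# edge 2 has darts 2 (at b) and -2 (at c).
sigma = {1: 1, -1: 2, 2: -1, -2: -2}
```

The degree of a vertex is the length of its σ-orbit, so the middle vertex b is recovered as the orbit containing the darts −1 and 2.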

To maintain generality, the elements of the dictio-

nary are independent of the degree of the vertices and

the geometry of the edges at any level of the pyra-

mid. The elements of the dictionary are differentiated

based on the contraction ratio deﬁned as the reduction

in the number of vertices after contraction of edges in

the connected component C. Table 1 enumerates the

different classes of the connected components C in the

dictionary:

Table 1: Contraction factor for different classes of connected components C in the dictionary D.

|C|    contraction factor in |V_k|
0      1:1
1      2:1
2      3:1
3      4:1
4      5:1

Fig. 4 displays the graphical representation of the connected components in the dictionary, where the green vertex represents the surviving vertex and the blue colored vertices represent the non-surviving vertices. The solid line edges are the edges selected for the contraction kernel and the dashed line edges are excluded from the contraction kernel. The solid line edges are oriented from the non-surviving vertex to the surviving vertex, while the dashed line edges are not oriented. The upper bound on the number of dashed line edges depends on the complexity of the data, and these edges can be incident to the surviving vertex at various positions apart from the examples shown in Fig. 4.

Figure 4: Elements of the dictionary D for the connected component of the contraction kernel.

Fig. 4(a) shows an isolated vertex with respect to the contraction kernel, i.e. a vertex without any incident edges selected for the contraction operation. A very common example of an isolated vertex is a local extremum surrounded by other critical vertices. In such configurations, none of the edges are selected for contraction.

Fig. 4(b) shows a sub-graph with 1 solid edge selected for the edge contraction and 2 dashed edges that are not selected for the contraction. After performing the edge contraction on a single edge of graph $G_k = (V_k, E_k)$, the blue colored vertex merges with the green vertex, creating a new graph $G_{k+1} = (V_{k+1}, E_{k+1})$. The resulting cardinalities of the edge set and the vertex set are $|E_{k+1}| = |E_k| - 1$ and $|V_{k+1}| = |V_k| - 1$, respectively. A similar explanation applies to Fig. 4(c), (d) and (e).

Following are the remarks and discussion on the

dictionary D:

1. The maximum number of edges selected for the contraction that are incident on a single vertex is 4.

2. A vertex can either be a surviving vertex or a non-surviving vertex. The maximum distance between the surviving vertex of the connected component and the non-surviving vertices is limited to 1.

3. A non-surviving vertex can only have a single surviving vertex. In other words, a single non-surviving vertex cannot have multiple surviving vertices.
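The three remarks can be turned into a simple validity check for a candidate kernel. The sketch below is illustrative only: the function name `dictionary_components` is hypothetical, and each kernel edge is assumed to be given as a pair (non-surviving vertex, surviving vertex).

```python
def dictionary_components(kernel, max_edges=4):
    """Group kernel edges by survivor and check the dictionary remarks.

    Returns a dict mapping each surviving vertex to its non-surviving
    vertices, or raises ValueError if a remark is violated.
    """
    components = {}
    non_survivors = set()
    for non_survivor, survivor in kernel:
        if non_survivor in non_survivors:
            # Remark 3: a non-surviving vertex has a single survivor.
            raise ValueError("non-surviving vertex with several survivors")
        non_survivors.add(non_survivor)
        components.setdefault(survivor, []).append(non_survivor)
    for survivor, members in components.items():
        if survivor in non_survivors:
            # Remark 2: a vertex is either surviving or non-surviving.
            raise ValueError("vertex is both survivor and non-survivor")
        if len(members) > max_edges:
            # Remark 1: at most 4 kernel edges incident on a single vertex.
            raise ValueError("too many kernel edges on one vertex")
    return components


components = dictionary_components([(1, 0), (2, 0), (4, 3)])
```

A component with |C| contracted edges merges |C| + 1 vertices into one, which matches the contraction factors listed in Table 1.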

The three principal reasons for the above-mentioned remarks are:

(a) At the base level of the pyramid with graph $G_0$, all the vertices inside the boundary have degree 4.

(b) The edge contraction operation increases the degree of the surviving vertex and consequently increases the complexity of removing the redundant edges (for example, multiple edges and self-loops) for graph simplification. Therefore, contractions of paths with more than one edge (as shown in Fig. 5) are excluded from the dictionary D. Fig. 5(a) shows a sub-graph with 2 edges $e_1$ and $e_2$ selected for the contraction. The solid green vertex is the single survivor of the connected component. We do not consider bi-colored vertices in our contraction kernels. A bi-colored vertex is a survivor for one of the two edges and a non-survivor for the other. Identifying such paths containing vertices that act as both survivor and non-survivor is linear in complexity and requires expensive computation.

(c) With combinatorial maps (Brun and Kropatsch, 2001) as the data structure for all the elements in the dictionary D, the darts incident on a single vertex can be easily traversed by computing the permutations σ starting from a randomly selected dart incident on the surviving vertex. Conversely, the process of identifying a path of the contraction kernel is comparatively more complicated and time consuming.

Figure 5: Contraction of a path containing more than 1 edge

for contraction, excluded from dictionary D.

5 EXPERIMENTS

This section shows the statistical evidence and ex-

perimental veriﬁcation of the claims made in Sec-

tion 3. The implementation of the theoretical frame-

work mentioned in this paper can be optimized de-

pending on the application and the data structure in

use. To keep the statistics more general and indepen-

dent of the data-structure, we compared the histogram

of ξ(e) in Equation 1 for different values of multiplier

m for a total of 400 images randomly selected from

the Linnaeus Database (Malmberg et al., 2010). The size of each image is 32 × 32 (i.e., $|E_0| = 2 \cdot 32 \cdot 31 = 1984$) and the histogram is calculated for ξ(e), $\forall e \in E_0$, over the 400 images (1984 · 400 = 793600 edges). For the experiments, the contrast of the edges is normalized with respect to the maximum contrast of the image, as required in Eq. 1.
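The statistics described above could be gathered along the following lines. This is a sketch under assumptions: the helper names `num_grid_edges`, `normalised_costs` and `histogram`, the choice of ten equal-width bins, and the toy contrast values are all introduced here for illustration and do not come from the original experiments.

```python
def num_grid_edges(rows, cols):
    """Number of edges of the 4-connected grid graph of a rows x cols image."""
    return rows * (cols - 1) + cols * (rows - 1)


def normalised_costs(contrasts, m):
    """Cost xi(e) of Eq. 1 after dividing by the maximum contrast."""
    peak = max(contrasts)
    if peak == 0:
        return [0.0 for _ in contrasts]
    return [1.0 - (1.0 - c / peak) ** m for c in contrasts]


def histogram(values, bins=10):
    """Counts of values in [0, 1] over equal-width bins."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1  # v = 1 goes to last bin
    return counts


toy_contrasts = [0, 10, 20, 40]  # illustrative raw contrasts
toy_hist = histogram(normalised_costs(toy_contrasts, m=1))
```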

Figure 6: Histogram of ξ(e) for m = 0.5 in Eq.1.

From Fig. 6 we can observe that for m = 0.5, most of the edges (around 90%) fall in the first two bins with 0 < ξ(e) < 0.2, while for m = 1 and m = 3 in Figs. 7 and 8 approximately 75% and 45% of the edges, respectively, fall in the first two bins of the ξ(e) histogram. This shows that the multiplier m plays a vital role in manipulating the contrast of the edges and setting the priority of the edges for the selection process. The results for m < 0.5 are not displayed due to the lack of visible difference in the histograms. In essence, the lower the value of m, the more edges have a low cost ξ(e), making them eligible for selection into the contraction kernel.

Figure 7: Histogram of ξ(e) for m = 1 in Eq. 1.

Figure 8: Histogram of ξ(e) for m = 3 in Eq. 1.

Now, let us investigate the cost of the contraction kernel β(K) as per Eq. 3 for various values of the multiplier m and the Lagrange multiplier λ. The graphics displayed in Fig. 9, 11 were computed for a single gray scale image of size 32 × 32 containing 1984 edges in $E_0$. The x-axis represents the size of the contraction kernel |K| such that the edges are selected in ascending order of their cost ξ(e) from Eq. 1. The red dots mark the minimum values of the curves. The x-coordinate of a red dot corresponds to $|\hat{K}|$ and the y-coordinate corresponds to the cost of the optimal kernel $\beta(\hat{K})$. Each figure contains curves for different values of m = 0.15, 0.12, 0.09, 0.06, 0.03 and a single value of λ.

A detailed inspection of the above graphs and the corresponding values of ξ(e), ∀e ∈ K, led to the following observations:

1. At the initial stage of the pyramid, K comprises edges with ξ(e) > 0, i.e. edges that are not part of plateau regions or level curves. This is explicitly displayed by the orange curve in Fig. 11 with m = 0.03 and λ = 16: the minimum of the curve β(K) is attained at approximately |K| = 1700 edges. As a result, edges with a higher contrast are more likely to be selected for the contraction kernel. This may change the topology of the image if steps 7 and 14 of Algo. 1 are not followed. A large contraction kernel will also increase the complexity of the subsequent edge removal and graph simplification.

2. For a lower value of λ (typically < 2) and a short range of 0 < m < 1, the minima of the curves β(K) for different values of m lie very close to each other and tend to result in the selection of the same contraction kernel. Fig. 9 is a good example, where only 4 minima are visible because the red markers for the curves m = 0.12 and m = 0.09 coincide. Conversely, with a higher value of λ (typically > 10), the minima of the curves are easily distinguished, as displayed in Fig. 10 and 11.

Figure 9: Number of edges selected for the contraction kernel |K| vs. cost of the contraction kernel β(K) for λ = 2.

Figure 10: Number of edges selected for the contraction kernel |K| vs. cost of the contraction kernel β(K) for λ = 8.

Figure 11: Number of edges selected for the contraction kernel |K| vs. cost of the contraction kernel β(K) for λ = 16.
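The way such curves and their minima can be obtained is sketched below in Python: the edges are sorted in ascending order of cost, the objective is evaluated for every prefix size k = |K|, and the minimum (the red dot) is located. Since Eq. 1 and Eq. 3 are not restated in this section, the function stand_in_objective is a purely hypothetical placeholder (cumulative edge cost minus λ times the kernel size) and must be replaced by the actual objective; all function names are assumptions of this sketch.

```python
from itertools import accumulate


def stand_in_objective(prefix_cost, k, lam):
    # HYPOTHETICAL placeholder, NOT Eq. 3 of the paper:
    # cumulative cost of the k cheapest edges minus lam times the kernel size.
    return prefix_cost - lam * k


def kernel_cost_curve(costs, lam, objective=stand_in_objective):
    """Objective value for every kernel size k = 0 .. len(costs),
    with edges taken in ascending order of their cost."""
    ordered = sorted(costs)
    prefix = [0.0] + list(accumulate(ordered))  # prefix[k] = cost of k cheapest edges
    return [objective(prefix[k], k, lam) for k in range(len(prefix))]


def optimal_kernel_size(costs, lam, objective=stand_in_objective):
    """Return (k, value) at the first minimum of the curve."""
    curve = kernel_cost_curve(costs, lam, objective)
    k_best = min(range(len(curve)), key=curve.__getitem__)
    return k_best, curve[k_best]


print(optimal_kernel_size([3, 1, 2, 5], lam=2.5))  # (2, -2.0)
```

With this placeholder, a larger λ can only enlarge the selected kernel, which matches the qualitative trend reported in the experiments; the exact position of the minimum depends on the true form of Eq. 3.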

5.1 Estimating the Original Image from

a Blurred Binary Image

In this experiment, we used the proposed method to estimate the original binary image from its blurred version using connected component labelling (CCL). We assume that the edges in the interior of a component have a lower contrast than the edges connecting two distinct components. By varying the multiplier m and λ in Eq. 3, we obtained a contraction kernel containing edges that connect two components with a small contrast. To display an easily observable and intuitive result, we generated the artificial image in Fig. 12. Fig. 12a shows the image after Gaussian blur with a standard deviation of 0.6, while Fig. 12b displays the CCL with 126 components for m = 0.15 and λ = 15, where |K| = 76.26% of the total edges. By further reducing m to 0.05 and increasing λ to 25, we modify ξ(e) such that a higher number of edges are eligible for the contraction kernel, and |K| = 77.6% of the total edges. As a result, the number of connected components reduces to 66, as displayed in Fig. 12c. Repeating the process by lowering m to 0.01 and increasing λ to 45, the number of edges eligible for contraction rises to |K| = 92.7%, resulting in 6 connected components that correspond to the original binary image, as shown in Fig. 12d.
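The effect described above, where admitting more edges into the contraction kernel merges more components, can be illustrated with a small union-find sketch in Python. It is a simplification under stated assumptions: an edge is contracted whenever its contrast (assumed to be the absolute gray-value difference) does not exceed a threshold max_contrast, which stands in for the selection controlled by m and λ and is not the paper's objective function. The function name count_components is hypothetical.

```python
def count_components(pixels, height, width, max_contrast):
    """Number of connected components after contracting every 4-adjacent
    edge whose contrast is at most max_contrast (a stand-in selection rule)."""
    parent = list(range(height * width))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps trees shallow
            v = parent[v]
        return v

    def contract(u, v):
        if abs(pixels[u] - pixels[v]) <= max_contrast:
            root_u, root_v = find(u), find(v)
            if root_u != root_v:
                parent[root_u] = root_v  # merge the two components

    for y in range(height):
        for x in range(width):
            v = y * width + x
            if x + 1 < width:
                contract(v, v + 1)
            if y + 1 < height:
                contract(v, v + width)
    return len({find(v) for v in range(height * width)})


# 2 x 3 toy image: a dark 2 x 2 block next to a bright column, slightly blurred.
blurred = [0.0, 0.1, 0.9,
           0.0, 0.2, 1.0]
print(count_components(blurred, 2, 3, 0.0))   # 5
print(count_components(blurred, 2, 3, 0.25))  # 2
```

In the toy image, contracting only zero-contrast edges leaves the blurred block fragmented, while a tolerance of 0.25 recovers the two components of the underlying binary image.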

(a) input image after Gaussian blur with std. dev. = 0.6
(b) #CC = 126 for m = 0.15 and λ = 15
(c) #CC = 66 for m = 0.05 and λ = 25
(d) #CC = 6 for m = 0.01 and λ = 45

Figure 12: Connected component labelling of a blurred binary image with different values of the parameters m and λ.

Table 2: Values of the parameters m and λ and the resulting number of connected components for Fig. 12a.

output      m       λ     number of connected components (#CC)
Fig. 12b    0.15    15    126
Fig. 12c    0.05    25    66
Fig. 12d    0.01    45    6

The connected component labelling experiments were performed on the YACCLAB database (Grana et al., 2016). Fig. 13 shows the results of deblurring an MRI binary image from the YACCLAB database. Fig. 13b shows the CCL on the original image with 1070 connected components. Fig. 13c and Fig. 13d show the CCL on the deblurred images obtained after tuning the values of m and λ. With m = 0.01 and λ = 100, we achieved a slight deblurring, but the number of connected components was still high, around 12155, as shown in Fig. 13c. Further fine-tuning the values to m = 0.008 and λ = 200 gave a better deblurred image with 2642 connected components (refer to Fig. 13d). Table 3 summarizes the results displayed in Fig. 13. From our observations, the deblurring was not optimal for higher amounts of Gaussian blurring, especially when small connected components were placed close to each other.

The experiments performed on the YACCLAB database (Grana et al., 2016) are available online (https://www.prip.tuwien.ac.at/people/darshan/more/publications/9 TR.pdf). The documented results include both desirable and undesirable outcomes for connected component labelling and deblurring of binary images.


(a) input image after Gaussian blur with std. dev. = 0.6
(b) CCL of original image, #CC = 1070 for m = 0.05 and λ = 90
(c) #CC = 12155 for m = 0.01 and λ = 100
(d) #CC = 2642 for m = 0.008 and λ = 200

Figure 13: Connected component labelling of a blurred binary image with different values of the parameters m and λ.

Table 3: Values of the parameters m and λ and the resulting number of connected components for Fig. 13a.

output      m        λ      number of connected components (#CC)
Fig. 13b    0.05     90     1070
Fig. 13c    0.01     100    12155
Fig. 13d    0.008    200    2642

Remarks: We observe that, given a dataset of blurred images, the values of the parameters m and λ can be learned to achieve an optimal deblurring of a test sample with a similar amount of blurring.

As per the TIIP algorithm described in (Batavia et al., 2020), the plateau regions are contracted before the classification of the vertices, followed by the contraction of the edges with c(e) > 0 while preserving the critical points. Assume we have a 2D image P, with R as the plateau region with the largest diameter d_R and C as the gray scale region with the largest diameter d_C after the contraction of R. Then the computational complexity of this algorithm for the construction of an irregular pyramid is O(log(d_R) + log(d_C)). With the help of the objective function we can further reduce the complexity, since we can contract the edges with c(e) > 0 simultaneously with the contraction of the plateau region. Thus the complexity is reduced to O(log(d_RC)), where d_RC corresponds to the diameter of the largest connected component, which may include the gray scale region C and the plateau region R.

6 CONCLUSIONS

The paper introduces a novel approach for the selection of the contraction kernel using an objective function. We proved that the proposed algorithm, based on the objective function, guides the construction of the irregular pyramid to decompose a 2D image into monotonically connected image surface regions (slope regions). The objective function attempts to replicate the TIIP algorithm described in (Batavia et al., 2020; Gonzalez-Diaz et al., 2021) and enables easier modifications for application-oriented results. It can be envisioned as a step towards learning the contraction kernel for the construction of an irregular image pyramid. We then presented statistical observations that assist in tuning the parameter values of the objective function. The experiments focused on connected component labelling and on deblurring blurred images to recover the original binary image. Lastly, the paper establishes the concept of the dictionary of the connected components of the contraction kernel. This dictionary is similar to a dictionary of shapes but is designed for graphs, in particular with combinatorial maps as the data structure. The elements of the dictionary can be used for the optimization of the objective function and to realize the solution for the decomposition of an image. We leave the optimization method and the grammar of connected components for future research.

7 FUTURE WORK

The objective function in Equation 4 revolves around the contrast of an edge. However, it does not help with the location of the edges in binary images or in plateau regions, since the contrast of edges belonging to the same component in a binary image (or plateau region) is zero. Edges with the same contrast have equal priority under the proposed objective function. We intend to produce a grammar of the contraction kernel that repeats its occurrence to deliver an easy solution. Fig. 14 shows an example of a 9 × 9 grid-like structure of a plateau region, where the solid and the dashed lines correspond to the edges and the crossings of the lines correspond to the vertices. All the edges e have the same contrast c(e) = 0. The set of oriented edges forms the contraction kernel, and the edges are oriented from the non-surviving vertex to the surviving vertex. The positions of the surviving vertices form a pattern that corresponds to the knight’s move in chess. This repetition ensures that the connected components are independent and, upon contraction, results in a rotated version of the grid-like structure, as shown in Fig. 15. The grammar can be reused on higher levels as long as the grid structure exists. Redundant edges (double edges in our case, as shown in Fig. 15) can be pre-computed without an expensive search.

Figure 14: Knight’s vertex contraction kernel for 9 × 9 plateau region.

Figure 15: After contraction of Knight’s vertex contraction kernel for 9 × 9 plateau region.
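One way to generate a knight's-move arrangement of surviving vertices is sketched below in Python. The rule (x + 2y) mod 5 = 0 is an assumption of this sketch, chosen because it places survivors a knight's move apart and gives every non-surviving vertex at most one surviving 4-neighbour; it is not claimed to reproduce the exact layout of Fig. 14, and the function names are hypothetical.

```python
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def is_survivor(x, y):
    # Assumed rule: survivors lie on the lattice x + 2y = 0 (mod 5),
    # so (x + 1, y + 2) and (x + 2, y - 1) are survivors too (knight's moves).
    return (x + 2 * y) % 5 == 0


def survivors(size):
    return [(x, y) for y in range(size) for x in range(size) if is_survivor(x, y)]


def knight_kernel(size):
    """Oriented edges (non-survivor -> survivor) on a size x size plateau grid."""
    kernel = []
    for y in range(size):
        for x in range(size):
            if is_survivor(x, y):
                continue
            for dx, dy in NEIGHBOURS:
                nx, ny = x + dx, y + dy
                if 0 <= nx < size and 0 <= ny < size and is_survivor(nx, ny):
                    kernel.append(((x, y), (nx, ny)))
    return kernel


kernel = knight_kernel(9)
sources = [source for source, _ in kernel]
# No non-survivor is contracted into two different survivors.
print(len(sources) == len(set(sources)))  # True
```

Under this rule each connected component of the kernel is a star with at most four edges, so the components do not touch each other. Vertices on the grid boundary whose only surviving neighbour would fall outside the grid are left uncontracted in this sketch and would need separate handling.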

Similar to the graph edit distance, we can com-

pute the pyramid edit distance for an irregular image

pyramid, based on the cost of the contraction kernel

at each level.

REFERENCES

Batavia, D., González-Díaz, R., and Kropatsch, W. G. (2020). Image = structure + few colors. In Structural, Syntactic, and Statistical Pattern Recognition - Joint IAPR Int. Workshops, S+SSPR, LNCS, pages 365–375. Springer.

Batavia, D., Hladůvka, J., and Kropatsch, W. G. (2019). Partitioning 2D Images into Prototypes of Slope Region. In Int. Conference on Computer Analysis of Images and Patterns, LNCS, pages 363–374. Springer.

Brun, L. and Kropatsch, W. (2001). Introduction to com-

binatorial pyramids. In Digital and image geometry,

pages 108–128. Springer.

Cerman, M., Janusch, I., Gonzalez-Diaz, R., and Kropatsch,

W. G. (2016). Topology-based image segmentation

using LBP pyramids. Machine Vision and Applica-

tions, 27(8):1161–1174.

Comic, L., De Floriani, L., and Iuricich, F. (2010). Build-

ing Morphological Representations for 2D and 3D

Scalar Fields. In Puppo, E., Brogni, A., and Floriani,

L. D., editors, Eurographics Italian Chapter Confer-

ence 2010, pages 103–110.

Edelsbrunner, H., Harer, J., and Zomorodian, A. (2003). Hi-

erarchical morse-smale complexes for piecewise lin-

ear 2-manifolds. Discrete and computational Geome-

try, 30(1):87–107.

Gonzalez-Diaz, R., Batavia, D., Casablanca, R. M., and

Kropatsch, W. G. (2021). Characterizing slope re-

gions. Journal of Combinatorial Optimization, pages

1–20.

Grana, C., Bolelli, F., Baraldi, L., and Vezzani, R. (2016).

YACCLAB - Yet Another Connected Components La-

beling Benchmark. In 23rd International Conference

on Pattern Recognition. ICPR.

Gyulassy, A., Günther, D., Levine, J. A., Tierny, J., and Pascucci, V. (2014). Conforming Morse-Smale Complexes. IEEE Transactions on Visualization and Computer Graphics, 20(12):2595–2603.

Helman, J. L. and Hesselink, L. (1991). Visualizing vector

ﬁeld topology in ﬂuid ﬂows. IEEE CGA, 11(3):36–46.

Kropatsch, W. G., Casablanca, R. M., Batavia, D., and

Gonzalez-Diaz, R. (2019). On the space between crit-

ical points. In Int. Conference on Discrete Geom-

etry for Computer Imagery, LNCS, pages 115–126.

Springer.

Latecki, L., Eckhardt, U., and Rosenfeld, A. (1995). Well-

composed sets. Comput. Vis. Image Underst., 61:70–

83.

Lee, R. N. (1984). Two-dimensional critical point conﬁgu-

ration graphs. IEEE Transactions on Pattern Analysis

and Machine Intelligence, (4):442–450.

Malmberg, G., Nilsson, L.-G., and Weinehall, L. (2010).

Longitudinal data for interdisciplinary ageing re-

search. design of the linnaeus database. Scandinavian

journal of public health, 38(7):761–767.

Morales-González, A. and García-Reyes, E. B. (2013). Simple object recognition based on spatial relations and visual features represented using irregular pyramids. Multimedia tools and applications, 63(3):875–897.

Shinagawa, Y., Kunii, T., and Kergosien, Y. (1991). Sur-

face coding based on morse theory. IEEE Computer

Graphics and Applications, 11(5):66–78.

Stein, E., Milnor, J. W., Spivak, M., Wells, R., Wells, R.,

and Mather, J. N. (1963). Morse Theory. Princeton

University Press.
